Is It More Efficient to Use GL_TRIANGLE_STRIP or Indexed GL_TRIANGLES to Draw a Dynamic Number of Quads?
【Posted】: 2013-03-25 14:55:57
【Question】: I'm developing a simple sprite-based 2D game in C++ that uses OpenGL for hardware-accelerated rendering and SDL for window management and user input handling. Since it's a 2D game, I only ever need to draw quads, but because the number of sprites is dynamic, I can never rely on there being a constant number of quads. Consequently, I need to rebuffer all of the vertex data through my VBO every frame (there may be more or fewer quads than in the previous frame, so the buffer may be a different size).
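So far, my prototype program creates a window and lets the user add and remove quads in a diagonal row with the up and down arrow keys. Right now the quads I'm drawing are simple, untextured white squares. Here is the code I'm working with (it compiles and works correctly under OS X 10.6.8 and under Ubuntu 12.04 with OpenGL 2.1):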
#if defined(__APPLE__)
#include <OpenGL/OpenGL.h>
#endif
#if defined(__linux__)
#define GL_GLEXT_PROTOTYPES
#include <GL/glx.h>
#endif
#include <GL/gl.h>
#include <SDL.h>
#include <iostream>
#include <vector>
#include <string>
struct Vertex
{
    //vertex coordinates
    GLint x;
    GLint y;
};
//Constants
const int SCREEN_WIDTH = 1024;
const int SCREEN_HEIGHT = 768;
const int FPS = 60; //our framerate
//Globals
SDL_Surface *screen; //the screen
std::vector<Vertex> vertices; //the actual vertices for the quads
std::vector<GLint> startingElements; //the index where the 4 vertices of each quad begin in the 'vertices' vector
std::vector<GLint> counts; //the number of vertices for each quad
GLuint VBO = 0; //the handle to the vertex buffer
void createVertex(GLint x, GLint y)
{
    Vertex vertex;
    vertex.x = x;
    vertex.y = y;
    vertices.push_back(vertex);
}

//creates a quad at position x,y, with a width of w and a height of h (in pixels)
void createQuad(GLint x, GLint y, GLint w, GLint h)
{
    //Since we're drawing the quads using GL_TRIANGLE_STRIP, the vertex drawing
    //order is from top to bottom, left to right, like so:
    //
    // 1-----3
    // |     |
    // |     |
    // 2-----4
    createVertex(x, y); //top-left vertex
    createVertex(x, y+h); //bottom-left vertex
    createVertex(x+w, y); //top-right vertex
    createVertex(x+w, y+h); //bottom-right vertex
    counts.push_back(4); //each quad will always have exactly 4 vertices
    startingElements.push_back(startingElements.size()*4);
    std::cout << "Number of Quads: " << counts.size() << std::endl; //print out the current number of quads
}

//removes the most recently created quad
void removeQuad()
{
    if (counts.size() > 0) //we don't want to remove a quad if there aren't any to remove
    {
        for (int i=0; i<4; i++)
        {
            vertices.pop_back();
        }
        startingElements.pop_back();
        counts.pop_back();
        std::cout << "Number of Quads: " << counts.size() << std::endl;
    }
    else
    {
        std::cout << "Sorry, you can't remove a quad if there are no quads to remove!" << std::endl;
    }
}
void init()
{
    //initialize SDL
    SDL_Init(SDL_INIT_VIDEO | SDL_INIT_TIMER);
    screen = SDL_SetVideoMode(SCREEN_WIDTH, SCREEN_HEIGHT, 0, SDL_OPENGL);
#if defined(__APPLE__)
    //Enable vsync so that we don't get tearing when rendering
    GLint swapInterval = 1;
    CGLSetParameter(CGLGetCurrentContext(), kCGLCPSwapInterval, &swapInterval);
#endif
    //Disable depth testing, lighting, and dithering, since we're going to be doing 2D rendering only
    glDisable(GL_DEPTH_TEST);
    glDisable(GL_LIGHTING);
    glDisable(GL_DITHER);
    glPushAttrib(GL_DEPTH_BUFFER_BIT | GL_LIGHTING_BIT);
    //Set the projection matrix
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0, SCREEN_WIDTH, SCREEN_HEIGHT, 0, -1.0, 1.0);
    //Set the modelview matrix
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    //Create VBO
    glGenBuffers(1, &VBO);
    glBindBuffer(GL_ARRAY_BUFFER, VBO);
}
void gameLoop()
{
    int frameDuration = 1000/FPS; //the set duration (in milliseconds) of a single frame
    int currentTicks;
    int pastTicks = SDL_GetTicks();
    bool done = false;
    SDL_Event event;
    while(!done)
    {
        //handle user input
        while(SDL_PollEvent(&event))
        {
            switch(event.type)
            {
                case SDL_KEYDOWN:
                    switch (event.key.keysym.sym)
                    {
                        case SDLK_UP: //create a new quad every time the up arrow key is pressed
                            createQuad(64*counts.size(), 64*counts.size(), 64, 64);
                            break;
                        case SDLK_DOWN: //remove the most recently created quad every time the down arrow key is pressed
                            removeQuad();
                            break;
                        default:
                            break;
                    }
                    break;
                case SDL_QUIT:
                    done = true;
                    break;
                default:
                    break;
            }
        }
        //Clear the color buffer
        glClear(GL_COLOR_BUFFER_BIT);
        //calling front() on an empty vector is undefined, so skip the upload and draw when there are no quads
        if (!vertices.empty())
        {
            glBindBuffer(GL_ARRAY_BUFFER, VBO);
            //replace the current contents of the VBO with a completely new set of data (possibly including either more or fewer quads)
            glBufferData(GL_ARRAY_BUFFER, vertices.size()*sizeof(Vertex), &vertices.front(), GL_DYNAMIC_DRAW);
            glEnableClientState(GL_VERTEX_ARRAY);
            //Set vertex data
            glVertexPointer(2, GL_INT, sizeof(Vertex), 0);
            //Draw the quads
            glMultiDrawArrays(GL_TRIANGLE_STRIP, &startingElements.front(), &counts.front(), counts.size());
            glDisableClientState(GL_VERTEX_ARRAY);
            glBindBuffer(GL_ARRAY_BUFFER, 0);
        }
        //Check to see if we need to delay the duration of the current frame to match the set framerate
        currentTicks = SDL_GetTicks();
        int currentDuration = (currentTicks - pastTicks); //the duration of the frame so far
        if (currentDuration < frameDuration)
        {
            SDL_Delay(frameDuration - currentDuration);
        }
        pastTicks = SDL_GetTicks();
        // flip the buffers
        SDL_GL_SwapBuffers();
    }
}
void cleanUp()
{
    glDeleteBuffers(1, &VBO);
    SDL_FreeSurface(screen);
    SDL_Quit();
}

int main(int argc, char *argv[])
{
    std::cout << "To create a quad, press the up arrow. To remove the most recently created quad, press the down arrow." << std::endl;
    init();
    gameLoop();
    cleanUp();
    return 0;
}
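Currently I'm using GL_TRIANGLE_STRIP with glMultiDrawArrays() to render my quads. This works and seems pretty decent in terms of performance, but I've been wondering whether using GL_TRIANGLES in conjunction with an IBO, to avoid duplicate vertices, would be a more efficient way to render. I've done some research, and some people suggest that indexed GL_TRIANGLES generally outperform GL_TRIANGLE_STRIP, but they also seem to assume that the number of quads stays constant, so the VBO and IBO never have to be rebuffered each frame. That's my biggest hesitation with indexed GL_TRIANGLES: if I did implement them, I'd have to rebuffer the entire index buffer every frame in addition to rebuffering the entire VBO every frame, again because of the dynamic number of quads.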
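So basically, my question is this: given that I have to rebuffer all of my vertex data to the GPU every frame because of the dynamic number of quads, would it be more efficient to switch to indexed GL_TRIANGLES to draw the quads, or should I stick with my current GL_TRIANGLE_STRIP implementation?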
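One detail that bears on the index-buffer concern: for quads, the index pattern depends only on the quad count, never on where the sprites are. The following is a minimal sketch, assuming each quad's four vertices are stored in the strip order used above (top-left, bottom-left, top-right, bottom-right). `buildQuadIndices` is a hypothetical helper, not part of the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds GL_TRIANGLES indices for quadCount quads whose
// vertices are stored four at a time in the order
// top-left, bottom-left, top-right, bottom-right.
std::vector<std::uint32_t> buildQuadIndices(std::size_t quadCount)
{
    std::vector<std::uint32_t> indices;
    indices.reserve(quadCount * 6);
    for (std::size_t q = 0; q < quadCount; ++q)
    {
        const std::uint32_t base = static_cast<std::uint32_t>(q * 4);
        // first triangle: top-left, bottom-left, top-right
        indices.push_back(base + 0u);
        indices.push_back(base + 1u);
        indices.push_back(base + 2u);
        // second triangle: top-right, bottom-left, bottom-right
        // (same winding as the second triangle of the strip)
        indices.push_back(base + 2u);
        indices.push_back(base + 1u);
        indices.push_back(base + 3u);
    }
    return indices;
}
```

Because the list for N quads is a prefix of the list for any larger count, an index buffer filled once for some maximum quad count could in principle be reused every frame, with only the count passed to glDrawElements(GL_TRIANGLES, quadCount * 6, GL_UNSIGNED_INT, 0) changing. Whether that is actually faster than the strip version would still need to be measured.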
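【Comments】:
I think that before you worry about GL_TRIANGLES vs. GL_TRIANGLE_STRIP, you should try to minimize your glBufferData() calls. The simplest optimization: keep a dirty flag that records whether createQuad/removeQuad has been called since the last glBufferData() call, and only re-create the buffer when the flag is set.
That's a great suggestion, thanks! I'll definitely implement it.
【Answer 1】:
You'll probably do just fine with non-indexed GL_QUADS/GL_TRIANGLES and a glDrawArrays() call.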
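The dirty-flag suggestion from the comments combines naturally with this. Below is a minimal sketch, assuming C++11 and a hypothetical `QuadBatch` wrapper (neither the class nor its method names come from the question's code). The upload step is passed in as a callable so the sketch stays independent of OpenGL; in the real program it would wrap the glBufferData() call.

```cpp
#include <cstddef>
#include <vector>

struct Vertex { int x; int y; };

// Hypothetical wrapper around the quad list. 'dirty' records whether the
// vertex data has changed since the last upload.
class QuadBatch
{
public:
    QuadBatch() : dirty(false) {}

    void addQuad(int x, int y, int w, int h)
    {
        const Vertex corners[4] = { {x, y}, {x, y + h}, {x + w, y}, {x + w, y + h} };
        vertices.insert(vertices.end(), corners, corners + 4);
        dirty = true;
    }

    void removeQuad()
    {
        if (vertices.empty())
            return; // nothing to remove, so nothing changed
        vertices.resize(vertices.size() - 4);
        dirty = true;
    }

    // Calls upload(data, byteCount) only if the data changed since the last
    // upload, and returns whether an upload happened.
    template <typename UploadFn>
    bool uploadIfDirty(UploadFn upload)
    {
        if (!dirty)
            return false;
        upload(vertices.data(), vertices.size() * sizeof(Vertex));
        dirty = false;
        return true;
    }

    std::size_t quadCount() const { return vertices.size() / 4; }

private:
    std::vector<Vertex> vertices;
    bool dirty;
};
```

In the game loop this would amount to calling `uploadIfDirty` once per frame before drawing, so frames in which no quad was added or removed skip the buffer upload entirely. Once sprites move every frame the flag is set every frame, and the saving disappears.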
SDL_Surface *screen;
...
screen = SDL_SetVideoMode(SCREEN_WIDTH, SCREEN_HEIGHT, 0, SDL_OPENGL);
...
SDL_FreeSurface(screen);
Don't do that:
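The returned surface is freed by SDL_Quit and must not be freed by the caller. This rule also includes consecutive calls to SDL_SetVideoMode (i.e. resize or resolution change), because the existing surface will be released automatically.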
EDIT: Simple vertex array demo:
// g++ main.cpp -lglut -lGL
#include <GL/glut.h>
#include <cmath> // fmod
#include <vector>
using namespace std;

// OpenGL Mathematics (GLM): http://glm.g-truc.net/
#include <glm/glm.hpp>
#include <glm/gtc/random.hpp>
using namespace glm;

struct SpriteWrangler
{
    SpriteWrangler( unsigned int aSpriteCount )
    {
        // six vertices (two triangles) per sprite
        verts.resize( aSpriteCount * 6 );
        states.resize( aSpriteCount );
        for( size_t i = 0; i < states.size(); ++i )
        {
            states[i].pos = linearRand( vec2( -400, -400 ), vec2( 400, 400 ) );
            states[i].vel = linearRand( vec2( -30, -30 ), vec2( 30, 30 ) );

            // one random color shared by all six vertices of the sprite
            Vertex vert;
            vert.r = (unsigned char)linearRand( 64.0f, 255.0f );
            vert.g = (unsigned char)linearRand( 64.0f, 255.0f );
            vert.b = (unsigned char)linearRand( 64.0f, 255.0f );
            vert.a = 255;
            verts[i*6 + 0] = verts[i*6 + 1] = verts[i*6 + 2] =
            verts[i*6 + 3] = verts[i*6 + 4] = verts[i*6 + 5] = vert;
        }
    }

    // wraps val back into the range [minVal, maxVal]
    void wrap( const float minVal, float& val, const float maxVal )
    {
        if( val < minVal )
            val = maxVal - fmod( maxVal - val, maxVal - minVal );
        else
            val = minVal + fmod( val - minVal, maxVal - minVal );
    }

    void Update( float dt )
    {
        for( size_t i = 0; i < states.size(); ++i )
        {
            states[i].pos += states[i].vel * dt;
            wrap( -400.0f, states[i].pos.x, 400.0f );
            wrap( -400.0f, states[i].pos.y, 400.0f );

            // rebuild the sprite's two triangles around its new position
            float size = 20.0f;
            verts[i*6 + 0].pos = states[i].pos + vec2( -size, -size );
            verts[i*6 + 1].pos = states[i].pos + vec2(  size, -size );
            verts[i*6 + 2].pos = states[i].pos + vec2(  size,  size );
            verts[i*6 + 3].pos = states[i].pos + vec2(  size,  size );
            verts[i*6 + 4].pos = states[i].pos + vec2( -size,  size );
            verts[i*6 + 5].pos = states[i].pos + vec2( -size, -size );
        }
    }

    struct Vertex
    {
        vec2 pos;
        unsigned char r, g, b, a;
    };

    struct State
    {
        vec2 pos;
        vec2 vel; // units per second
    };

    vector< Vertex > verts;
    vector< State > states;
};
void display()
{
    // timekeeping
    static int prvTime = glutGet(GLUT_ELAPSED_TIME);
    const int curTime = glutGet(GLUT_ELAPSED_TIME);
    const float dt = ( curTime - prvTime ) / 1000.0f;
    prvTime = curTime;

    // sprite updates
    static SpriteWrangler wrangler( 2000 );
    wrangler.Update( dt );
    vector< SpriteWrangler::Vertex >& verts = wrangler.verts;

    glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

    // set up projection and camera
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    double w = glutGet( GLUT_WINDOW_WIDTH );
    double h = glutGet( GLUT_WINDOW_HEIGHT );
    double ar = w / h;
    glOrtho( -400 * ar, 400 * ar, -400, 400, -1, 1);

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glEnableClientState( GL_VERTEX_ARRAY );
    glEnableClientState( GL_COLOR_ARRAY );
    glVertexPointer( 2, GL_FLOAT, sizeof( SpriteWrangler::Vertex ), &verts[0].pos.x );
    glColorPointer( 4, GL_UNSIGNED_BYTE, sizeof( SpriteWrangler::Vertex ), &verts[0].r );
    glDrawArrays( GL_TRIANGLES, 0, verts.size() );
    glDisableClientState( GL_VERTEX_ARRAY );
    glDisableClientState( GL_COLOR_ARRAY );

    glutSwapBuffers();
}
// run display() every 16ms or so
void timer( int extra )
{
    glutTimerFunc( 16, timer, 0 );
    glutPostRedisplay();
}

int main(int argc, char **argv)
{
    glutInit( &argc, argv );
    glutInitWindowSize( 600, 600 );
    glutInitDisplayMode( GLUT_RGBA | GLUT_DEPTH | GLUT_DOUBLE );
    glutCreateWindow( "Sprites" );
    glutDisplayFunc( display );
    glutTimerFunc( 0, timer, 0 );
    glutMainLoop();
    return 0;
}
You can get decent performance using nothing but vertex arrays.
Ideally, most/all of your dt values should be
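To connect this back to the question's code: switching from strips to non-indexed GL_TRIANGLES mostly means emitting six vertices per quad instead of four. The following is a minimal sketch of how the question's createQuad might be adapted, keeping its pixel coordinates and integer Vertex layout. `appendQuadAsTriangles` is a hypothetical name, and the function takes the vertex vector as a parameter rather than using the global.

```cpp
#include <vector>

struct Vertex { int x; int y; };

// Hypothetical adaptation of createQuad for non-indexed GL_TRIANGLES:
// appends two triangles (six vertices) covering the rectangle whose top-left
// corner is (x, y) and whose size is w by h pixels.
void appendQuadAsTriangles(std::vector<Vertex>& out, int x, int y, int w, int h)
{
    const Vertex topLeft     = { x,     y     };
    const Vertex bottomLeft  = { x,     y + h };
    const Vertex topRight    = { x + w, y     };
    const Vertex bottomRight = { x + w, y + h };

    // first triangle
    out.push_back(topLeft);
    out.push_back(bottomLeft);
    out.push_back(topRight);
    // second triangle; topRight and bottomLeft are the two duplicated corners
    out.push_back(topRight);
    out.push_back(bottomLeft);
    out.push_back(bottomRight);
}
```

With this layout the startingElements and counts vectors are no longer needed: the whole batch can be drawn with a single glDrawArrays(GL_TRIANGLES, 0, vertices.size()) call, and removing the most recent quad becomes a matter of dropping the last six vertices.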
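【Discussion】:
Thanks for the SDL tip, good to know! I'd rather avoid GL_QUADS if possible, since it's deprecated in modern OpenGL implementations. So if I use non-indexed GL_TRIANGLES, the extra 2 vertices per quad probably won't have any significant effect on performance?
Probably, unless you're moving and drawing >20,000 quads/triangle pairs.
Give me a bit and I can put together a demo you can try out on your hardware.
After testing, any performance drop is negligible, so it looks like using non-indexed GL_TRIANGLES is quite reasonable (and they definitely beat using glMultiDrawArrays with triangle strips!). Thank you so much for your time and expertise, it's much appreciated! :)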
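The "extra 2 vertices per quad" trade-off can be put into numbers. This is a small arithmetic sketch rather than a benchmark: it only counts the bytes of buffer data each layout needs per quad and says nothing about actual rendering speed, which has to be measured as was done above. `QuadCost` and `bytesPerQuad` are hypothetical names.

```cpp
#include <cstddef>

// Bytes of vertex/index data needed per quad under each layout.
struct QuadCost
{
    std::size_t strip;     // GL_TRIANGLE_STRIP: 4 vertices
    std::size_t triangles; // non-indexed GL_TRIANGLES: 6 vertices
    std::size_t indexed;   // indexed GL_TRIANGLES: 4 vertices + 6 indices
};

// vertexSize and indexSize are in bytes.
QuadCost bytesPerQuad(std::size_t vertexSize, std::size_t indexSize)
{
    QuadCost cost;
    cost.strip = 4 * vertexSize;
    cost.triangles = 6 * vertexSize;
    cost.indexed = 4 * vertexSize + 6 * indexSize;
    return cost;
}
```

For the question's Vertex (two GLint fields, 8 bytes) and 16-bit indices, `bytesPerQuad(8, 2)` gives 32, 48, and 44 bytes per quad. With vertices this small, indexing saves very little over plain GL_TRIANGLES if the indices are re-uploaded every frame. At 20,000 quads the non-indexed layout comes to 960,000 bytes of vertex data per frame.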