1

Topic: OpenGL ES

Thought I'd start a new thread for the discussion of OpenGL ES in general, rather than hijacking one of the platform threads where we're discussing it.

Goofing around and doing some planning, I added a flag to mapc to make it output SOL data in CSV format.  Then I ran the current (3524) build and imported the output into a Google Docs spreadsheet.  This provides a means of sorting and charting current resource usage, and specifically noting its extremes.

This is significant because there's a constraint common to most implementations of our target API, GL ES 1.1: the use of 16-bit unsigned integers in element array buffers.  This imposes a limit of 65536 unique vertices in each individual draw call.  While I'm still putting thought into a mechanism to allow overflow into multiple buffers, I wanted to see how close we currently are to needing such a feature in order to determine whether the implementation complexity is worth the effort.

The output of mapc doesn't tell us the answer exactly, but we can assume the ultimate vertex load is directly related to the vert count.  The largest vert count map is currently map-fwp/adventure.sol with 23528, followed by map-fwp/spacetime.sol with 12220.  FWP takes the top 5 spots, followed by MYM in the next 2, and finally the largest main-game map comes in at number 8 with only 9502.

All of these numbers fall well short of 65536, so for the time being I'm comfortable with the basic approach.  Of course, static limits are bad, but this one is GL's and the work-around will probably be a bit ugly.
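
As a rough illustration of that headroom check, here is a minimal C sketch. The names ES_INDEX_LIMIT, fits_in_one_draw, and count_overflowing are made up for this example and are not part of mapc, and it treats mapc's vert count as a stand-in for the real per-draw vertex load, which is only an approximation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 16-bit unsigned element index can address vertices 0..65535. */
#define ES_INDEX_LIMIT ((uint32_t) UINT16_MAX + 1u)

/* Nonzero if this many unique vertices can be indexed in one draw call. */
static int fits_in_one_draw(uint32_t vertex_count)
{
    return vertex_count <= ES_INDEX_LIMIT;
}

/* Count how many of the given per-map vertex counts would overflow. */
static size_t count_overflowing(const uint32_t *counts, size_t n)
{
    size_t i, over = 0;

    assert(counts != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (!fits_in_one_draw(counts[i]))
            over++;

    return over;
}
```

Feeding it the three vert counts quoted above (23528, 12220, 9502) reports no overflowing maps.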

Anyway there ya go.

2

Re: OpenGL ES

THE STRUCTURE OF NEVERBALL

The core Neverball data structure represents all of the different objects that make up both the SOL level files and the simulation state of the running game.  This data structure is fairly complex and, as of version 1.5, is composed of 21 different C structures.

Of course each of these structures has a meaningful name, but in addition each has a unique letter.  V is for vertex position.  T is for texture coordinate.  L is for lump.  M is for material.  Most of these letters make sense, but some are a bit of a stretch, like Z for goal, and X for switch.  Some structures do double duty, such as the S which gives the side plane of a lump. Being a plane definition, S is also used to represent normal vectors. The five unused letters happen to be C, K, O, Q, and Y.

This lettering allows a very terse sort of Hungarian notation, where the type and meaning of a variable is immediately inferred from its name.  So VP is a vertex pointer, LI and LJ are lump indices, SC is a side count, and so on.

This organization dates all the way back to Super Empty Ball, the earliest versions of which had 14 lettered structures.  Recognizing this system’s growing complexity, Parasti refactored it all into subsets of structures for static level state, dynamic level state, and rendering.  He preserved the naming convention and produced a really nice modularization of what was originally very monolithic.

So a Neverball level consists of 21 separate vectors of these structures.  Each level has a vector of Ts, a vector of Ss, etc, which we call TV and SV respectively. These vectors are accessed by index, TI, TJ, TK, etc.  Mapc ensures that all vectors are absolutely optimal, and that each element of every vector is unique.  To reuse an object, we simply reuse its vector index.  This is a very efficient representation for data on-disk and in memory.

DRAWING THIS STRUCTURE

Real time 3D geometry is almost universally represented by vertices connected by triangles.  Each vertex includes a 3D vertex position, 3D normal vector, and 2D texture coordinate.  So in Neverball terms, each 3D vertex is defined by a VI, SI, and TI.

A 3D triangle includes 3 such 3D vertices. In Neverball, each piece of triangular geometry is represented by a G structure, which has a material and three full vertex definitions.  So, a Neverball triangle looks like this: [ MI, VI, SI, TI, VJ, SJ, TJ, VK, SK, TK ].
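
In C terms, that triangle might look roughly like the sketch below. The struct and function names (geom_old, geom_old_corner) are hypothetical and chosen for this example; only the ten-index layout comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of the G structure described above: one material
   index plus three corners, each naming a position (V), a side plane
   used as the normal (S), and a texture coordinate (T). */
struct geom_old
{
    int mi;
    int vi, si, ti;
    int vj, sj, tj;
    int vk, sk, tk;
};

/* Copy corner c (0, 1, or 2) of g into vst[] as { V, S, T } indices. */
static void geom_old_corner(const struct geom_old *g, int c, int vst[3])
{
    assert(g != NULL && vst != NULL && c >= 0 && c < 3);

    switch (c)
    {
    case 0:  vst[0] = g->vi; vst[1] = g->si; vst[2] = g->ti; break;
    case 1:  vst[0] = g->vj; vst[1] = g->sj; vst[2] = g->tj; break;
    default: vst[0] = g->vk; vst[1] = g->sk; vst[2] = g->tk; break;
    }
}
```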

Our existing display list renderer is very simple.  For each triangle, it enables material MI and calls glVertex, glNormal, and glTexCoord for each V/S/T set of indices.  Easy.

Our new vertex array renderer must be more complex.  An OpenGL vertex array doesn’t allow position, normal, and texture coordinate to be indexed separately.  Instead, a single index selects ALL three, so each 3D vertex definition must include all of them.  Even if we’re rendering a large flat plane where all normal vectors are the same, we must duplicate the normal vector for every position vector that we want to specify.  This might seem like a less efficient representation, but it actually gives better performance due to cache coherence.
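
To make the duplication concrete, here is a sketch of one possible interleaved vertex record and a flat plane built from it. The struct vert layout and the fill_plane helper are assumptions made for this example, not necessarily what the renderer will use. The stride and offsets are what would be handed to glVertexPointer, glNormalPointer, and glTexCoordPointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved vertex: position, normal, texture coordinate.
   The stride is sizeof (struct vert), and each attribute starts at its
   offsetof() within the record. */
struct vert
{
    float p[3];
    float n[3];
    float t[2];
};

/* Fill vv with the (w + 1) * (h + 1) corners of a flat w-by-h grid in
   the y = 0 plane. Every vertex repeats the same normal (0, 1, 0).
   Returns the number of vertices written. */
static int fill_plane(struct vert *vv, int w, int h)
{
    int x, z, n = 0;

    assert(vv != NULL && w > 0 && h > 0);

    for (z = 0; z <= h; z++)
        for (x = 0; x <= w; x++)
        {
            struct vert *v = vv + n++;

            v->p[0] = (float) x;  v->p[1] = 0.0f;  v->p[2] = (float) z;
            v->n[0] = 0.0f;       v->n[1] = 1.0f;  v->n[2] = 0.0f;
            v->t[0] = (float) x;  v->t[1] = (float) z;
        }

    return n;
}
```

Each record carries its own copy of the normal even though the value never changes across the plane, which is exactly the redundancy described above.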

Our problem is that the Neverball data structure does not resemble anything like what OpenGL expects.

We could simply expand each G structure out into its corresponding structure elements.  This is what Lazrhog did, and it obviously worked well.  It’s not optimal though, because each V/S/T sub-element of one G structure is very likely to occur in one or more nearby G structures, and our straightforward expansion would duplicate that data.  Theoretically, in the limiting case of an infinite sphere, each V/S/T will occur 6 times, and we definitely don’t want our levels expanding anywhere near 6 times their normal size.  As I mentioned in the previous post, OpenGL ES version 1.1 actually limits us to only 65536 V/S/T sets in any one draw call, so this data expansion is not just inefficient, it pushes us quickly toward our constraints.
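
The factor of 6 can be checked on a simpler shape than a sphere. The sketch below assumes an n-by-n grid of quads, each split into two triangles, with every shared corner using the same V/S/T triple. The function names are invented for this example.

```c
#include <assert.h>

/* An n-by-n grid of quads has 2 * n * n triangles, so naive expansion
   emits 3 corners for each of them. */
static long grid_expanded(long n)
{
    assert(n > 0);
    return 6 * n * n;
}

/* The same grid has only (n + 1) * (n + 1) distinct corners. */
static long grid_unique(long n)
{
    assert(n > 0);
    return (n + 1) * (n + 1);
}

/* Expansion factor of the naive approach over the de-duplicated one. */
static double grid_ratio(long n)
{
    return (double) grid_expanded(n) / (double) grid_unique(n);
}
```

A single quad expands 6 corners from 4 unique ones, and as n grows the ratio climbs toward 6 without reaching it.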

OPTIMIZING THIS AND THAT

What I’d rather do is detect V/S/T duplication and eliminate it.  This can be accomplished in one of two ways...

First, the level loader could analyze each G structure, comparing it with all others, looking for duplicates.  I’ve implemented this before (in my OBJ optimizer library) and it can be very fast given a clever kind of skip-list structure.  However, this optimization would occur every time a level is loaded, which I find inelegant.
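
For illustration, a load-time pass of this kind could be sketched as follows. Note that this uses a plain linear search rather than the skip-list structure mentioned above, so it is quadratic in the worst case. The struct vst type and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One expanded corner: indices into the V, S, and T vectors. */
struct vst { int vi, si, ti; };

/* Return the index of key within uv[0 .. *uc), appending it if absent.
   Returns -1 if key is new and the cap-element buffer is already full. */
static int vst_find_or_add(struct vst *uv, int *uc, int cap, struct vst key)
{
    int i;

    assert(uv != NULL && uc != NULL && *uc >= 0 && *uc <= cap);

    for (i = 0; i < *uc; i++)
        if (uv[i].vi == key.vi && uv[i].si == key.si && uv[i].ti == key.ti)
            return i;

    if (*uc == cap)
        return -1;

    uv[*uc] = key;
    return (*uc)++;
}

/* Collapse n expanded corners into unique triples. idx[i] receives the
   unique index of corner i. Returns the unique count, or -1 if uv is
   too small. */
static int vst_dedupe(const struct vst *in, int n,
                      struct vst *uv, int cap, int *idx)
{
    int i, uc = 0;

    assert(in != NULL && idx != NULL && n >= 0);

    for (i = 0; i < n; i++)
        if ((idx[i] = vst_find_or_add(uv, &uc, cap, in[i])) < 0)
            return -1;

    return uc;
}
```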

I’d prefer the second approach, which is to do the optimization in mapc once and for all.  Unfortunately as I noted above, the Neverball data structure does not match the vertex array layout, so the current SOL file is not capable of representing the optimized version of the level.

SO SHUT UP AND CHANGE THE SOL ALREADY

My intention at this time is to introduce a 22nd core structure which will add a layer of indirection to the geometry definition, making it amenable to representation with vertex arrays and optimizable by an offline process such as mapc, while side-stepping the impending doom of the 65536 limit.  It will be structure O.

It’s pretty simple: an O structure will consist of a single complete 3D vertex [ VI, SI, TI ].  The G structure will be modified to index these O structures [ MI, OI, OJ, OK ].  In this way, an optimal vertex array will consist only of O structures, rather than sub-elements of G structures.  Since mapc will take responsibility for ensuring that each O is unique, each resulting vertex array will therefore be as small as possible.
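
Sketched in C, the proposed pair of structures might look like this. The names struct offs and struct geom, and the helper functions, are placeholders for this example rather than the final SOL definitions. The two ints_* functions count only the G and O vectors, so they say nothing about the rest of the file.

```c
#include <assert.h>
#include <stddef.h>

/* Proposed O: one complete 3D vertex, as indices into V, S, and T. */
struct offs { int vi, si, ti; };

/* Proposed G: a material plus three indices into the O vector. */
struct geom { int mi; int oi, oj, ok; };

/* Look up corner c (0, 1, or 2) of g through the extra indirection. */
static struct offs geom_corner(const struct geom *g,
                               const struct offs *ov, int oc, int c)
{
    int o;

    assert(g != NULL && ov != NULL && c >= 0 && c < 3);

    o = (c == 0) ? g->oi : (c == 1) ? g->oj : g->ok;

    assert(o >= 0 && o < oc);

    return ov[o];
}

/* Ints stored by the old layout: 10 per triangle. */
static long ints_old(long gc)
{
    return 10 * gc;
}

/* Ints stored by the new layout: 4 per triangle plus 3 per unique O. */
static long ints_new(long gc, long oc)
{
    return 4 * gc + 3 * oc;
}
```

The saving depends on sharing: two triangles with four unique corners break even at 20 ints, while 1000 triangles sharing 500 corners drop from 10000 ints to 5500.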

The level loader will extract a vertex array by forming a linear vector of all of the O structures touched by each material.  It will extract the element array by remapping the indices in the G structure into this collapsed O vector.  The end result will be significantly more efficient than the display list renderer could ever have been.
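
One way the loader step could work is sketched below, under the assumption that G holds [ MI, OI, OJ, OK ] as proposed. The function name extract_material and its argument list are invented for this example. It walks the triangles of one material, assigns each O a slot the first time it is touched, and writes 16-bit element indices against those slots.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct geom { int mi; int oi, oj, ok; };

/* For material mi, fill ov_out[0 .. *on) with the O indices it touches,
   in first-use order, and ev[] with 3 element indices per triangle.
   ov_out needs room for oc entries and ev for 3 * gc entries. Returns
   the triangle count, or -1 on allocation failure or if the material
   touches more than 65536 O structures. */
static int extract_material(const struct geom *gv, int gc, int oc, int mi,
                            int *ov_out, int *on, uint16_t *ev)
{
    int *remap, i, c, tris = 0;

    assert(gv != NULL && ov_out != NULL && on != NULL && ev != NULL);
    assert(gc >= 0 && oc > 0);

    if ((remap = malloc(sizeof (int) * (size_t) oc)) == NULL)
        return -1;

    for (i = 0; i < oc; i++)
        remap[i] = -1;                 /* -1 marks "not yet collapsed" */

    *on = 0;

    for (i = 0; i < gc; i++)
    {
        int corner[3];

        if (gv[i].mi != mi)
            continue;

        corner[0] = gv[i].oi;
        corner[1] = gv[i].oj;
        corner[2] = gv[i].ok;

        for (c = 0; c < 3; c++)
        {
            int o = corner[c];

            assert(o >= 0 && o < oc);

            if (remap[o] < 0)
            {
                if (*on > UINT16_MAX)  /* the next slot would not fit */
                {
                    free(remap);
                    return -1;
                }
                remap[o] = *on;
                ov_out[(*on)++] = o;
            }
            ev[3 * tris + c] = (uint16_t) remap[o];
        }
        tris++;
    }

    free(remap);
    return tris;
}
```

The ov_out list says which O structures to expand into the vertex array, and ev can be uploaded as the element array as-is.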

The limitation will be that no level may have more than 65536 O structures touched by a single material.  None of our current levels come anywhere near this limit (not by a light-year) and if a mapper does go absolutely nuts with some material, then his recourse is to duplicate that material and apply the duplicate to the overflow.

WTF

I’ve just been pondering these issues for a few days.  I thought some of you might be interested in a better understanding of how the game works internally, and it helps me think to write it down.  In general, that’s how things will change as we transition toward a world in which 3D-capable mobile devices outnumber PCs, and Neverball lives on thanks to OpenGL ES.

3

Re: OpenGL ES

Hello rlk, I'm personally looking forward to an OpenGL ES compatible Neverball.

First, I don't claim to know everything about GL or GLES, but your comment that vertex, texture, and normal vectors have to be provided every time confused me. I didn't think that was the case, so if I'm wrong somewhere, help me see where.
For example, if you only have vertex data, you only need the following (the desktop OpenGL equivalent is included too):

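#if !defined(HAVE_GLES)
    /* Desktop OpenGL: immediate mode. */
    glBegin(GL_QUADS);
    glVertex2f(-10, -10);
    glVertex2f( 10, -10);
    glVertex2f( 10,  10);
    glVertex2f(-10,  10);
    glEnd();
#else
    /* OpenGL ES: the same quad as a position-only vertex array. */
    GLfloat q3[] = {
        -10, -10,
         10, -10,
         10,  10,
        -10,  10
    };

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, GL_FLOAT, 0, q3);
    /* ES has no GL_QUADS, so draw the four corners as a fan. */
    glDrawArrays(GL_TRIANGLE_FAN, 0, 4);
    glDisableClientState(GL_VERTEX_ARRAY);
#endif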

So if you need texture coords you would only have:

glBindTexture(GL_TEXTURE_2D, carac->TextureName);
#if !defined(HAVE_GLES)
      glBegin(GL_QUADS);
 
      glTexCoord2f(0,0);
      glVertex3f(pos[0]-tailleX/2, pos[1]-tailleY/2, 0);
      glTexCoord2f(1,0);
      glVertex3f(pos[0]+tailleX/2, pos[1]-tailleY/2, 0);
      glTexCoord2f(1,1);
      glVertex3f(pos[0]+tailleX/2, pos[1]+tailleY/2, 0);
      glTexCoord2f(0,1);
      glVertex3f(pos[0]-tailleX/2, pos[1]+tailleY/2, 0);
 
      glEnd();
#else
      GLfloat vtx1[] = {
        pos[0]-tailleX/2, pos[1]-tailleY/2, 0,
        pos[0]+tailleX/2, pos[1]-tailleY/2, 0,
        pos[0]+tailleX/2, pos[1]+tailleY/2, 0,
        pos[0]-tailleX/2, pos[1]+tailleY/2, 0
      };
      GLfloat tex1[] = {
        0,0,
        1,0,
        1,1,
        0,1
      };
 
      glEnableClientState(GL_VERTEX_ARRAY);
      glEnableClientState(GL_TEXTURE_COORD_ARRAY);
 
      glVertexPointer(3, GL_FLOAT, 0, vtx1);
      glTexCoordPointer(2, GL_FLOAT, 0, tex1);
      glDrawArrays(GL_TRIANGLE_FAN,0,4);
 
      glDisableClientState(GL_VERTEX_ARRAY);
      glDisableClientState(GL_TEXTURE_COORD_ARRAY);
#endif

The same goes for normals and colors. For example, if every vertex has the same color, it isn't necessary to pass an array of identical colors; a single glColor4f call is enough.
But if even one vertex has a different color, then a complete color array matching the vertex count has to be passed.

4

Re: OpenGL ES

I'm gonna enjoy this thread :)

I guess that whilst the current target is OpenGL ES 1.1, the final target must now be 2.0, but let's take things one step at a time...

To answer Pickle, I don't think that matters, because if we use vertex buffer objects, all this data will be loaded into video memory once at level start rather than on each iteration.  Reducing the overall data footprint is key, as RLK has noted.

5

Re: OpenGL ES

OK, given your comment I had to do some homework, and now I have a better idea of the difference.
A good tutorial on the subject: http://nehe.gamedev.net/data/lessons/le … ?lesson=45

1.X and 2.0 each have their benefits and issues. The first issue I see is that a 2.0 renderer is going to significantly shrink the number of mobile devices Neverball will run on. On the other hand, programmable shaders should be more efficient.
Ideally, both methods would be supported (at least until ES 1.X devices hit end of life). I don't think we will see many more new devices without ES 2.0.

6

Re: OpenGL ES

Lazrhog wrote:

the final target must now be 2.0

Why so?

7

Re: OpenGL ES

parasti wrote:
Lazrhog wrote:

the final target must now be 2.0

Why so?

Because 2.0 is going to be around for a lot longer than 1.1 on mobile devices.

8

Re: OpenGL ES

Pickle wrote:

OK, given your comment I had to do some homework, and now I have a better idea of the difference.

Out of interest, are you the same 'Pickle' from the Pandora forums?  I got fed up with the Pandora, so I have cancelled.

9

Re: OpenGL ES

These days I tend to write everything in a style that strives toward the intersection of all OpenGL versions, so I'm planning ahead for a move from ES 1.1 to 2.0.  It won't be too difficult... it'll involve the introduction of vertex and fragment programs, plus the use of explicit uniforms for transformations and lighting.  The work I'm doing now to convert everything to VBOs remains 100% valid.

10

Re: OpenGL ES

rlk wrote:

These days I tend to write everything in a style that strives toward the intersection of all OpenGL versions, so I'm planning ahead for a move from ES 1.1 to 2.0.  It won't be too difficult... it'll involve the introduction of vertex and fragment programs, plus the use of explicit uniforms for transformations and lighting.  The work I'm doing now to convert everything to VBOs remains 100% valid.

Excellent :)

11

Re: OpenGL ES

I 'believe' the Wiz is GL-ES 1.1, but don't quote me on that. The 533MHz ARM chip (overclockable to 800+) should run neverball very nicely. PS. I don't think we need animated backgrounds on such low res handhelds, the old gradient backgrounds would be fine. Of course for the iPhone 3GS+ or iPad (or indeed, Android tablets) the animated background would be OK.

PS. I need coffee!

Currently Playing:
Celeste and Electronic Super Joy

12

Re: OpenGL ES

Don't worry, it has to go through OpenGL ES 1.1 to get to 2.0, so there will be a version that is compatible.

13

Re: OpenGL ES

phew, that's a relief!

14

Re: OpenGL ES

@Lazrhog indeed I am the same. I have read your comments about cancelling and it's understandable, but on the other hand the Pandora is a really nice device. Hopefully things are getting better with a constant flow of boards. I can easily support a Pandora version once OpenGL ES support is ready.

The Wiz and Caanoo are 1.X Lite. I did a little research and VBOs should be supported. The biggest concern with the Wiz is the small texture memory size, which if I remember right is 12 MB. The Caanoo, I think, has 32 MB.

15

Re: OpenGL ES

I've finished making the proposed change to the Neverball data structure and SOL file, I've propagated the change through mapc and the game renderer, and I've committed the change to the GLES branch.  This change does NOT enhance OpenGL ES compatibility at all, or even begin the transition toward vertex arrays, but it does make the modifications necessary to enable these things.  You shouldn't even see a difference.  If you try it, and you see anything strange, please let me know.

I've made some comparisons between the current trunk and the GLES branch.  Perhaps the most significant is a minor drop in mapc performance.  One of uau's optimizations had to be removed because it depended on the old geom structure, which has changed. In addition, the surface smoother can actually provide added opportunities for SOL compaction, so a partial second optimization pass is made.

The trunk mapc processes all 364 maps in 1m24s on my MacBook Pro, and the GLES mapc does it in 1m51s.  That's an increase of 32%.  Both of these values give the timing of the second of two full runs, which helps isolate the effects of the file cache.

Offsetting this is a minor improvement in file size.  The 364 SOL files combined are 410 MB in the trunk, and 396 MB in GLES, with many of the main-game SOLs shrinking by 20%. Here's a Google spreadsheet that lays out these results in detail.

So we give a little and we get a little.  Ultimately though, renderer efficiency will improve drastically thanks to these changes, and of course the ES port will become possible.

16

Re: OpenGL ES

I love all the spreadsheets, the data spreads excite me :)

17

Re: OpenGL ES

FWIW, SOLs are 83M in the gles branch and 95M in trunk. The spreadsheet totals roughly confirm this, so it's just the totals in your post that are skewed.

18

Re: OpenGL ES

parasti wrote:

FWIW, SOLs are 83M in the gles branch and 95M in trunk. The spreadsheet totals roughly confirm this, so it's just the totals in your post that are skewed.

Oops!  My script that summed the file sizes accidentally included the size of the data directory in the total.  I didn't even stop to wonder why the total was so large.  Let's see, which thwomp best says it... D:

19

Re: OpenGL ES

RLK, your avatar says it exactly the same as the thwomp :D

Actually, your avatar IS a thwomp...

Great work so far!!

20

Re: OpenGL ES

Pickle wrote:

@Lazrhog indeed I am the same. I have read your comments about cancelling and it's understandable, but on the other hand the Pandora is a really nice device. Hopefully things are getting better with a constant flow of boards. I can easily support a Pandora version once OpenGL ES support is ready.

The Wiz and Caanoo are 1.X Lite. I did a little research and VBOs should be supported. The biggest concern with the Wiz is the small texture memory size, which if I remember right is 12 MB. The Caanoo, I think, has 32 MB.

On the iPhone version I did, the memory footprint was under 25 MB, as that is the limit.

21

Re: OpenGL ES

Lazrhog wrote:
Pickle wrote:

The Wiz and Caanoo are 1.X Lite. I did a little research and VBOs should be supported. The biggest concern with the Wiz is the small texture memory size, which if I remember right is 12 MB. The Caanoo, I think, has 32 MB.

On the iPhone version I did, the memory footprint was under 25 MB, as that is the limit.

Is that just texture memory or total memory consumption?
The Wiz has 64 MB, the Caanoo has 128 MB, and the Pandora has 256 MB of total memory.

22

Re: OpenGL ES

Total memory consumption. I went through and reduced a lot of textures from the standard desktop sizes.

23

Re: OpenGL ES

Wiz/Caanoo can definitely use VBOs (vertex buffer objects), which were an addition in 1.1. Using Termula under Wiz, it looks as though HALF the RAM is missing?!, i.e. only 16 MB available. This may be a constraint for a quick return to the system menu. I believe the VR card has its own dedicated memory, but don't quote me on that...

24

Re: OpenGL ES

GPH devices handle memory a bit oddly. The GP2X, for example, had the lower 32 MB directly accessible, while the upper 32 MB had to be manually allocated to be used. The Wiz is similar but not as extreme or as useful: 48 MB is directly accessible, with the other 16 MB only accessible through manual allocation, and at least 12 MB of that is for the 3D chip. So for Neverball to run, it's pretty much going to need to fit in 48 MB of system memory (minus whatever Linux uses) and 12 MB for graphics (textures).
One thing to note is that I experienced texture loss with Quake stuff: textures just didn't show up. So it should be possible to at least get something running even if the textures exceed that size.

25

Re: OpenGL ES

hmmm, I wonder if Termula (Wiz) managed to use the entire 16MB by disabling 3D? Funny that RAM measured exactly 16MB...
