HW instancing test

hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

HW instancing test

Post by hendu »

I was curious about hardware instancing, so I quickly wrapped the extension to test it. OpenGL only; requires GL2 plus the ARB_draw_instanced extension.

In case anyone finds these useful, here is a two-part patch on top of 1.7.2:

http://pastebin.ca/2074037
http://pastebin.ca/2074038
Radikalizm
Posts: 1215
Joined: Tue Jan 09, 2007 7:03 pm
Location: Leuven, Belgium

Post by Radikalizm »

Do you have any numbers on the performance gain you're getting with this?
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Post by hendu »

I'm afraid not; for lack of suitable hardware I'm working on llvmpipe.

edit: it did cut the measured polys from 6k to 90 (I adapted the pseudo-instancing sphere test from here). So in fillrate-bound situations it should help a lot.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Post by devsh »

It won't help with fillrate at all; it will help with draw call overhead, vertex data upload, and vertex throughput.
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Post by hybrid »

I don't see why a newly introduced method is required. Can't we just add a new parameter (with default value 1) to drawMeshBuffer? Also, could you please add a small example showing how the vertex shaders would get the proper data for positioning and how they would react to the counter value? Once I have test code I can easily add this, also for d3d9, which has, IIRC, a distinct draw call for this case.
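
For illustration only, the default-parameter idea could look roughly like the sketch below. SketchMeshBuffer and SketchDriver are hypothetical stand-ins, not Irrlicht's classes, and the driver only records which path it would take instead of issuing GL calls. A real OpenGL driver would presumably call glDrawElements on the plain path and glDrawElementsInstancedARB on the instanced one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a mesh buffer; not Irrlicht's IMeshBuffer.
struct SketchMeshBuffer {
    std::size_t indexCount;
};

// Hypothetical driver that only logs which draw path would be used.
struct SketchDriver {
    std::vector<std::string> log;

    // instanceCount defaults to 1, so existing callers keep working unchanged.
    void drawMeshBuffer(const SketchMeshBuffer& mb, unsigned instanceCount = 1) {
        (void)mb; // a real driver would read the index count and buffers here
        if (instanceCount <= 1) {
            // Plain path (glDrawElements in a real GL driver).
            log.push_back("plain");
        } else {
            // Instanced path (glDrawElementsInstancedARB in a real GL driver).
            log.push_back("instanced:" + std::to_string(instanceCount));
        }
    }
};
```

Old call sites such as driver.drawMeshBuffer(mb) take the plain path, while driver.drawMeshBuffer(mb, 60) takes the instanced one.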
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Post by hendu »

I know this is a dirty way with more duplication than necessary; it was to get it running quickly.
There is some need for separation, because it does not make sense to instance something that is not in a hw buffer (vbo / EHM_STATIC).

I'm not in a hurry to get it merged, since as mentioned I don't have any capable hardware right now to test with.

Re example - I used the same method as the pseudo-instancing code snippet, passing a uniform array of 60 matrices. One could also use texcoords. But shaders are required; there's no fixed-function way to handle gl_InstanceIDARB.

The proper (= best performing) way to pass this data in would be via instanced arrays, which is another extension. I have not tested this yet, as I'd need the hw to do my grass etc. with this technique.

Vertex shader:

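Code:

#extension GL_ARB_draw_instanced : enable

// Combined view-projection matrix, shared by all instances.
uniform mat4 viewProjection;
// One world matrix per instance; the array size caps instances per draw at 60.
uniform mat4 instanceWorldArray[60];

void main() {

	// gl_InstanceIDARB selects this instance's world matrix.
	mat4 wp = viewProjection * instanceWorldArray[gl_InstanceIDARB];
	gl_Position = wp * gl_Vertex;

	gl_TexCoord[0] = gl_MultiTexCoord0;
}
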
The code to set those matrices is the same as in the pseudo-instancing code:
http://irrlicht.sourceforge.net/phpBB2/ ... instancing
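
Since the uniform array holds 60 matrices, more than 60 instances have to be split over several instanced draws. A minimal sketch of that bookkeeping follows; it is not taken from the patch, and InstanceBatch and splitIntoBatches are made-up names. Each batch would mean: upload `count` world matrices starting at instance `first`, then issue one instanced draw with that count.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Matches the size of instanceWorldArray in the vertex shader.
const std::size_t kMaxInstancesPerDraw = 60;

// One instanced draw covering instances [first, first + count).
struct InstanceBatch {
    std::size_t first;
    std::size_t count;
};

// Split totalInstances into batches of at most kMaxInstancesPerDraw.
std::vector<InstanceBatch> splitIntoBatches(std::size_t totalInstances) {
    std::vector<InstanceBatch> batches;
    for (std::size_t first = 0; first < totalInstances; first += kMaxInstancesPerDraw) {
        const std::size_t count = std::min(kMaxInstancesPerDraw, totalInstances - first);
        batches.push_back({first, count});
    }
    return batches;
}
```

For example, 150 instances come out as three draws of 60, 60 and 30 instances.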
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: HW instancing test

Post by devsh »

Think about revisiting the subject and using tessellation shaders to create the extra copies (triangles quickly) so you only upload one mesh buffer. Tessellation makes 1056 triangles per triangle of input at the highest levels. This means 1056 triangles at 30x the processing cost (which is pretty good); your only problem then is how to pass the transformations.

Tessellation being 440x faster than creating triangles with a geometry shader would really win.
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: HW instancing test

Post by hendu »

Tessellation would limit the hardware to GL4 cards, while instancing has much wider hw support. I'd think it would be heavier too?
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: HW instancing test

Post by devsh »

Think about HW instancing: it requires an OGL 3.0 card. Two or three years ago you'd have said the same thing about what you did a few months ago (not many people have that card, etc.). Tessellation is ideal for high-polygon meshes where storing hundreds of extra duplicates would kill your caches. Think of an 8k-vertex mesh; that is 0.5 MB of data. 100 instances make 50 MB in the mesh buffer, way past the optimum amount of vertex data for one draw call.
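
The arithmetic behind those numbers, as a rough sketch. The 64 bytes per vertex is an assumption inferred from "8k vertices = 0.5 MB", and the function names are made up. For comparison it also computes the footprint when only one mesh copy is stored plus one 4x4 float matrix per instance.

```cpp
#include <cstddef>

// Assumption: 0.5 MB / 8192 vertices works out to 64 bytes per vertex.
const std::size_t kBytesPerVertex = 64;
// A 4x4 matrix of 4-byte floats.
const std::size_t kBytesPerMatrix = 16 * 4;

// Every instance stored as a full copy of the mesh in the buffer.
std::size_t duplicatedBytes(std::size_t vertices, std::size_t copies) {
    return vertices * kBytesPerVertex * copies;
}

// One mesh copy plus one world matrix per instance.
std::size_t instancedBytes(std::size_t vertices, std::size_t instances) {
    return vertices * kBytesPerVertex + instances * kBytesPerMatrix;
}
```

With 8192 vertices and 100 instances, the duplicated layout takes 52,428,800 bytes (50 MB), while one copy plus matrices takes 530,688 bytes (just over 0.5 MB).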
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: HW instancing test

Post by hendu »

I was talking about proper instancing, where only one copy is stored?
3DModelerMan
Posts: 1691
Joined: Sun May 18, 2008 9:42 pm

Re: HW instancing test

Post by 3DModelerMan »

Instancing would need to work in D3D9 also. D3D9 has full support for it.
That would be illogical captain...

My first full game:
http://www.kongregate.com/games/3DModel ... tor#tipjar
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: HW instancing test

Post by devsh »

I am talking about storing only one mesh copy, but generating the instances a lot faster than geometry instancing.
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: HW instancing test

Post by hendu »

Benchmarks please :)