Call overhead stats

hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Call overhead stats

Post by hendu »

This doesn't have much to do with shaders; it's about reducing Irrlicht's CPU overhead.
I.e., Irrlicht was doing work it didn't need to do.


Books on optimization? Drepper has written several papers on the topic; they're freely available on his homepage.

I recommend the optimization tutorials 1 and 2, and "What every programmer should know about memory".
ACE247
Posts: 704
Joined: Tue Mar 16, 2010 12:31 am

Re: Call overhead stats

Post by ACE247 »

Just to check in again, are any of these changes actually going to be applied to svn? I see they are just hanging there on the tracker, or are the main irr devs too busy currently?
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Re: Call overhead stats

Post by hybrid »

Yes, we had a major release just a few days ago. Development will start again soon.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: Call overhead stats

Post by devsh »

Somebody could start off with an SSE3 implementation of vector3df, or at least matrix4, with 16-byte alignment (Declspec!)

and drop a lot of setTransform() calls.
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Call overhead stats

Post by hendu »

"SSE3 implementation of vector3df"
I've tried that. It was slower than the current, fully inline-able template version.
"and drop a lot of setTransform() calls"
I don't remember many of those being made in vain. The issue was that calling setTransform(identitymatrix) did not go through the glLoadIdentity fast path; it uploaded the identity matrix every time.
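
A minimal sketch of that fast path, for illustration only. `FakeGL` and `setTransformSketch` are hypothetical names, not Irrlicht code; the fake backend just records which GL call would be issued, so it runs without an OpenGL context.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the GL backend: records the call that would
// have been made instead of touching a real OpenGL context.
struct FakeGL {
    std::vector<std::string> calls;
    void loadIdentity() { calls.push_back("glLoadIdentity"); }
    void loadMatrix(const float*) { calls.push_back("glLoadMatrixf"); }
};

// 4x4 identity (identical in row-major and column-major layout).
static const float kIdentity[16] = {
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1};

// Bitwise compare: anything that is not exactly the identity (including a
// matrix containing -0.0f) simply takes the general path.
inline bool isIdentity(const float* m) {
    return std::memcmp(m, kIdentity, sizeof(kIdentity)) == 0;
}

// Hypothetical driver-side helper, not Irrlicht's actual setTransform.
inline void setTransformSketch(FakeGL& gl, const float* m) {
    assert(m != nullptr);
    if (isIdentity(m))
        gl.loadIdentity();  // fast path: no 16-float upload
    else
        gl.loadMatrix(m);   // general path: upload the full matrix
}
```

Whether the check pays off depends on how often identity matrices are actually passed in; the comparison itself is not free.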
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Re: Call overhead stats

Post by hybrid »

Yeah, these fixes will be the first ones to go into the engine once I come back to working on it.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: Call overhead stats

Post by devsh »

I think a number of us tried to do SSE vector but never actually used SSE correctly
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Call overhead stats

Post by hendu »

Oh, the actual calculations were faster. But vector3dfs are very short-lived: you usually create one, do one or two calculations, and extract the components.

And the vector packing and unpacking overhead was more than what was gained from the faster calculation.
Nadro
Posts: 1648
Joined: Sun Feb 19, 2006 9:08 am
Location: Warsaw, Poland

Re: Call overhead stats

Post by Nadro »

This week I'll be back to Irrlicht development (I've been very busy lately), so the patches from this thread will be integrated into core soon.
Library helping with network requests, tasks management, logger etc in desktop and mobile apps: https://github.com/GrupaPracuj/hermes
Nadro
Posts: 1648
Joined: Sun Feb 19, 2006 9:08 am
Location: Warsaw, Poland

Re: Call overhead stats

Post by Nadro »

I merged into trunk the patches related to resetting materials, depth and matrices. I also sent a commit that improves shader constant handling. The OpenGL get-location call overhead is also fixed in this revision.

Now a constant ID is returned by getVertex/PixelShaderConstantID. After a shader is created, you should call this method only once for each constant. Once you have a constant ID, you have to pass it to setVertex/PixelShaderConstant.
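
The usage pattern described above could look roughly like this. It is a sketch with a mock: `MockServices` and `CachedCallback` are hypothetical stand-ins, their signatures are assumptions and not the engine's actual interface, and the constant names are only examples.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the engine's shader services. The method names
// mirror the ones mentioned in the post; the signatures are assumed.
class MockServices {
public:
    explicit MockServices(std::vector<std::string> names)
        : names_(std::move(names)) {}

    // Name lookup: intended to run once per constant, after shader creation.
    int getVertexShaderConstantID(const std::string& name) {
        ++lookups;
        for (std::size_t i = 0; i < names_.size(); ++i)
            if (names_[i] == name)
                return static_cast<int>(i);
        return -1;  // unknown constant
    }

    // Upload by ID: no string handling on the per-frame path.
    bool setVertexShaderConstant(int id, const float* data, int count) {
        if (id < 0 || static_cast<std::size_t>(id) >= names_.size() ||
            data == nullptr || count <= 0)
            return false;
        ++uploads;
        return true;
    }

    int lookups = 0;  // how many name lookups were performed
    int uploads = 0;  // how many uploads succeeded

private:
    std::vector<std::string> names_;
};

// Hypothetical callback: resolves the ID on first use, then reuses it.
class CachedCallback {
public:
    void onSetConstants(MockServices& services, const float* worldViewProj) {
        if (!resolved_) {
            wvpId_ = services.getVertexShaderConstantID("mWorldViewProj");
            resolved_ = true;
        }
        services.setVertexShaderConstant(wvpId_, worldViewProj, 16);
    }

private:
    bool resolved_ = false;
    int wvpId_ = -1;
};
```

Calling `onSetConstants` every frame then costs one name lookup in total, regardless of how many frames are rendered.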
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Call overhead stats

Post by hendu »

I see, thanks.

Note that the get*ConstantID function is still a bit inefficient - I would add a sort right after all names have been added, and then use binary_search in get*ConstantID. Still, having these on the user's side does free resources.
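
A rough sketch of the sort-then-binary-search idea, assuming the names are collected into a table once after linking. `UniformTable` and its members are hypothetical, not the engine's data structure.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical table entry: a uniform name and the location it maps to.
struct UniformEntry {
    std::string name;
    int location;
};

class UniformTable {
public:
    void add(const std::string& name, int location) {
        entries_.push_back({name, location});
        sorted_ = false;
    }

    // Call once, right after all names have been added.
    void finalize() {
        std::sort(entries_.begin(), entries_.end(),
                  [](const UniformEntry& a, const UniformEntry& b) {
                      return a.name < b.name;
                  });
        sorted_ = true;
    }

    // Binary search by name; returns -1 if the name is unknown.
    int find(const std::string& name) const {
        assert(sorted_ && "finalize() must run before find()");
        auto it = std::lower_bound(
            entries_.begin(), entries_.end(), name,
            [](const UniformEntry& e, const std::string& key) {
                return e.name < key;
            });
        return (it != entries_.end() && it->name == name) ? it->location : -1;
    }

private:
    std::vector<UniformEntry> entries_;
    bool sorted_ = false;
};
```

Storing the location next to the name means sorting does not change the value handed back to the caller.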

I do wonder whether user pushback will be big - after all, all other wrappers accept names.
Nadro
Posts: 1648
Joined: Sun Feb 19, 2006 9:08 am
Location: Warsaw, Poland

Re: Call overhead stats

Post by Nadro »

Hmmm... we only have to call get*ConstantID once per constant, and I think the combination of sort + binary_search will be more expensive than a standard search (the difference should be really small). Of course binary_search would be useful if get*ConstantID were called many times, but that's not required.

Of course I can add sort + binary_search to the interface, but I'm not sure it's needed.

What about performance changes? I checked example no. 10 and saw a few more FPS than before :) I think that for heavily shader-based apps the boost should be more visible.
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Call overhead stats

Post by hendu »

The sort would only happen once, after shader linking?
Nadro
Posts: 1648
Joined: Sun Feb 19, 2006 9:08 am
Location: Warsaw, Poland

Re: Call overhead stats

Post by Nadro »

Yep, I know, but the time spent sorting will be equal to one standard search pass for get*ConstantID, and after the sort you still need the binary search. That's why I compared sort + binary search vs. standard search, because both combinations should be called only once.
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Call overhead stats

Post by hendu »

Huh, are we talking about the same thing?

Current cost for ten uniforms: 10 linear searches = 100 work units
Proposed cost: 1 sort + 10 binary searches = 10 + 23 = 33 work units
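
The figures above read as rough estimates. One way to check the trade-off on a given toolchain is to count name comparisons directly; the sketch below does that with hypothetical helpers (`linearCost`, `sortedCost`, `makeNames`). With an early-exit scan over n distinct names, looking each one up once costs n(n+1)/2 comparisons, i.e. 55 for ten names rather than the worst-case 100. The sorted figure depends on the input order and the standard library's sort, so it is measured rather than assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Comparisons needed to look up every name once with an early-exit linear scan.
inline std::size_t linearCost(const std::vector<std::string>& names) {
    std::size_t comparisons = 0;
    for (const std::string& key : names) {
        for (const std::string& candidate : names) {
            ++comparisons;
            if (candidate == key)
                break;
        }
    }
    return comparisons;
}

// Comparisons for one std::sort plus one binary search per name.
inline std::size_t sortedCost(std::vector<std::string> names) {
    std::size_t comparisons = 0;
    auto less = [&comparisons](const std::string& a, const std::string& b) {
        ++comparisons;
        return a < b;
    };
    const std::vector<std::string> keys = names;  // look up in original order
    std::sort(names.begin(), names.end(), less);
    for (const std::string& key : keys) {
        auto it = std::lower_bound(names.begin(), names.end(), key, less);
        ++comparisons;  // the final equality check after lower_bound
        assert(it != names.end() && *it == key);
        (void)it;
    }
    return comparisons;
}

// Hypothetical helper: n distinct placeholder names.
inline std::vector<std::string> makeNames(std::size_t n) {
    std::vector<std::string> names;
    for (std::size_t i = 0; i < n; ++i)
        names.push_back("uniform" + std::to_string(i));
    return names;
}
```

Comparing `linearCost(makeNames(n))` with `sortedCost(makeNames(n))` for the uniform counts a real shader uses shows where the crossover sits; at around ten names the two can be close, while the gap widens as n grows.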