Call overhead stats
Here are some stats on the GL calls of a single frame, from my irr-using project.
This is one of the later frames, so no initialization calls are there to mix up
measurements. The frame was picked at random.
Total number of calls in one frame: 19677 (~20k!)
Average call took: 0.253 usec
Total CPU time spent in these calls: 4994 usec (~5ms!)
----
So far looking pretty bad. Let's cut out all draw calls and remeasure (glDraw*).
All calls, without draw calls took: 3111 usec (~3ms!)
Still about 3ms of overhead.
----
Most expensive calls, after cutting out draw calls, glFlush, and glClear:
glBeginQueryARB, glBindFramebufferEXT, glClientActiveTexture.
----
Calls, sorted by how many times each was called during this one frame:
http://pastebin.com/RS4621aE
We see here that a couple thousand calls are made to set texture parameters that aren't used
in most meshes (texture slots 2 and 3), and to set texture matrices that aren't used at all
in this app.
Also, normalizing, multisampling, and fog keep getting disabled without ever being enabled.
The uniform cache should also help. (how's it going by the way?)
----
The full data (1.6mb unpacked, 113kb compressed) for this frame is here:
http://7swbff.alterupload.com/en/
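For anyone who wants to reproduce this kind of summary from their own trace, here is a minimal sketch of the arithmetic behind the numbers above. The `CallRecord` layout and the function names are assumptions made up for illustration; they are not the format of the uploaded data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical trace record: one GL call and the CPU time it took.
struct CallRecord
{
    std::string name;
    double usec;
};

struct FrameStats
{
    std::size_t calls = 0;     // total number of calls in the frame
    double totalUsec = 0.0;    // CPU time spent in all calls
    double nonDrawUsec = 0.0;  // same, with glDraw* filtered out

    double averageUsec() const { return calls ? totalUsec / calls : 0.0; }
};

// True for glDrawArrays, glDrawElements and any other name starting with glDraw.
inline bool isDrawCall(const std::string& name)
{
    return name.compare(0, 6, "glDraw") == 0;
}

inline FrameStats summarize(const std::vector<CallRecord>& frame)
{
    FrameStats s;
    for (const CallRecord& c : frame)
    {
        assert(c.usec >= 0.0); // a negative duration means a broken trace
        ++s.calls;
        s.totalUsec += c.usec;
        if (!isDrawCall(c.name))
            s.nonDrawUsec += c.usec;
    }
    return s;
}
```

Feeding one frame's records to `summarize` gives the call count, the total, the average, and the total without draw calls in a single pass.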
Re: Call overhead stats
I think most of these overhead calls will be eliminated in the OpenGL 3 driver.
Library helping with network requests, tasks management, logger etc in desktop and mobile apps: https://github.com/GrupaPracuj/hermes
Re: Call overhead stats
Too bad that I'm targeting GL 2 hardware, eh?
Certainly not a reason not to improve the current driver. Is there a legit reason to set texture params for unused slots?
Re: Call overhead stats
Of course we will try to merge all OGL3-related improvements that are compatible with OGL2 into the existing OGL2 driver, but first I have to prepare a proper implementation of the GL3 driver.
-
- Admin
- Posts: 14143
- Joined: Wed Apr 19, 2006 9:20 pm
- Location: Oldenburg(Oldb), Germany
- Contact:
Re: Call overhead stats
Well, the reason is that we have no full state tracking. The texture-based states are especially hard to track, as we would have to keep the state for each texture. Instead, we set some states regardless of the previous state. This shouldn't be called for empty texture slots, though, should it? Also, if you have a good idea for cheap state tracking for those other features, we could cache this as well and avoid both the expensive calls and the sheer number of calls.
Which calls for normalization, multisampling and fog do you mean?
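One cheap way to track the simple on/off states could look like the sketch below: a shadow copy per capability, with the driver call made only when the requested value differs. Everything here is hypothetical (the `CapabilityCache` class, the capability ids, and the `fakeGl*` functions that stand in for `glEnable`/`glDisable` so the sketch compiles without a GL context); it is not how the Irrlicht driver is written.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for glEnable/glDisable that only count calls.
static std::size_t g_driverCalls = 0;
static void fakeGlEnable(unsigned /*cap*/)  { ++g_driverCalls; }
static void fakeGlDisable(unsigned /*cap*/) { ++g_driverCalls; }

// Hypothetical capability ids (not real GL enum values).
enum Capability : unsigned { CapFog = 0, CapNormalize, CapMultisample, CapCount };

class CapabilityCache
{
public:
    // Forward the change to the driver only if it differs from the shadow copy.
    void set(Capability cap, bool enabled)
    {
        assert(cap < CapCount);
        const State wanted = enabled ? On : Off;
        if (States[cap] == wanted)
            return; // redundant call, filtered out
        if (enabled)
            fakeGlEnable(cap);
        else
            fakeGlDisable(cap);
        States[cap] = wanted;
    }

    // Forget everything, e.g. after code outside the engine touched GL state.
    void invalidate()
    {
        for (State& s : States)
            s = Unknown;
    }

private:
    enum State : unsigned char { Unknown, Off, On };
    State States[CapCount] = { Unknown, Unknown, Unknown };
};
```

The `Unknown` state means the first request after startup or after `invalidate()` always reaches the driver, so the cache never has to assume what the GL defaults are.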
Re: Call overhead stats
Quote: Well, the reason is that we have no full state tracking. The texture-based states are especially hard to track, as we would have to keep the state for each texture. Instead, we set some states regardless of the previous state. This shouldn't be called for empty texture slots, though, should it?
Yes, many params are being set for empty textures.
Quote: Also, if you have a good idea for cheap state tracking for those other features, we could cache this as well and avoid both the expensive calls and the sheer number of calls.
I'll look into each of these later, just wanted to get the numbers out now.
@Nadro
You spoke earlier of the uniform cache, does it exist currently?
Quote: Which calls for normalization, multisampling and fog do you mean?
glDisable for each.
Re: Call overhead stats
The glDisable calls should only occur once per frame, as most of them are guarded by the "if (reset || state_changed)" tests. I don't know why they occur for each meshbuffer here. Basically the same holds for the calls to glTexParameteri: all of them are made in a loop which skips empty texture slots. Only the texture matrix calls are made unconditionally. IIRC, I did this in order to have unconditional access to all texture matrices, independent of the textures in the same slot. And since most cases just load an identity matrix, I hoped to have only minor overhead here. A change could be considered here, though.
Maybe you could step through the render call and check why both tests above fail?
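If the unconditional texture matrix calls were to be changed, one option might be to remember the last matrix loaded per slot and skip the upload when nothing changed. The sketch below assumes four texture slots and uses a made-up `fakeLoadTextureMatrix` counter in place of the real GL matrix calls; `TextureMatrixCache` and `kMaxSlots` are hypothetical names as well.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Matrix4 = std::array<float, 16>;

inline Matrix4 identityMatrix()
{
    Matrix4 m{}; // value-initialized to all zeros
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

// Assumed number of texture slots for this sketch.
constexpr std::size_t kMaxSlots = 4;

// Hypothetical stand-in for the real matrix upload; it only counts calls.
static std::size_t g_matrixUploads = 0;
static void fakeLoadTextureMatrix(std::size_t /*slot*/, const Matrix4& /*m*/)
{
    ++g_matrixUploads;
}

class TextureMatrixCache
{
public:
    // Upload only when the slot has no known value yet or the value changed.
    void set(std::size_t slot, const Matrix4& m)
    {
        assert(slot < kMaxSlots);
        if (Valid[slot] && Current[slot] == m)
            return;
        fakeLoadTextureMatrix(slot, m);
        Current[slot] = m;
        Valid[slot] = true;
    }

private:
    Matrix4 Current[kMaxSlots] = {};
    bool Valid[kMaxSlots] = { false, false, false, false };
};
```

With this, a frame in which every mesh sets identity on all slots costs one upload per slot instead of one per slot per mesh, while the matrices stay settable independently of the textures.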
Re: Call overhead stats
Too much trouble to do that right now (debug build with visible symbols), but COpenGLSLMaterialRenderer.cpp:212 looks highly suspect:
BaseMaterial->OnSetMaterial(material, material, true, this);
Note the true.
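To illustrate why a hard-coded true would matter, here is a reduced sketch of a "reset || changed" guard of the kind mentioned earlier in the thread. The `Material` struct, `applyMaterial`, `renderFrame`, and the counting `countGlDisable` are all made up for this example; the real OnSetMaterial handles far more state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical capability id and a stand-in for glDisable that counts calls.
const unsigned kHypotheticalFogCap = 0;
static std::size_t g_disableCalls = 0;
static void countGlDisable(unsigned cap)
{
    assert(cap == kHypotheticalFogCap); // the sketch only knows about fog
    ++g_disableCalls;
}

// Heavily reduced material: just the fog flag.
struct Material
{
    bool FogEnable = false;
};

// Resend the state when a reset is forced or when the value changed.
inline void applyMaterial(const Material& material, const Material& last, bool reset)
{
    if (reset || material.FogEnable != last.FogEnable)
    {
        if (!material.FogEnable)
            countGlDisable(kHypotheticalFogCap);
        // enable path left out of this sketch
    }
}

// Draws meshBuffers buffers that all share one material and returns how many
// glDisable calls that caused.
inline std::size_t renderFrame(std::size_t meshBuffers, bool forceReset)
{
    g_disableCalls = 0;
    const Material material{};
    for (std::size_t i = 0; i < meshBuffers; ++i)
        applyMaterial(material, material, forceReset);
    return g_disableCalls;
}
```

In this sketch, forcing the reset turns the guard into a no-op: the disable is issued once per meshbuffer even though the material never changes. A real driver would still need at least one genuine reset (for example at context creation) so that the tracked state starts out correct.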
Re: Call overhead stats
Ah ok, I usually only do fixed pipeline stuff, so I'm absolutely not sure about the reason for the reset here. It could be that this also stems from ancient times when the material settings were not fully synced, and a reset was a simple way to get the states right for sure. We should consider this something to be fixed in 1.9 then.
Re: Call overhead stats
Because 1.8 is frozen, a uniform cache will be implemented in 1.9. It should be one of the first commits in 1.9.
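As a rough idea of what such a uniform cache could look like (this is only a sketch, not the planned implementation): keep the last uploaded values per uniform location and skip the upload when they are unchanged. `UniformCache` and `fakeGlUniformfv` are hypothetical names; the latter stands in for the glUniform*fv family and only counts calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for glUniform*fv that only counts uploads.
static std::size_t g_uniformUploads = 0;
static void fakeGlUniformfv(int location, const float* /*values*/, std::size_t /*count*/)
{
    assert(location >= 0); // a negative location would mean "uniform not found"
    ++g_uniformUploads;
}

// One instance per shader program, since uniform values are per-program state.
class UniformCache
{
public:
    void setFloats(int location, const float* values, std::size_t count)
    {
        std::vector<float>& cached = Values[location];
        if (cached.size() == count &&
            std::equal(values, values + count, cached.begin()))
            return; // same values as the last upload, skip the call
        fakeGlUniformfv(location, values, count);
        cached.assign(values, values + count);
    }

private:
    std::unordered_map<int, std::vector<float>> Values;
};
```

A location that has never been set has an empty cached vector, so its first upload always goes through.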
Re: Call overhead stats
Nadro wrote: Because 1.8 is frozen, a uniform cache will be implemented in 1.9. It should be one of the first commits in 1.9.
Any modern VCS has branches - such new things could be in feature branches, all ready to be merged once trunk is unfrozen. Feature freeze has no effect on development in modern times.
It's ok, I just wanted to know if it's done yet or not, to avoid duplicating work. Perhaps I'll write one this weekend then.
BTW, it also seems that zero materials make use of the *services param. That's an API design flaw that would've been found with -Wunused-parameter.
Re: Call overhead stats
You might need this, though, if you require special access to the driver internals. At least that's my understanding of the param. As with other occasions of this compiler parameter, I highly doubt that it has practical use for library APIs with much object orientation.
Re: Call overhead stats
hendu wrote: Any modern VCS has branches - such new things could be in feature branches, all ready to be merged once trunk is unfrozen. Feature freeze has no effect on development in modern times :D
Think of it as developer motivation to get a version out (also, we already have the 1.7, trunk, ogl-ES and shader-pipeline branches, which is already more active branches than active developers).
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Re: Call overhead stats
CuteAlien wrote: Think of it as developer motivation to get a version out (also we already have 1.7, trunk, ogl-ES and shader-pipeline branches which are already more active branches than active developers).
Those are quite long-lived branches; feature branches like these should be cheap and cheerful. And inactive until deleted, unless more work is needed on the feature.
So, when's 1.8 releasing with this new motivation?
Re: Call overhead stats
The named branches are the active ones. The only dead one, which was just never deleted, is the animation system branch.
1.8 will go out once we have fixed the shadow bugs, and checked that the transparency bugs are really fixed.