Call overhead stats
Re: Call overhead stats
Hmmm, I think the correct result for linear searching is 55 for ten uniforms, and sorting requires more than 10 units.
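The figure of 55 presumably comes from looking up each of the ten uniforms once by linear scan, where the uniform stored at position i costs i comparisons: 1 + 2 + ... + 10 = 55. A minimal sketch of that counting, assuming this reading is the intended one; the function names are hypothetical and not part of Irrlicht:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts the comparisons a linear scan needs to find `name` in `names`.
// Returns names.size() if the name is absent (every entry was compared).
std::size_t comparisonsToFind(const std::vector<std::string>& names,
                              const std::string& name)
{
    std::size_t comparisons = 0;
    for (const std::string& candidate : names)
    {
        ++comparisons;
        if (candidate == name)
            break;
    }
    return comparisons;
}

// Total comparisons when every stored name is looked up exactly once.
std::size_t totalComparisons(const std::vector<std::string>& names)
{
    std::size_t total = 0;
    for (const std::string& name : names)
        total += comparisonsToFind(names, name);
    return total;
}

// Builds n distinct placeholder uniform names: "u0", "u1", ...
std::vector<std::string> makeUniformNames(std::size_t n)
{
    std::vector<std::string> names;
    for (std::size_t i = 0; i < n; ++i)
        names.push_back("u" + std::to_string(i));
    return names;
}
```

Calling totalComparisons(makeUniformNames(10)) adds up the per-lookup costs for ten uniforms, which is the n(n+1)/2 total discussed above.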
Library helping with network requests, tasks management, logger etc in desktop and mobile apps: https://github.com/GrupaPracuj/hermes
Re: Call overhead stats
Sorry to bother, but which machines does this affect? On my systems I have OpenGL 3.2 or better and most demos run at >1000 fps, which seems to imply this issue doesn't exist in my case (unless I'm presuming that the ms draw calls are faster or something on my machine).
Re: Call overhead stats
@ping-pong2012
It affects all platforms. With the current trunk you should get even better FPS, but the real performance boost should be visible in scenes more complicated than the examples included in Irrlicht (when the CPU has more work to do). In my last commit I added further improvements to this cache (e.g. a texture states cache).
Re: Call overhead stats
Hm, I think Yoran ran the shader example on Friday and FPS improved there as well :-)
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Re: Call overhead stats
Nadro, the last commit looks wrong (the getCacheStatus). Everything you protect by it is per-unit state, but the cacheStatus is per-texture-object.
edit: I see it's inverted, so it's used as an additional reset. Meaning more calls. Is it necessary?
Re: Call overhead stats
Hi,
What about getCacheStatus? (I reorganized the caching system in my last commit, so it's now the IsCached variable.) It is set to false only when resetAllRenderStates is required (that's an Irrlicht requirement) and on the first use of a texture. We can minimize the number of "if" checks by duplicating some GL code, e.g.:
if (!IsCached)
{
    // call glTexParameteri for each texture state
    // store the values in the texture state cache
    // mark the cache as valid (IsCached = true)
}
else
{
    if (cached value differs from material value)
    {
        // call glTexParameteri
        // store the value in the texture state cache
    }
    // handle the other parameters the same way as above
}
Currently the "IsCached" check is done separately for each texture state, but I don't think 5 more "if" conditions are a problem performance-wise.
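The two-branch idea in this post can be sketched as a small per-texture-object cache. This is only an illustration under stated assumptions, not the code from the commit: applyTexParameter is a hypothetical stand-in for glTexParameteri that just counts invocations (so the sketch runs without an OpenGL context), and the struct, field names and the state count of 5 are assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for glTexParameteri; it only counts invocations so
// the sketch runs without an OpenGL context.
static int g_driverCalls = 0;
void applyTexParameter(std::size_t /*state*/, int /*value*/) { ++g_driverCalls; }

constexpr std::size_t StateCount = 5; // assumed number of cached texture states

// Per-texture-object cache of the last values sent to the driver.
struct TextureStateCache
{
    std::array<int, StateCount> values{};
    bool isCached = false; // false on first use or after a full reset
};

// Sends every state when the cache is not valid yet; otherwise sends only
// the states whose wanted value differs from the cached one.
void setTextureStates(TextureStateCache& cache,
                      const std::array<int, StateCount>& wanted)
{
    if (!cache.isCached)
    {
        for (std::size_t i = 0; i < StateCount; ++i)
        {
            applyTexParameter(i, wanted[i]);
            cache.values[i] = wanted[i];
        }
        cache.isCached = true;
        return;
    }

    for (std::size_t i = 0; i < StateCount; ++i)
    {
        if (cache.values[i] != wanted[i])
        {
            applyTexParameter(i, wanted[i]);
            cache.values[i] = wanted[i];
        }
    }
}
```

With this layout the validity flag is tested once per call instead of once per state. Calling setTextureStates twice with the same values issues five driver calls the first time and none the second; changing a single value afterwards issues exactly one.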
Re: Call overhead stats
The issue is that it doesn't make sense. Why cache a per-unit value with a per-object flag? It's just wrong.
In addition, it doesn't seem to hurt, but the only effect it can have is negative: more GL calls.
Re: Call overhead stats
Isn't glTexParameteri tied to the texture object (the one bound on the unit currently activated by glActiveTexture)? I have this info from this link: http://www.opengl.org/discussion_boards ... r-glTexEnv
BTW, where do you see more GL calls? Do you mean the version with the (wrong?) cache compared to a (proper?) cache system, or compared to no cache at all? Without the cache, glTexParameteri was called in each setBasicRenderStates call; now the calls are reduced considerably. I checked everything with gDEBugger and it all works well.
Re: Call overhead stats
You're right, I remembered it the other way.
Re: Call overhead stats
OK, so I'm going to improve the other parts that can be cached.
Re: Call overhead stats
Today I did a small test with gDEBugger in our 10.Shaders example.
In v1.8 we have 1085 OGL calls per frame.
In the latest trunk we have 391 calls per frame.
As you can see, even in such small scenes the improvement is really big. Because of our OpenGL driver design (mixed OGL 1.x and OGL 2.x) it will be hard to efficiently remove all unnecessary OGL calls, but the upcoming OpenGL 3 core profile driver will have a very efficient OGL call handler, so the CPU overhead on the engine side will be really small; it will definitely be the fastest Irrlicht driver.
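Expressed as a ratio, going from 1085 to 391 calls per frame removes 694 calls, or roughly 64% of the per-frame OGL calls. A tiny sketch of that arithmetic; callReduction is a hypothetical helper name used only for illustration:

```cpp
#include <cassert>

// Fraction of calls removed when going from `before` to `after` calls per frame.
constexpr double callReduction(int before, int after)
{
    return static_cast<double>(before - after) / static_cast<double>(before);
}
```

callReduction(1085, 391) evaluates to 694 / 1085, which is about 0.64.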
Re: Call overhead stats
Nice, sooner or later we're getting there: a nice, fast Irrlicht.
Also, I thought I'd just mention: on my laptop with Nvidia Optimus graphics (an Intel HD4000 on the CPU die and an Nvidia GT630 on PCI-E), for some reason I get considerably higher FPS in Irrlicht with the Intel card (which is OGL 3.3, 16 shader units, less memory bandwidth) than with the Nvidia card (OGL 4.3, 96 shader units, double the memory bandwidth).
Any ideas as to why that might be? It just seems wrong that a worse card does better. Granted, the Irrlicht examples are quite simple, and in more complex scenes the Nvidia card comes out on top. It still should not lag behind the Intel card even for the simple examples, so what's going on?
Re: Call overhead stats
Yeah, I noticed a similar thing. My main laptop runs the terrain demo at 800 fps, but currently I'm on a laptop that has only an Intel IGP, and the terrain demo runs at about the same speed and sometimes even a little faster. That's awkward, and even worse, it's the same on OGL and DirectX.
Re: Call overhead stats
Will you also optimize DirectX calls?
"There is nothing truly useless, it always serves as a bad example". Arthur A. Schmitt
Re: Call overhead stats
I have no such plans, I will optimize only OpenGL calls.