Reducing CPU loading when using 100s scenenodes
I'm creating scenes with about 500 mesh cubes, where each cube face has its own surface. Load time is about 15 seconds on a 2 GHz PC with 2 GB of system memory and Intel G33 graphics. The Irrlicht examples run at 250 to 350 FPS (slower at higher resolutions), to give an idea of system speed (it isn't the fastest, but that's not the issue).
With this quantity of mesh cubes, and with 4 out of 6 cube faces mapped with a 1x1 pixel texture to reduce memory usage yet allow independent side coloring, the CPU is out of steam, even though there is hardly any animation: just, say, 4 meshes rotating slowly.
I've profiled the code and reduced the overhead of my own code to about 20% of CPU time, but the Irrlicht code is taking about 40-50% of the CPU, leaving little time for the system and the UI, which draws very slowly.
If anyone knows how Irrlicht works internally, and whether there is a way to reduce processing with 500 mesh-file scene nodes, please post here. Thanks
Last edited by robmar on Mon Aug 29, 2011 12:26 pm, edited 1 time in total.
-
Radikalizm
- Posts: 1215
- Joined: Tue Jan 09, 2007 7:03 pm
- Location: Leuven, Belgium
Re: Reducing CPU loading when using 100s scenenodes
It's surprising how many times this very same question pops up every month.
When you're drawing 500 independent cube scene nodes, you're making 500 separate draw calls, which slows down your rendering process a lot.
You need to batch your cubes together to reduce draw calls; check the project announcements forum for an implementation.
Also, next time try to use the search function first; this question has been asked and answered a lot of times in the past.
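The batching idea above can be sketched without any engine code. This is only an illustration of the principle, not the implementation from the project announcements forum: the types Vertex, Mesh and Offset and the function appendToBatch are hypothetical, and a single triangle stands in for a cube to keep it short. The key step is shifting each source index by the number of vertices already in the batch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal mesh types; real engine buffers carry more data
// (normals, texture coordinates, colours).
struct Vertex { float x, y, z; };

struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

struct Offset { float x, y, z; };

// Appends `src`, translated by `offset`, to `batch`.
// Indices are shifted by the number of vertices already in the batch so that
// they keep pointing at their own vertices.
void appendToBatch(Mesh& batch, const Mesh& src, const Offset& offset) {
    const std::uint32_t base = static_cast<std::uint32_t>(batch.vertices.size());
    for (const Vertex& v : src.vertices)
        batch.vertices.push_back({v.x + offset.x, v.y + offset.y, v.z + offset.z});
    for (std::uint32_t i : src.indices) {
        assert(i < src.vertices.size());  // source indices must be valid
        batch.indices.push_back(base + i);
    }
}

// One triangle stands in for a cube here to keep the sketch short.
Mesh makeTriangle() {
    Mesh m;
    m.vertices = {{0.0f, 0.0f, 0.0f}, {1.0f, 0.0f, 0.0f}, {0.0f, 1.0f, 0.0f}};
    m.indices = {0u, 1u, 2u};
    return m;
}
```

Because the positions are baked into the merged buffer, this only suits objects that do not move independently; the few rotating meshes would stay as separate nodes.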
Re: Reducing CPU loading when using 100s scenenodes
I think the point is the CPU usage. You won't lower that by batch processing (at least on Windows systems from my experience), the framerate will go up but still the same CPU usage. Try IrrlichtDevice::yield or IrrlichtDevice::sleep with 1ms or so (http://irrlicht.sourceforge.net/docu/cl ... evice.html) to reduce CPU load (and framerate as well).
Re: Reducing CPU loading when using 100s scenenodes
The CreateDeviceEx() function will automatically regulate CPU usage, in my experience on Windows.
Regards,
Jake
Re: Reducing CPU loading when using 100s scenenodes
CreateDeviceEx only allows passing some more parameters and shouldn't affect CPU usage. I suppose Brainsaw got the answer - use sleep.
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
-
Radikalizm
- Posts: 1215
- Joined: Tue Jan 09, 2007 7:03 pm
- Location: Leuven, Belgium
Re: Reducing CPU loading when using 100s scenenodes
Jake007 wrote: CreateDeviceEx() function will automatically regulate CPU usage in my experience on Windows.
Regards,
Jake
CreateDeviceEx() does the same as CreateDevice(), you just get more parameters to play with; as far as I know CPU usage is purely managed by the OS scheduler.
@Brainsaw: If it's just the fact of the CPU almost maxing out (which is quite normal for a rendering engine) then you're right, but creating 500 independent cube nodes and preparing and rendering them all separately will affect both CPU and GPU drastically (and is just not really the best idea), so I suggest the OP tries both solutions at the same time
Re: Reducing CPU loading when using 100s scenenodes
The "500 Cubes" should of course be optimized, but some time ago when I developed the IrrEdit Project Manager (http://bulletbyte.de/products.php?sub=irr&show=iepm) I realized that it takes 50% of my DualCore CPU although just some GUI elements are drawn, so I added a sleep(10) and now it's OK (like 2%).
Re: Reducing CPU loading when using 100s scenenodes
Thanks everyone for the replies! I've optimized my code as much as I can, but the issue is still the initial start-up time required to load the 500 mesh-file cubes, and I guess without a faster graphics system and CPU there is little that can be done, easily if at all.
Once loaded, the frame rate is quite good even with the CPU at 96% loading (I calculate the Sleep time for the 3D thread based on the time it took to draw the last frame and the required frame-rate, but always give a minimum 3ms Sleep period to make sure the GUI remains responsive), but reduces to about 80 fps at higher resolutions than 640 x 480.
I had hoped to use the IsVisible() function to stop drawing updates to surfaces that have video or animations when not actually on the screen, but that function is protected!
Is there any fast way to determine that a scenenode is not visible to the current view so I can cut back on updating off-view objects?
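The sleep calculation described in this post (sleep for whatever is left of the frame budget, but never less than a minimum) might look roughly like the sketch below. The function name computeSleepMs and its parameters are hypothetical; the 3 ms floor is the value mentioned in the post.

```cpp
#include <algorithm>
#include <cassert>

// Returns how long the render thread should sleep, in milliseconds.
//   frameTimeMs: time spent drawing the last frame
//   targetFps:   required frame rate
//   minSleepMs:  floor that keeps the GUI responsive (3 ms in the post)
int computeSleepMs(int frameTimeMs, int targetFps, int minSleepMs) {
    assert(targetFps > 0);
    const int frameBudgetMs = 1000 / targetFps;       // time available per frame
    const int spareMs = frameBudgetMs - frameTimeMs;  // may be negative
    return std::max(spareMs, minSleepMs);
}
```

For example, with a 50 FPS target (20 ms budget) a frame that took 5 ms gives a 15 ms sleep, while a frame that took 30 ms still sleeps for the 3 ms minimum. The result could then be handed to whatever sleep call the render thread uses, such as the IrrlichtDevice::sleep mentioned earlier in the thread.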
Re: Reducing CPU loading when using 100s scenenodes
There is ISceneManager::isCulled which can check that. You can also set culling for scenenodes with ISceneNode::setAutomaticCulling.
-
hybrid
- Admin
- Posts: 14143
- Joined: Wed Apr 19, 2006 9:20 pm
- Location: Oldenburg(Oldb), Germany
Re: Reducing CPU loading when using 100s scenenodes
How do you load the meshes? Which format especially? I guess you can optimize a lot there. Maybe you can even get away from the mesh loading, and create the meshes on the fly. Just load some config file which tells you the surface information.
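As a rough illustration of creating a cube on the fly instead of parsing a mesh file, the sketch below builds 24 vertices (4 per face, so each face can carry its own surface data) and 36 indices. Everything here is an assumption made for the example: the type and function names are hypothetical, the per-face colour is a packed 32-bit value standing in for whatever surface information a config file would supply, and the winding order may need reversing to match the renderer's convention.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

struct ColoredVertex {
    Vec3 position;
    std::uint32_t color;  // packed colour, same value for all 4 vertices of a face
};

struct CubeMesh {
    std::vector<ColoredVertex> vertices;  // 4 per face, 24 in total
    std::vector<std::uint16_t> indices;   // 2 triangles per face, 36 in total
};

// Face normal n plus two in-plane axes u and v, chosen so that u x v = n.
struct FaceBasis { Vec3 n, u, v; };

CubeMesh makeCube(float size, const std::array<std::uint32_t, 6>& faceColors) {
    assert(size > 0.0f);
    static const FaceBasis faces[6] = {
        {{ 1, 0, 0}, {0, 1, 0}, {0, 0, 1}},   // +X
        {{-1, 0, 0}, {0, 0, 1}, {0, 1, 0}},   // -X
        {{ 0, 1, 0}, {0, 0, 1}, {1, 0, 0}},   // +Y
        {{ 0,-1, 0}, {1, 0, 0}, {0, 0, 1}},   // -Y
        {{ 0, 0, 1}, {1, 0, 0}, {0, 1, 0}},   // +Z
        {{ 0, 0,-1}, {0, 1, 0}, {1, 0, 0}},   // -Z
    };
    static const float signs[4][2] = {{-1, -1}, {1, -1}, {1, 1}, {-1, 1}};
    static const std::uint16_t quad[6] = {0, 1, 2, 0, 2, 3};

    const float h = size * 0.5f;  // half edge length
    CubeMesh mesh;
    for (std::size_t f = 0; f < 6; ++f) {
        const FaceBasis& b = faces[f];
        const std::uint16_t base = static_cast<std::uint16_t>(mesh.vertices.size());
        for (const auto& s : signs) {
            // Corner = face centre plus/minus the two in-plane axes.
            Vec3 p{h * (b.n.x + s[0] * b.u.x + s[1] * b.v.x),
                   h * (b.n.y + s[0] * b.u.y + s[1] * b.v.y),
                   h * (b.n.z + s[0] * b.u.z + s[1] * b.v.z)};
            mesh.vertices.push_back({p, faceColors[f]});
        }
        for (std::uint16_t i : quad)
            mesh.indices.push_back(static_cast<std::uint16_t>(base + i));
    }
    return mesh;
}
```

Generating the geometry this way avoids parsing a file per cube; only the six colour values (or similar surface settings) would need to come from configuration.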
Re: Reducing CPU loading when using 100s scenenodes
Thanks for the replies, the isCulled method is useful to know, so many methods!
That's a good point about the meshes. I use getMesh to load an XML-defined .irrmesh cube with 6 surfaces. I suppose I could load that mesh just once and reuse it on all the "mesh cubes", as I guess there is no cache for meshes, is that correct?
I then use addAnimatedMeshSceneNode with the mesh to create the scenenode.
And finally attach from surfaces previously created.
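The "load once and reuse" idea raised in this post can be sketched in general terms as a small cache keyed by filename. This is not Irrlicht code and says nothing about what the engine does internally; CachedMesh, MeshCache and loadCount are hypothetical names, and the load step just records the filename instead of parsing a file.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Placeholder for real mesh data; it only remembers where it came from.
struct CachedMesh { std::string source; };

class MeshCache {
public:
    // Returns the shared mesh for `filename`, loading it on the first request only.
    std::shared_ptr<const CachedMesh> get(const std::string& filename) {
        auto it = cache_.find(filename);
        if (it != cache_.end())
            return it->second;  // already loaded: hand out the same instance
        auto mesh = std::make_shared<const CachedMesh>(load(filename));
        cache_.emplace(filename, mesh);
        return mesh;
    }

    int loadCount() const { return loadCount_; }

private:
    // Stands in for the expensive step of reading and parsing the file.
    CachedMesh load(const std::string& filename) {
        assert(!filename.empty());
        ++loadCount_;
        return CachedMesh{filename};
    }

    std::map<std::string, std::shared_ptr<const CachedMesh>> cache_;
    int loadCount_ = 0;
};
```

With a cache like this, 500 requests for the same cube file cost one parse; all nodes then share the same geometry and differ only in their transforms and materials.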
Re: Reducing CPU loading when using 100s scenenodes
I just tested isCulled, and as the scenenode under test goes off-view, isCulled returns true for a brief time, then it returns false even though the object is still off-view. The object is a standard cube scenenode that otherwise seems to be fine, with one mapped surface, so I've no idea yet why isCulled returns false when it is off-view. The camera is rotating in the centre of a circle of objects, the cube object has frame-rate updates occurring to its side surfaces, and isCulled is meant to halt those updates when the view of the camera rotates off it.
Any ideas what could be happening?
Rather than having to call isCulled each frame and use up CPU time, a callback would be useful to reactivate the drawing when the object returns to the view, that is if there is a place in the driver code that gets hit when it is on-view; it has to be in there somewhere...
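One way to approximate the callback wished for here is a small wrapper that is fed the culling result each frame and only fires when the state changes. This is a sketch under assumptions: VisibilityWatcher is a hypothetical helper, not part of Irrlicht, and the per-frame culling check (for example the isCulled call discussed above) still has to be made by the caller. What it saves is the work done in reaction, which runs only on transitions.

```cpp
#include <cassert>
#include <functional>
#include <utility>

class VisibilityWatcher {
public:
    using Callback = std::function<void(bool visible)>;

    explicit VisibilityWatcher(Callback onChange)
        : onChange_(std::move(onChange)) {
        assert(static_cast<bool>(onChange_));  // a callback is required
    }

    // Call once per frame with the current culling result.
    // The callback fires only when visibility differs from the previous frame.
    void update(bool culledNow) {
        const bool visibleNow = !culledNow;
        if (visibleNow != visible_) {
            visible_ = visibleNow;
            onChange_(visible_);
        }
    }

    bool visible() const { return visible_; }

private:
    Callback onChange_;
    bool visible_ = true;  // assumed visible until the first culled frame
};
```

The callback would be the place to pause or resume texture and animation updates for that node.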
Re: Reducing CPU loading when using 100s scenenodes
Okay, the culling is now working! Seems that I needed setAutomaticCulling( EAC_FRUSTUM_BOX ) rather than EAC_BOX.
I couldn't find header notes explaining exactly what the difference is, but I guess only the EAC_FRUSTUM_... versions do view checking, and the other does something different, not sure what though...?
Re: Reducing CPU loading when using 100s scenenodes
I've traced through and see that the CMeshCache class clears cached meshes on deleting the device. I guess that's why loading the mesh just once didn't make much difference, as it was already cached, and also why one shouldn't drop meshes.
Looking at the camera operation, its behaviour shows that the direction of view (target) changes are being smoothed out to prevent sudden view changes, which works really nicely!
It's sort of all very impressive...
Last edited by robmar on Wed Aug 31, 2011 5:30 pm, edited 1 time in total.
-
Lonesome Ducky
- Competition winner
- Posts: 1123
- Joined: Sun Jun 10, 2007 11:14 pm
Re: Reducing CPU loading when using 100s scenenodes
I'm betting that EAC_BOX compares the box to the bounding box of the camera's frustum, which is a quicker comparison that never falsely reports an object as offscreen, but it also doesn't always detect that an object is offscreen, because the frustum's bounding box can be quite a different shape from the frustum itself.
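The difference suggested above can be shown with a small self-contained sketch. It illustrates the guess, not Irrlicht's actual culling code: all names are hypothetical, and the frustum is simplified to a symmetric 90-degree one at the origin looking along +z, so that its planes are easy to write down.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };
struct Box { Vec3 min, max; };
struct Plane { Vec3 n; float d; };  // a point p is inside when dot(n, p) + d >= 0

// Simplified frustum: 90-degree field of view, at the origin, looking along +z.
struct Frustum {
    float nearZ, farZ;

    std::vector<Plane> planes() const {
        assert(nearZ > 0.0f && nearZ < farZ);
        return std::vector<Plane>{
            {{ 0,  0,  1}, -nearZ},  // z >= nearZ
            {{ 0,  0, -1},  farZ},   // z <= farZ
            {{-1,  0,  1},  0},      // x <= z
            {{ 1,  0,  1},  0},      // -x <= z
            {{ 0, -1,  1},  0},      // y <= z
            {{ 0,  1,  1},  0},      // -y <= z
        };
    }

    // Axis-aligned box enclosing the whole frustum.
    Box boundingBox() const {
        return {{-farZ, -farZ, nearZ}, {farZ, farZ, farZ}};
    }
};

// Cheap test: is the object's box outside the frustum's bounding box?
bool culledByBox(const Box& b, const Frustum& f) {
    const Box fb = f.boundingBox();
    return b.max.x < fb.min.x || b.min.x > fb.max.x ||
           b.max.y < fb.min.y || b.min.y > fb.max.y ||
           b.max.z < fb.min.z || b.min.z > fb.max.z;
}

// Tighter test: culled if the box lies entirely behind any frustum plane.
bool culledByFrustum(const Box& b, const Frustum& f) {
    for (const Plane& p : f.planes()) {
        // Pick the box corner that lies farthest along the plane normal.
        const Vec3 c{p.n.x >= 0 ? b.max.x : b.min.x,
                     p.n.y >= 0 ? b.max.y : b.min.y,
                     p.n.z >= 0 ? b.max.z : b.min.z};
        if (p.n.x * c.x + p.n.y * c.y + p.n.z * c.z + p.d < 0)
            return true;  // even the best corner is behind this plane
    }
    return false;
}
```

A box at x = 8..9, z = 1..2 sits close to the camera but far off to the side: it is inside the frustum's bounding box, so the cheap test keeps it, while the plane test culls it. That matches the behaviour reported earlier in the thread, where an off-view cube was still reported as not culled.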

