Custom scene node results in huge performance hit

ThePurpleAlien
Posts: 11
Joined: Fri May 15, 2009 4:20 am
Location: Canada

Post by ThePurpleAlien »

I have created a very simple custom scene node based on tutorial 3. It's basically a billboard scene node but it doesn't automatically rotate to face the camera. My code is identical to tutorial 3 except that the geometry is 4 vertices and two triangles to make the billboard, and it has a texture. If the texture uses an alpha channel, then I set the material type to EMT_TRANSPARENT_ALPHA_CHANNEL.

I'm using this scene node to draw 2D sprites in 3D space for a side-scrolling game. The level data is a 2D tile-based map, so I draw each tile using one of these scene nodes. The level I'm working with now is 200 x 200 tiles. That's 40,000 tiles total, but there's a lot of empty space in the level, so there are on the order of 10,000 solid tiles, each with its own scene node. At any given time only a small part of the level is visible, maybe a few hundred tiles; most of the level is off-screen.

The problem is that even though there's not much on screen, my game's performance is hugely affected by the size of the level. Just rendering the 200 x 200 map, without doing anything else, I'm down to around 60 fps. This suggests that the large number of scene nodes is chewing up all my CPU time, even though 95% of them are off-screen.

My question is: is this to be expected, or does it sound like something is wrong? I set up the bounding box like in tutorial 3. Shouldn't the automatic culling keep the off-screen scene nodes from using up too much CPU time? Or is the overhead of 10,000 scene nodes simply too high?

What would be a good approach for this type of game? Should I make each scene node handle something like a 10x10 chunk of the level so there'd only be 400 scene nodes? Should I make the whole level one giant scene node? What's a good trade-off to get reasonable performance?

Thanks a lot for any help!
slavik262
Posts: 753
Joined: Sun Nov 22, 2009 9:25 pm
Location: Wisconsin, USA

Post by slavik262 »

The problem isn't Irrlicht-specific at all; it's the number of scene nodes involved. State changes (setting the active material, setting shaders, etc.) and draw calls each carry a certain fixed overhead in the driver and on the GPU. Now imagine what you're telling your graphics card to do. Even if there are only, say, 100 tiles on screen, you're switching materials 100 times (it doesn't matter if you're switching to another copy of the same material - the scene node doesn't know that unless you code it to know that) and telling your GPU to draw 100 times. This is a huge overhead compared to combining the tiles into a couple of draw calls. Even with Irrlicht's stock billboard scene nodes, a 200 x 200 grid of them would be slow.

The solution? Batching (combine things). Make a scene node that draws many of these tiles at once by combining them into a few large meshes instead of hundreds of small ones. If all the tiles are being drawn in one scene node, switch materials as little as possible. Use a texture atlas to combine textures, eliminating the need to switch materials.
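
To make the batching idea concrete, here is a minimal sketch of merging a chunk of tiles into one vertex/index list that can be drawn with a single material. It is engine-independent on purpose: `Vertex`, `UVRect`, `TileBatch` and `buildTileBatch` are hypothetical names made up for this illustration, not Irrlicht types. In Irrlicht the same data would typically end up in the `video::S3DVertex` array and 16-bit index array of a mesh buffer. The tile map layout (row-major, -1 for an empty cell) is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for an engine vertex: position plus one texture coordinate.
struct Vertex { float x, y, z, u, v; };

// Texture-coordinate rectangle of one tile type inside the shared atlas.
struct UVRect { float u0, v0, u1, v1; };

struct TileBatch {
    std::vector<Vertex> vertices;
    std::vector<std::uint16_t> indices; // 16-bit indices: at most 65536 vertices
};

// Merges every solid tile of a chunk into one mesh.
// `tiles` is row-major; -1 marks an empty cell, any other value indexes atlasRects.
TileBatch buildTileBatch(const std::vector<int>& tiles, int width, int height,
                         float tileSize, const std::vector<UVRect>& atlasRects)
{
    assert(tiles.size() == static_cast<std::size_t>(width) * height);
    static const std::uint16_t quadOffsets[6] = {0, 1, 2, 0, 2, 3};

    TileBatch batch;
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const int type = tiles[row * width + col];
            if (type < 0) continue; // empty cell: emit no geometry
            assert(static_cast<std::size_t>(type) < atlasRects.size());
            assert(batch.vertices.size() + 4 <= 65536);

            const UVRect& uv = atlasRects[type];
            const float x0 = col * tileSize;
            const float x1 = x0 + tileSize;
            const float y1 = -row * tileSize; // row 0 is the top row
            const float y0 = y1 - tileSize;

            const std::uint16_t base =
                static_cast<std::uint16_t>(batch.vertices.size());
            batch.vertices.push_back({x0, y0, 0.0f, uv.u0, uv.v1}); // bottom-left
            batch.vertices.push_back({x0, y1, 0.0f, uv.u0, uv.v0}); // top-left
            batch.vertices.push_back({x1, y1, 0.0f, uv.u1, uv.v0}); // top-right
            batch.vertices.push_back({x1, y0, 0.0f, uv.u1, uv.v1}); // bottom-right

            // Two triangles per tile, all sharing the same vertex list.
            for (std::uint16_t offset : quadOffsets)
                batch.indices.push_back(static_cast<std::uint16_t>(base + offset));
        }
    }
    return batch;
}
```

A 20 x 20 chunk produces at most 1600 vertices, well inside the 16-bit index limit. The triangle winding may need flipping depending on the renderer's back-face culling convention.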

PM me if you have questions about batching and I'll be happy to help.
Nalin
Posts: 194
Joined: Thu Mar 30, 2006 12:34 am
Location: Lacey, WA, USA
Contact:

Post by Nalin »

I did something different for one of my projects. Since the tiles did not really need to change, I created a couple of images during the level load. I copied my individual tiles into those images to construct the level, then converted the images into textures. That way, I could potentially have my entire level be one single texture.

I would first calculate the maximum texture size for my hardware:

Code:

core::dimension2du max_texture_size = video_driver->getMaxTextureSize();
Then I would calculate how many textures I would need for my level:

Code:

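// Ceiling division: number of textures needed to cover the level along each axis.
core::dimension2du texture_count((level_width_in_tiles * tile_size_in_pixels + max_texture_size.Width - 1) / max_texture_size.Width, (level_height_in_tiles * tile_size_in_pixels + max_texture_size.Height - 1) / max_texture_size.Height);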
Then I could create my textures and copy my tiles into them:

Code:

video::IImage* level_image = video_driver->createImage(video::ECF_A8R8G8B8, core::dimension2du(level_segment_width, level_segment_height));
...
tile_set_image->copyTo(level_image, position_of_tile, tile_rect_on_tile_sheet);
ThePurpleAlien
Posts: 11
Joined: Fri May 15, 2009 4:20 am
Location: Canada

Post by ThePurpleAlien »

Thanks for the help. The level tiles are static, other than hiding them if they get destroyed in the game, so no material switching is happening. I'll try making a scene node that handles a roughly screen-sized mesh of tiles. Thanks for the tip on texture atlases. I'll also try the method of building large textures out of the tile images. Cheers.
slavik262
Posts: 753
Joined: Sun Nov 22, 2009 9:25 pm
Location: Wisconsin, USA

Post by slavik262 »

The idea behind texture atlases is that you don't need to build the tiles into large textures: identical tiles simply reuse the same UV coordinates. All you need is a texture that holds each type of tile once.
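
As a rough illustration of "same UV coordinates for identical tiles", the sketch below maps a tile type to its rectangle in an atlas. It assumes the atlas is a regular grid of equally sized tiles numbered left to right, top to bottom; `UVRect` and `atlasRect` are hypothetical names for this example only.

```cpp
#include <cassert>

// Texture-coordinate rectangle, in the usual 0..1 range.
struct UVRect { float u0, v0, u1, v1; };

// Returns the UV rectangle of `tileType` in an atlas laid out as a
// columns x rows grid. Every tile of the same type gets the same rectangle,
// so the atlas only has to contain each tile image once.
UVRect atlasRect(int tileType, int columns, int rows)
{
    assert(columns > 0 && rows > 0);
    assert(tileType >= 0 && tileType < columns * rows);

    const int col = tileType % columns;
    const int row = tileType / columns;
    const float du = 1.0f / columns; // width of one tile in UV space
    const float dv = 1.0f / rows;    // height of one tile in UV space

    return { col * du, row * dv, (col + 1) * du, (row + 1) * dv };
}
```

With texture filtering enabled, neighbouring atlas entries can bleed into each other at tile edges; insetting the rectangle by half a texel or padding the atlas are common ways around that.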

Another thing - once you combine lots of tiles into a few large meshes, you'll probably get even more fps if you set the mesh buffer's hardware mapping hint to EHM_STATIC (via setHardwareMappingHint). Basically it works by storing the vertices in the graphics card's memory so that the CPU doesn't have to send them every time they get drawn. It works best for meshes with a fairly large number of vertices though, so make sure you combine your tiles into a few big meshes first.
ThePurpleAlien
Posts: 11
Joined: Fri May 15, 2009 4:20 am
Location: Canada

Post by ThePurpleAlien »

Thanks to slavik262 and Nalin for your suggestions. I tried both.

I modified my scene node to handle a patch of 20 x 20 tiles rather than using one scene node per tile. This roughly doubled the frame rate, which was less of an improvement than I had hoped for. I discovered that Irrlicht was still calling the render function on all my scene nodes, so culling of off-screen scene nodes wasn't working; the gain came only from having fewer scene nodes.

The key was setting the automatic culling method to EAC_FRUSTUM_BOX, which culls against the camera's real view frustum, rather than the default EAC_BOX, which only tests against the box enclosing that frustum. As this is a side-scrolling game, all my off-screen tiles were still inside that box, so the default culling method didn't cull anything. With frustum culling, the frame rate is now up to 750 fps, which is more like what I'd expect for a very simple scene.
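
For anyone wondering why the box test lets the whole level through, here is a simplified model of the two tests. This is not Irrlicht's actual culling code. It assumes a camera looking straight down +z at tiles lying in a plane, treats each tile as a point, and uses hypothetical names (`Camera`, `passesBoxTest`, `passesFrustumTest`). The point is that the box enclosing the frustum is as wide as the far plane, which is far wider than the visible slice at the depth of the tiles.

```cpp
#include <cmath>

struct Camera {
    float x, y, z;     // position; the camera looks along +z
    float fovY;        // vertical field of view in radians
    float aspect;      // viewport width / height
    float nearZ, farZ; // clip distances along the view direction
};

// Half extents of the visible region at distance d in front of the camera.
float halfHeightAt(const Camera& c, float d) { return d * std::tan(c.fovY * 0.5f); }
float halfWidthAt(const Camera& c, float d)  { return halfHeightAt(c, d) * c.aspect; }

// Box-style test: is the point inside the axis-aligned box around the frustum?
// That box is as wide and tall as the far plane at every depth.
bool passesBoxTest(const Camera& c, float tx, float ty, float tz)
{
    return std::fabs(tx - c.x) <= halfWidthAt(c, c.farZ) &&
           std::fabs(ty - c.y) <= halfHeightAt(c, c.farZ) &&
           tz >= c.z + c.nearZ && tz <= c.z + c.farZ;
}

// Frustum-style test: compare against the visible extent at the point's own depth.
bool passesFrustumTest(const Camera& c, float tx, float ty, float tz)
{
    const float d = tz - c.z; // distance in front of the camera
    if (d < c.nearZ || d > c.farZ) return false;
    return std::fabs(tx - c.x) <= halfWidthAt(c, d) &&
           std::fabs(ty - c.y) <= halfHeightAt(c, d);
}
```

With the camera 100 units from the tile plane, a 90 degree field of view and a far plane at 3000, a tile 500 units to the side passes the box test but fails the frustum test, since only about 100 units to either side are actually visible at that depth. In Irrlicht the switch described above is made per node, with `node->setAutomaticCulling(scene::EAC_FRUSTUM_BOX)`.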