Stress Test

patricklucas
Posts: 34
Joined: Sun Jul 06, 2008 5:05 am
Location: NC, USA

Post by patricklucas »

I created an application to check Irrlicht's ability to handle a large number of nodes. An OSS project I'm interested in working with wants a node visualizer that can handle "millions of nodes" so I've been experimenting to see if Irrlicht could handle it, after optimizations of course.

Not much is needed beyond displaying millions of cubes, providing a camera, and allowing each cube to be addressed and its properties modified programmatically.

I created a cube of 1,000,000 cubenodes of size 10, separated by 10. I get ~0.60601 FPS. Better than I expected, actually.

A few things I notice:
  • The application uses 1,379,568K of memory. This implies that each cube node is over 1K. Can this be brought down by me, the programmer? Perhaps by extending a class with fewer features?
  • It obviously uses only one core of my CPU. Is there something I can do to thread off some of the rendering tasks that take up so much CPU time?
Screenshot (the title bar shows the FPS; it is lower in this capture, the average is ~0.6):
Image
psychophoniac
Posts: 101
Joined: Wed Dec 03, 2008 5:33 pm
Location: ger

Post by psychophoniac »

I think Irrlicht is completely single-threaded (for good reasons :P ), so there is no optimising to be done there.
i love skateboarding!
BlindSide
Admin
Posts: 2821
Joined: Thu Dec 08, 2005 9:09 am
Location: NZ!

Post by BlindSide »

Use hardware instancing.

Give me some time and I'll produce something that performs better. Can you paste the code you used for creating all the cubes, etc., for comparison purposes? I wanna know the camera's exact position relative to the center of the cube cluster.
ShadowMapping for Irrlicht!: Get it here
Need help? Come on the IRC!: #irrlicht on irc://irc.freenode.net
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Post by hybrid »

Did you make sure to use EHM_STATIC? You also have to lower the minimum vertex count for the vertex buffers, otherwise this won't make a difference because VBOs would not be used (it currently defaults to 500 vertices, while the cube probably has far fewer).
patricklucas
Posts: 34
Joined: Sun Jul 06, 2008 5:05 am
Location: NC, USA

Post by patricklucas »

BlindSide wrote:Use hardware instancing.

Give me some time and I'll produce something that performs better. Can you paste the code you used for creating all the cubes, etc., for comparison purposes? I wanna know the camera's exact position relative to the center of the cube cluster.
Unfortunately I can't get you the exact camera position there because I moved it (at a blazing 0.6 FPS) to get outside of the cube. I mainly went backwards from the starting position of:

Code: Select all

cam->setPosition(vector3df(0,30,-40));
cam->setTarget(vector3df(0,5,0));
Here's the code I used to generate the cube o' cubes:

Code: Select all

void makeCubeArray(int arrSize, f32 cubeSize, f32 spacing, scene::ISceneManager* smgr)
{
	// Half the side length of the whole array; f32 avoids truncation for non-integer sizes.
	f32 off = (arrSize * (cubeSize + spacing)) / 2;

	for (int i = 0; i < arrSize; i++)
	{
		for (int j = 0; j < arrSize; j++)
		{
			for (int k = 0; k < arrSize; k++)
			{
				scene::IMeshSceneNode* node = smgr->addCubeSceneNode(cubeSize);
				if (!node)
					continue; // creation failed: skip before touching the node

				node->setPosition(vector3df(
					(i * (cubeSize + spacing)) - off,
					(j * (cubeSize + spacing)) - off,
					(k * (cubeSize + spacing)) - off));
				node->setMaterialFlag(video::EMF_LIGHTING, false);
			}
		}
	}
}
Also, I calculated the FPS myself from a deltaT rather than using the built-in FPS counter, for obvious reasons.
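A minimal sketch of measuring FPS from frame deltas, for anyone reproducing the numbers. `FrameTimer` is a hypothetical helper, not part of Irrlicht. The "obvious reasons" are presumably that the built-in counter reports whole frames per second, so anything below 1 FPS is not readable from it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: accumulates frame durations and reports the average FPS.
class FrameTimer {
public:
    // Record one frame that took `seconds` to render.
    void addFrame(double seconds) {
        assert(seconds >= 0.0);
        total_ += seconds;
        ++frames_;
    }

    // Average frames per second over everything recorded so far;
    // returns 0 when no time has elapsed yet.
    double averageFps() const {
        return total_ > 0.0 ? static_cast<double>(frames_) / total_ : 0.0;
    }

private:
    double total_ = 0.0;
    std::size_t frames_ = 0;
};
```

In the render loop you would take a timestamp before and after each frame (for example with `std::chrono::steady_clock`) and pass the difference in seconds to `addFrame`.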
patricklucas
Posts: 34
Joined: Sun Jul 06, 2008 5:05 am
Location: NC, USA

Post by patricklucas »

Also, I called that method with:

Code: Select all

makeCubeArray(100, 10, 10, smgr);
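As a rough check of where the camera starts relative to the cluster, the sketch below computes the range of node centers along one axis with the same formula as makeCubeArray. `Extent` and `gridExtent` are names made up for this illustration.

```cpp
#include <cassert>

struct Extent { float min; float max; };

// Range of node centers along one axis, using the same offset formula
// as makeCubeArray: position = index * (cubeSize + spacing) - off.
Extent gridExtent(int arrSize, float cubeSize, float spacing) {
    assert(arrSize > 0);
    const float step = cubeSize + spacing;
    const float off = (arrSize * step) / 2.0f;
    return { -off, (arrSize - 1) * step - off };
}
```

For makeCubeArray(100, 10, 10, smgr) this gives centers from -1000 to 980 on every axis, so the starting camera position (0,30,-40) sits inside the cluster, which matches having to move backwards to get outside of it.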
patricklucas
Posts: 34
Joined: Sun Jul 06, 2008 5:05 am
Location: NC, USA

Post by patricklucas »

hybrid wrote:Did you make sure to use EHM_STATIC? You also have to lower the minimum vertex count for the vertex buffers, otherwise this won't make a difference because VBOs would not be used (it currently defaults to 500 vertices, while the cube probably has far fewer).
How do I do this? I found EHM_STATIC in the docs but how do I use it? It's under scene::, does that mean it's a scenemanager property or a node property?
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Post by hybrid »

It's a mesh property, you have to call setHardwareMappingHint(EHM_STATIC). The minimum number of vertices for the VBOs is configured via the driver.
patricklucas
Posts: 34
Joined: Sun Jul 06, 2008 5:05 am
Location: NC, USA

Post by patricklucas »

I added the following code, but it had no effect. The program used about the same amount of memory and had the same average framerate.

Code: Select all

node->getMesh()->setHardwareMappingHint(scene::EHM_STATIC);
I don't know how to get a count of the number of vertices.
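One way to count vertices is to sum over the mesh's buffers. The sketch below is written as a template so it compiles without the engine headers; it assumes the mesh type exposes getMeshBufferCount() and getMeshBuffer(i)->getVertexCount(), which is how I remember IMesh and IMeshBuffer, so check your version's headers. `FakeMesh` and `FakeBuffer` are stand-ins for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the vertex counts of all mesh buffers of a mesh-like object.
template <typename MeshT>
std::size_t countVertices(const MeshT& mesh) {
    std::size_t total = 0;
    for (unsigned i = 0; i < mesh.getMeshBufferCount(); ++i) {
        assert(mesh.getMeshBuffer(i) != nullptr);
        total += mesh.getMeshBuffer(i)->getVertexCount();
    }
    return total;
}

// Hypothetical stand-ins so the sketch is self-contained.
struct FakeBuffer {
    unsigned vertices;
    unsigned getVertexCount() const { return vertices; }
};

struct FakeMesh {
    std::vector<FakeBuffer> buffers;
    unsigned getMeshBufferCount() const {
        return static_cast<unsigned>(buffers.size());
    }
    const FakeBuffer* getMeshBuffer(unsigned i) const { return &buffers[i]; }
};
```

With a real node this would be called as countVertices(*node->getMesh()), assuming those member functions are const in your version. If the total is below the driver's minimum for hardware buffers (500 by default, according to hybrid), the EHM_STATIC hint alone would not be expected to change anything.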
BlindSide
Admin
Posts: 2821
Joined: Thu Dec 08, 2005 9:09 am
Location: NZ!

Post by BlindSide »

I highly doubt static VBOs will help unless you use some kind of batching scheme.
patricklucas
Posts: 34
Joined: Sun Jul 06, 2008 5:05 am
Location: NC, USA

Post by patricklucas »

BlindSide wrote:Use hardware instancing.

Give me some time and I'll produce something that performs better. Can you paste the code you used for creating all the cubes, etc., for comparison purposes? I wanna know the camera's exact position relative to the center of the cube cluster.
If you want me to, I can recreate it and get you the camera coords and target.
Steel Style
Posts: 168
Joined: Sun Feb 04, 2007 3:30 pm
Location: France

Post by Steel Style »

For those who want to test it but are too lazy to write the code:

Code: Select all

#include <irrlicht.h>

using namespace irr;
using namespace video;
using namespace scene;

#pragma comment (lib, "Irrlicht.lib")

void makeCubeArray(int arrSize, f32 cubeSize, f32 spacing, scene::ISceneManager* smgr)
{
   float off = (arrSize * (cubeSize + spacing)) / 2;

   for (int i = 0; i < arrSize; i++)
   {
      for (int j = 0; j < arrSize; j++)
      {
         for (int k = 0; k < arrSize; k++)
         {
            scene::IMeshSceneNode* node = smgr->addCubeSceneNode(cubeSize);
            if (!node)
               continue; // creation failed: skip before touching the node

            node->setPosition(core::vector3df(
               (i * (cubeSize + spacing)) - off,
               (j * (cubeSize + spacing)) - off,
               (k * (cubeSize + spacing)) - off));
            node->setMaterialFlag(video::EMF_LIGHTING, false);
         }
      }
   }
}

int main()
{
	IrrlichtDevice* device = createDevice(EDT_DIRECT3D9);
	if (!device)
		return 1; // device creation failed (e.g. driver not available)
	IVideoDriver* driver = device->getVideoDriver();
	ISceneManager* smgr = device->getSceneManager();

	ICameraSceneNode* cam = smgr->addCameraSceneNodeFPS();
	cam->setPosition(core::vector3df(0,0,0));
	cam->setTarget(core::vector3df(0,0,0));

	makeCubeArray(100, 10, 10, smgr);
	core::stringw title;

	while (device->run()) if (device->isWindowActive())
	{
		driver->beginScene(true,true,SColor(255,200,200,200));
		smgr->drawAll();

		title = L"FPS :";
		title += driver->getFPS();
		device->setWindowCaption(title.c_str());
		driver->endScene();
	}
	smgr->clear();

	device->drop();
}

In my case I tried with 27,000 cube scene nodes. Before makeCubeArray the process used exactly 13,888K; after the cube scene nodes were created it used 54,620K, so approximately 1.5K per node.

With the following line (I summed the sizes of the members involved in the creation):

Code: Select all

int memory = sizeof(SMaterial) + sizeof(IMeshSceneNode) + sizeof(f32) + sizeof(SMesh) + sizeof(SMeshBuffer)+ sizeof(S3DVertex)*12 + sizeof(u16)*36;
I got 1108 bytes, and I have probably missed some other members, so there is no more mystery behind it, I think. And BlindSide, this result is measured at load time, so it holds no matter what the camera angle is. But maybe the FPS performance can be improved.
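The per-node figures in this thread can be checked with a line of arithmetic. This sketch assumes the "K" values reported by the OS are units of 1024 bytes; `bytesPerNode` is a name made up here.

```cpp
#include <cassert>

// Per-node memory in bytes, from the process size (in K) before and
// after creating `nodes` scene nodes. Assumes 1K = 1024 bytes.
double bytesPerNode(double beforeK, double afterK, long nodes) {
    assert(nodes > 0);
    return (afterK - beforeK) * 1024.0 / nodes;
}
```

bytesPerNode(13888, 54620, 27000) comes to roughly 1545 bytes, and patricklucas's 1,379,568K for a million nodes (without subtracting the baseline) to roughly 1413 bytes. Both are above the 1108 bytes from the sizeof sum; the difference is plausibly heap allocation overhead plus members not counted in that sum, but that is a guess.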
JP
Posts: 4526
Joined: Tue Sep 13, 2005 2:56 pm
Location: UK

Post by JP »

Can't believe no one's mentioned this already.. although maybe blindside's comment was based on this...

If you're rendering a million scene nodes then you're doing a million render calls. Much better to put all those cubes into one scene node (or at least fewer) and then you get a fraction of the number of render calls which is incredibly hugely mammothly faster.

Might be difficult if you need to move/scale/rotate those nodes independently but it would still be possible.
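A library-independent sketch of that batching idea: instead of one node per cube, append every cube's geometry to one shared vertex/index list so it can go out in a single draw call. `Vec3`, `Batch` and `appendCube` are hypothetical names, the cube uses 8 shared corners rather than Irrlicht's own cube layout, and the triangle winding is not tuned for back-face culling.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// One shared buffer for many cubes.
struct Batch {
    std::vector<Vec3> vertices;
    std::vector<std::uint32_t> indices;
};

// Appends one axis-aligned cube (8 corners, 12 triangles) to the batch.
void appendCube(Batch& batch, Vec3 center, float size) {
    assert(size > 0.0f);
    // Corner i has x from bit 0, y from bit 1, z from bit 2.
    static const std::array<std::uint32_t, 36> kCubeIndices = {{
        0, 2, 1,  1, 2, 3,   // -z face
        4, 5, 6,  5, 7, 6,   // +z face
        0, 1, 4,  1, 5, 4,   // -y face
        2, 6, 3,  3, 6, 7,   // +y face
        0, 4, 2,  2, 4, 6,   // -x face
        1, 3, 5,  3, 7, 5    // +x face
    }};
    const float h = size / 2.0f;
    const std::uint32_t base =
        static_cast<std::uint32_t>(batch.vertices.size());
    for (int i = 0; i < 8; ++i) {
        batch.vertices.push_back({
            center.x + ((i & 1) ? h : -h),
            center.y + ((i & 2) ? h : -h),
            center.z + ((i & 4) ? h : -h)});
    }
    // Offset the indices so they point at this cube's own vertices.
    for (std::uint32_t idx : kCubeIndices)
        batch.indices.push_back(base + idx);
}
```

Moving a single cube then means rewriting its 8 vertices in the batch, which is the "difficult but possible" part. Note that with 16-bit indices one buffer cannot address more than 65,536 vertices, so a million cubes would need many buffers or 32-bit indices.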
Steel Style
Posts: 168
Joined: Sun Feb 04, 2007 3:30 pm
Location: France

Post by Steel Style »

I created an application to check Irrlicht's ability to handle a large number of nodes.
Yes JP, but since he simply wanted to know whether Irrlicht was able to handle a large number of nodes, it's the correct test. That was also why I was talking more about memory than FPS. This way we know that each node takes at least 1.5K of memory.

But for improving the FPS there are several techniques.
bitplane
Admin
Posts: 3204
Joined: Mon Mar 28, 2005 3:45 am
Location: England

Post by bitplane »

Steel Style wrote:

Code: Select all

int memory = sizeof(SMaterial) + sizeof(IMeshSceneNode) + sizeof(f32) + sizeof(SMesh) + sizeof(SMeshBuffer)+ sizeof(S3DVertex)*12 + sizeof(u16)*36;
I didn't try it yet, but it looks like the test is very specific to cube nodes with no texture assigned. Other mesh nodes share a cached mesh, not that this will help though.

You aren't ever going to draw a million nodes on screen at once. It makes no sense to traverse a list of a million nodes (CPU hungry) each loop, or to do a million draw calls each loop (as BlindSide and JP said). The focus for performance improvements should be on batching and culling.

You can't make any decent RAM saving if you batch stuff, as you sacrifice speed for RAM in this case. Unless you implement instancing, and then you're making the assumption that all meshes are the same one in different positions.

You can't really judge all this in the general case; the test needs to be more specific. Any optimizations you can make will rely on understanding what your test is modeling.
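To illustrate the culling half of that, here is a minimal bounding-sphere test against a set of planes, independent of any engine. `Plane`, `sphereVisible` and `countVisible` are names made up for this sketch, and the plane convention (inside means dot(normal, p) + d >= 0) is an assumption of the example, not something taken from Irrlicht.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Points inside the volume satisfy dot(normal, p) + d >= 0.
struct Plane { Vec3 normal; float d; };

// True unless the sphere lies entirely behind at least one plane.
bool sphereVisible(const std::vector<Plane>& planes, Vec3 c, float radius) {
    assert(radius >= 0.0f);
    for (const Plane& p : planes) {
        const float dist =
            p.normal.x * c.x + p.normal.y * c.y + p.normal.z * c.z + p.d;
        if (dist < -radius)
            return false; // completely outside this plane
    }
    return true;
}

// Counts how many spheres of equal radius survive the test.
std::size_t countVisible(const std::vector<Plane>& planes,
                         const std::vector<Vec3>& centers, float radius) {
    std::size_t visible = 0;
    for (const Vec3& c : centers)
        if (sphereVisible(planes, c, radius))
            ++visible;
    return visible;
}
```

Testing a million cubes one by one would still walk the whole list every frame, so in practice the test is more useful on whole batches or grid cells of cubes than on individual ones.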
Submit bugs/patches to the tracker!
Need help right now? Visit the chat room