Accelerate Rendering

SimonHawe · Post by **SimonHawe** » Wed Aug 27, 2008 2:09 pm

Hi to all,
I am currently writing an Insect Eye Simulation using Irrlicht. Such an eye can be roughly described as a big number of cameras attached to a half sphere.
So what I did was add only one camera with a very small field of view to the scene, change its position on the sphere, and render the scene with drawAll() each time. The result is really cool, but as you might expect it takes quite a lot of time to grab one complete image (about 1 second for 1000 different viewing directions).
Now my question is whether you can think of a way to reduce this time, given that both the resolution and the field of view of a single rendering step can be very low.
Cheers and thanks
Simon

JP · Post by JP » Wed Aug 27, 2008 2:25 pm

how are you combining the renders? are you using shaders at all in the process? shaders could probably do something like this in some way i would have thought...

i'm guessing the bottleneck isn't really in the complexity of your scene but the amount of renders you're having to do. you're doing 1000 renders? maybe you could try doing fewer, if possible? obviously if you wanted to apply this to a complex scene then you'd need to render the scene pretty damned quickly each time so you'd want to take advantage of things like occlusion culling.

do you have a screenshot of the final image you could show us so we understand better what you're doing?

sounds like a cool little project anyway!

SimonHawe · Post by **SimonHawe** » Wed Aug 27, 2008 2:53 pm

The problem with the number of rendering steps is that I really need a high number of different viewpoints and viewing directions. I already thought of something different, like rendering only 6 orthogonal viewing directions (the 6 sides of a cube) and projecting the result onto a sphere, but this is not what I want because I would miss some information.
But anyway, at what point do you suggest I could use shaders? I already increased the overall processing speed by using threads, but at the point where I call drawAll() threads do not really help anymore, because rendering is performed on the GPU and only the rendering is crucial for my task.
So what I roughly do, leaving the threading part aside, is:
for all wanted camera positions and viewing directions {
    set viewport
    set camera position
    set camera orientation, target vector, up vector
    scenemanager->drawAll()
}
Simon
P.S.: I will upload a screen shot soon
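
The loop above needs a list of viewing directions on the half sphere. A minimal sketch of one way to generate them follows. The golden-angle spiral, the `Vec3` struct and the name `hemisphereDirections` are assumptions made for illustration; the post does not say how the directions are actually laid out.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Plain 3D vector; with Irrlicht you would use core::vector3df instead.
struct Vec3 { double x, y, z; };

// Hypothetical helper: spread `count` unit view directions over the upper
// half sphere (y > 0) using a golden-angle spiral. The spiral is an
// assumption of this sketch, not the layout used in the original post.
std::vector<Vec3> hemisphereDirections(int count)
{
    assert(count > 0);
    const double pi = 3.14159265358979323846;
    const double goldenAngle = pi * (3.0 - std::sqrt(5.0));

    std::vector<Vec3> dirs;
    for (int i = 0; i < count; ++i)
    {
        // Height runs from just below 1 (near the pole) down to just
        // above 0 (near the equator), in equal steps.
        const double y = 1.0 - (i + 0.5) / count;
        const double r = std::sqrt(1.0 - y * y); // ring radius at that height
        const double phi = goldenAngle * i;      // angle around the y axis
        dirs.push_back({ r * std::cos(phi), y, r * std::sin(phi) });
    }
    return dirs;
}
```

Each direction scaled by the eye radius would give a camera position, and position plus direction a target, which is what the "set camera position" and "set camera orientation" steps consume.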

JP · Post by JP » Wed Aug 27, 2008 3:03 pm

where are you using threads? you know irrlicht isn't threadsafe right? So anything to do with resources such as textures will feck up your program big time if you're doing them both in separate threads without very careful handling.

to be fair if you're rendering a scene 1000 times you're never gonna get any faster than 1fps at the best of times i wouldn't think, even with an empty scene.

SimonHawe · Post by **SimonHawe** » Wed Aug 27, 2008 3:12 pm

Yeah, I know that it isn't, and I really am using threads carefully, no worries. Yeah I already thought that I can not accelerate it anymore, except perhaps by using 10 graphics cards.

but thank you anyway for your help
Simon

rogerborg · Post by **rogerborg** » Wed Aug 27, 2008 3:54 pm

Can we rewind to the requirements? Does it have to be an accurate simulation, or just a visual approximation of it?

Do you need to update every facet every frame? Presumably the (human) viewer of the results will tend to concentrate on the centre of the (flat) screen, particularly if the viewpoint is moving forwards.

If so, then you could update the outer facets more irregularly, without losing too much visual impact.
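
A rough sketch of such an update schedule, assuming the facets sit on a square grid with an odd side length. The function name and the "growing ring" policy are hypothetical, chosen only to show the idea: the centre facet is refreshed every frame, the outermost ring only once per cycle.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: decide whether the facet at grid cell (row, col) of a
// facets x facets grid should be re-rendered on the given frame.
bool shouldUpdateFacet(int row, int col, int facets, unsigned frame)
{
    assert(facets > 0 && facets % 2 == 1);
    const int middle = (facets - 1) / 2;

    // Ring distance from the centre facet (Chebyshev distance).
    const int dRow = std::abs(row - middle);
    const int dCol = std::abs(col - middle);
    const int ring = dRow > dCol ? dRow : dCol;

    // The outermost ring that is refreshed grows by one each frame and
    // wraps back to the centre after reaching the edge of the grid.
    const int allowedRing =
        static_cast<int>(frame % static_cast<unsigned>(middle + 1));
    return ring <= allowedRing;
}
```

With a 7 x 7 grid the cycle is four frames long, so the corner facets are rendered on every fourth frame while the centre facet is rendered on all of them.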

The other cheat that occurs to me is to render to a texture, using a FOV that's large enough to cover the total FOV for (e.g.) 3 facets.

You could then draw appropriately offset portions of that texture to the facets surrounding the centre of that particular camera view direction.

Only the centre facet would have the correct view, but the surrounding facets should look more or less correct, particularly if there's motion going on.

For example, if you're notionally using a 10 degree camera separation and a 20 degree FOV, then instead render every 30 degrees using a 40 degree FOV.

Draw half of the texture (i.e. half its width and height) to each of 9 facet viewports, using 0 / 0.25 / 0.5 width and height inset portions of it as appropriate.

That would let you divide the required number of renders by 9, hopefully without the fakery being too obvious.
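
The arithmetic in that example can be sketched as below. `UvRect` and `sharedRenderRects` are hypothetical names, and the sketch assumes, as the description above does, that angle maps linearly across the texture; a perspective projection only approximates that.

```cpp
#include <cassert>
#include <vector>

// Normalised rectangle inside the shared texture, 0..1 on both axes.
struct UvRect { double left, top, right, bottom; };

// Hypothetical helper: source rectangles for an n x n block of facets that
// all sample one wide render. Angles are in degrees.
std::vector<UvRect> sharedRenderRects(int n, double facetFov, double separation)
{
    assert(n > 0 && facetFov > 0.0 && separation > 0.0);

    // FOV of the single wide render, e.g. 20 + 2 * 10 = 40 degrees.
    const double wideFov = facetFov + (n - 1) * separation;
    const double size = facetFov / wideFov;   // fraction of texture per facet
    const double step = separation / wideFov; // inset added per facet

    std::vector<UvRect> rects;
    for (int row = 0; row < n; ++row)
        for (int col = 0; col < n; ++col)
            rects.push_back({ col * step, row * step,
                              col * step + size, row * step + size });
    return rects;
}
```

Calling `sharedRenderRects(3, 20.0, 10.0)` yields nine rectangles, each half the texture wide and high, with insets of 0, 0.25 and 0.5, matching the numbers quoted above.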

rogerborg · Post by **rogerborg** » Wed Aug 27, 2008 6:00 pm

Well, I had a laugh doing it.

Remember to do a release build of both Irrlicht and your app.

#include <irrlicht.h>
#include <iostream>

using namespace irr;
using namespace core;
using namespace scene;
using namespace video;
using namespace io;
using namespace gui;

#pragma comment(lib, "Irrlicht.lib")

class MyEventReceiver : public IEventReceiver
{
public:
    // This is the one method that we have to implement
    virtual bool OnEvent(const SEvent& event)
    {
        // Remember whether each key is down or up
        if (event.EventType == irr::EET_KEY_INPUT_EVENT)
            KeyIsDown[event.KeyInput.Key] = event.KeyInput.PressedDown;

        return false;
    }

    // This is used to check whether a key is being held down
    virtual bool IsKeyDown(EKEY_CODE keyCode) const
    {
        return KeyIsDown[keyCode];
    }

    MyEventReceiver()
    {
        for (u32 i=0; i<KEY_KEY_CODES_COUNT; ++i)
            KeyIsDown[i] = false;
    }

private:
    // We use this array to store the current state of each key
    bool KeyIsDown[KEY_KEY_CODES_COUNT];
};


int main()
{
    // I'd normally use OpenGL, but render to texture seems much, much faster in D3D.
    // Also, the OpenGL render-to-texture is upside down...
    // Create a screen that's power of 2 sized, or else the render target
    // (which is a pow^2 texture) may be larger than the screen, and refuse to set itself.
    IrrlichtDevice *device =
        createDevice(EDT_DIRECT3D9, core::dimension2d<s32>(1024, 512));
    if (device == 0)
        return 1; // could not create selected driver.
   
    MyEventReceiver receiver;
    device->setEventReceiver(&receiver);   

    video::IVideoDriver* driver = device->getVideoDriver();
    scene::ISceneManager* smgr = device->getSceneManager();

    device->getFileSystem()->addZipFileArchive("../../media/map-20kdm2.pk3");

    scene::IAnimatedMesh* mesh = smgr->getMesh("20kdm2.bsp");
    scene::ISceneNode* node = 0;
   
    if (mesh)
        node = smgr->addOctTreeSceneNode(mesh->getMesh(0), 0, -1, 1024);

    if (node)
        node->setPosition(core::vector3df(-1300,-144,-1249));

    ICameraSceneNode * camera = smgr->addCameraSceneNode();

    // Play with these values
    const f32 totalFov = 180.f;
    const int facets = 7; // Should be odd.
    const int subFacetsPerFacet = 5; // Should be odd

    const f32 cameraFovDegrees = totalFov / ((facets + 1) / 2.f);
    const f32 cameraSeparationDegrees = cameraFovDegrees / 2.f;

    const dimension2di facetScreenDimensions(driver->getScreenSize().Width / facets,
                                             driver->getScreenSize().Height / facets);

    camera->setFOV(cameraFovDegrees * DEGTORAD);

    device->getCursorControl()->setVisible(false);

    u32 then = device->getTimer()->getTime();

    ITexture * renderTarget = driver->createRenderTargetTexture(facetScreenDimensions);

    const int middleFacet = ((facets - 1) / 2);
    int drawFacetsFromMiddle = 0;

    while(device->run())
    if (device->isWindowActive())
    {
        const u32 now = device->getTimer()->getTime();
        const f32 delta = (f32)(now - then) / 1000.f;
        then = now;

        driver->beginScene(false, true, video::SColor(0,200,200,200));

        const vector3df baseCameraDir = (camera->getTarget() - camera->getAbsolutePosition()).normalize();
        vector3df cameraRotations = baseCameraDir.getHorizontalAngle();

        vector3df minimumRotations = cameraRotations;
        minimumRotations.Y -= (cameraSeparationDegrees * ((facets - 1) / 2));
        minimumRotations.X -= (cameraSeparationDegrees * ((facets - 1) / 2));

        position2di drawPosition(0, 0);

        for(int facetV = 0; facetV < facets; ++facetV)
        {
            int diffFromMiddle = abs_(middleFacet - facetV);
            if(diffFromMiddle > drawFacetsFromMiddle)
                continue;

            drawPosition.Y = facetV * facetScreenDimensions.Height;
            cameraRotations.X = minimumRotations.X + (facetV * cameraSeparationDegrees);

            const f32 rowInset = (facetV % 2) ? 0.0f : -.50f;

            for(f32 facetH = rowInset; facetH < facets; facetH += 1.f)
            {
                diffFromMiddle = abs_(middleFacet - (int)facetH);
                if(diffFromMiddle > drawFacetsFromMiddle)
                    continue;

                drawPosition.X = (s32)(facetH * facetScreenDimensions.Width);
                cameraRotations.Y = minimumRotations.Y + (facetH * cameraSeparationDegrees);

                vector3df lookAtVector(0, 0, 100);
                lookAtVector.rotateYZBy(cameraRotations.X);
                lookAtVector.rotateXZBy(-cameraRotations.Y);

                camera->setTarget(camera->getAbsolutePosition() + lookAtVector);

                driver->setRenderTarget(renderTarget, true, true);
                smgr->drawAll();

                driver->setRenderTarget(0, false, false);

                rect<s32> sourceRect(0, 0, 0, 0);
                rect<s32> destRect(0, 0, 0, 0);

                for(int subV = 0; subV < subFacetsPerFacet; ++subV)
                {
                    sourceRect.UpperLeftCorner.X = 0;
                    sourceRect.UpperLeftCorner.Y = facetScreenDimensions.Height * subV / (subFacetsPerFacet + 1);
                    destRect.UpperLeftCorner.Y = drawPosition.Y + (subV * facetScreenDimensions.Height / subFacetsPerFacet);

                    for(int subH = 0; subH < subFacetsPerFacet; ++subH)
                    {
                        sourceRect.UpperLeftCorner.X = facetScreenDimensions.Width * subH / (subFacetsPerFacet + 1);
                        sourceRect.LowerRightCorner.X = sourceRect.UpperLeftCorner.X + facetScreenDimensions.Width / (1 + (subFacetsPerFacet / 2));
                        sourceRect.LowerRightCorner.Y = sourceRect.UpperLeftCorner.Y + facetScreenDimensions.Height / (1 + (subFacetsPerFacet / 2));

                        destRect.UpperLeftCorner.X = drawPosition.X + (subH * facetScreenDimensions.Width / subFacetsPerFacet);
                        destRect.LowerRightCorner.X = destRect.UpperLeftCorner.X + (facetScreenDimensions.Width / subFacetsPerFacet) + 1;
                        destRect.LowerRightCorner.Y = destRect.UpperLeftCorner.Y + (facetScreenDimensions.Height / subFacetsPerFacet) + 1;

                        driver->draw2DImage(renderTarget, destRect, sourceRect);
                    }
                }
            }
        }

        drawFacetsFromMiddle++;
        if(drawFacetsFromMiddle > middleFacet)
            drawFacetsFromMiddle = 0;

        cameraRotations = baseCameraDir.getHorizontalAngle();

        if(receiver.IsKeyDown(KEY_LEFT))
            cameraRotations.Y -= delta * 100.f;
        if(receiver.IsKeyDown(KEY_RIGHT))
            cameraRotations.Y += delta * 100.f;
        if(receiver.IsKeyDown(KEY_UP))
            cameraRotations.X -= delta * 100.f;
        if(receiver.IsKeyDown(KEY_DOWN))
            cameraRotations.X += delta * 100.f;

        vector3df lookAtVector(0, 0, 100);
        lookAtVector.rotateYZBy(cameraRotations.X);
        lookAtVector.rotateXZBy(-cameraRotations.Y);
        const vector3df cross = lookAtVector.crossProduct(camera->getUpVector());

        vector3df cameraPosition = camera->getAbsolutePosition();

        if(receiver.IsKeyDown(KEY_KEY_W))
            cameraPosition += lookAtVector * delta;
        if(receiver.IsKeyDown(KEY_KEY_S))
            cameraPosition -= lookAtVector * delta;
        if(receiver.IsKeyDown(KEY_KEY_A))
            cameraPosition += cross * delta;
        if(receiver.IsKeyDown(KEY_KEY_D))
            cameraPosition -= cross * delta;

        camera->setPosition(cameraPosition);
        camera->setTarget(cameraPosition + lookAtVector);
        camera->updateAbsolutePosition();

        driver->endScene();
    }

    renderTarget->drop();
    device->drop();
    return 0;
}

sio2 · Post by **sio2** » Wed Aug 27, 2008 7:46 pm

SimonHawe wrote: Yeah I already thought that I can not accelerate it anymore

1. Ensure all scene data is in static VBO's.
2. MRT
3. Huge RTT target; subdivide into virtual viewports
4. Hardware instancing/shader instancing.

That's just a few ideas off the top of my head.
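
One possible reading of item 3, sketched under assumptions: a single large render target is cut into a grid of cells, and each camera view is drawn into its own cell. `Viewport` and `virtualViewports` are hypothetical names, and the 32 x 32 cell size used in the note below is only an example. In Irrlicht each cell would presumably be passed to the driver's `setViewPort()` before calling `drawAll()`.

```cpp
#include <cassert>
#include <vector>

// Pixel rectangle inside the big render target (right/bottom exclusive).
struct Viewport { int left, top, right, bottom; };

// Hypothetical helper: cut a targetW x targetH render target into a grid of
// equally sized cells, one per camera view, ordered row by row.
std::vector<Viewport> virtualViewports(int targetW, int targetH,
                                       int cellW, int cellH)
{
    assert(cellW > 0 && cellH > 0);
    assert(targetW >= cellW && targetH >= cellH);

    const int cols = targetW / cellW; // whole cells that fit horizontally
    const int rows = targetH / cellH; // whole cells that fit vertically

    std::vector<Viewport> cells;
    for (int row = 0; row < rows; ++row)
        for (int col = 0; col < cols; ++col)
            cells.push_back({ col * cellW, row * cellH,
                              (col + 1) * cellW, (row + 1) * cellH });
    return cells;
}
```

A 1024 x 1024 target with 32 x 32 cells gives 1024 viewports, enough for the roughly 1000 views discussed earlier in the thread, so the render target would only need to be set once per complete image rather than once per view.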

scotchfaster · Post by **scotchfaster** » Fri Oct 17, 2008 8:15 pm

sio2 wrote:
SimonHawe wrote: Yeah I already thought that I can not accelerate it anymore
1. Ensure all scene data is in static VBO's.
2. MRT
3. Huge RTT target; subdivide into virtual viewports
4. Hardware instancing/shader instancing.

That's just a few ideas off the top of my head.

Question on #3: I have a similar problem in that I'm rendering to a big (512 x 512) texture, and I'm finding that CD3D9Texture::lock() is eating up 65% of my program's time. Would virtual viewports and smaller RTTs help, and how exactly do I make virtual viewports? Would I just render the same scene many times to smaller textures?

Thanks!