Speed And Modifications

playerdark
Posts: 64
Joined: Mon Aug 01, 2005 5:06 am

Post by playerdark »

Hello,

First of all, I want to apologize if this topic has been discussed before, and I am sure it has. However, it is quite an important topic for me right now, so I would just like to throw in a few observations and describe my current situation; maybe something constructive comes out of it.

I started over a year ago with Irrlicht. We are working on a small commercial MMPG, actually version 2; the current version is already running satisfactorily, but only with Ultima Online style graphics that use simple DX7 surface sprites. Now we were looking for a basic 3D library that saves us the bother of doing all the low-level nitty-gritty stuff. Irrlicht was one of the first choices and I started by making a small demo. I used Cal3D for the animation, which worked quite well; however, we decided to give Ogre a try since it's so hyped. After wasting quite some time without any results and not much response from the Ogre community, we tried other commercial libraries out there, but they were too restricted. No other library offers the flexibility, ease of use and relative power that Irrlicht offers, so let me say a big thank you to Niko and the people who help make this possible :D

But now comes my problem. I never realized before that Irrlicht has some restrictions based on its design, and those restrictions are a real showstopper for our project. I'm sure it's my fault for not investigating this earlier, but I realized too late that Irrlicht has no support for DX hardware buffers: all vertices and indices are kept in user memory, which is a great performance hit. Furthermore, there is no flexible vertex format. After replacing Cal3D with our own skeletal system, we wanted to do hardware skinning and realized that the vertex format did not allow us to pass the necessary information to the shaders. Then there's the GUI, which seems to draw every single character to the screen in a separate blit operation instead of building up a buffer that can be reused over multiple frames. I haven't investigated that too deeply yet, since I only just discovered it; perhaps there's a way around this. I am sure other problems will show up as I go, but that's what I have seen so far.

Now, I'm not complaining. We have started to alter Irrlicht to add the missing stuff: a new vertex format, new hardware vertex and index buffers, new render functions and so on. It's all relatively easy since it is all restricted to the DX9 driver, the null driver and the IDevice.h file (btw, thanks for taking out the duplicate .h files that used to be in both source and include :) ), so we're making good progress and I am sure we can alter Irrlicht enough to get the speed factor we need. The problem with that is, of course, that we decouple ourselves from the public Irrlicht development. Should there ever be a significantly improved Irrlicht 2, it will be hard for us to switch.

So the question now is whether anybody here is interested in the changes we are making. We would be willing to upload the modified Irrlicht engine if somebody wants to, and can, implement those changes in Irrlicht. I wouldn't assume that anybody would be interested in our work if this were a bigger project like Ogre, but from what I gathered on the forums I feel that there might be sufficient interest here for the changes we make. If not, forget my post :P

I should add that we are only interested in DX9 rendering. We don't use OpenGL or the software renderer, so sorry if we focus only on the DX9 side. If somebody is interested in this sort of thing, please let me know.
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Post by hybrid »

All this comes up from time to time, you're right. But since we have not yet found a satisfactory solution, you should throw in your design and thoughts and maybe push the development to a point where we can fix the final architecture and add all this stuff to Irrlicht. So go for it :!:
playerdark
Posts: 64
Joined: Mon Aug 01, 2005 5:06 am

Post by playerdark »

I think I can now release the source bits that I added. Like I said, this is not a smooth design, nor is it in any way general; for example, my changes work only with DX9 and probably only with Shader 3.0. I also have no step-by-step instructions for changing the source. I am just including the source files I changed here.

Basically the changes consist of the following things:

- Added new vertex types to D3DVertex.h which can be used with shaders only. You will have to load your own shaders to use these vertices. There is one vertex type that allows up to 4 bone IDs and bone weights, as well as a precalculated morph value, to be passed to the shader.

- The DX9 device registers those vertex types during device creation and stores the DX vertex descriptors in a static array.

- There is a new draw function at the end of the d3d9device.cpp code, actually in 2 versions. One is for the vertices with the bone weights, either with or without the additional morph channel. The morph channel is supplied as an additional input stream, so the same function serves skeletal meshes that use morphs (faces) and those that do not (the rest of the body).

- The other draw function is for the simpler vertex types.

- None of the vertex types carries a normal, because I do precalculated lighting and don't care much about dynamic light effects.

- There are vertex and index buffers that you can allocate in the D3D 9 device. These handles are actually what you have to pass to the draw functions. This way, you can hold your buffers on the graphics card, as opposed to the standard Irrlicht way of transporting the whole buffer to the graphics card each frame with the UP draw functions.
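
To make the vertex description above a bit more concrete, here is a minimal CPU-side sketch of what such a layout might look like. The struct names, field order and the `normalizeWeights` helper are assumptions for illustration only; they are not the types from the modified D3DVertex.h.

```cpp
#include <cassert>
#include <cstdint>

// Stream 0 (static): hypothetical skinned vertex. There is no normal because
// lighting is assumed to be precalculated into the vertex colour.
struct SkinnedVertex {
    float         position[3];
    std::uint32_t color;          // prelit colour, packed ARGB
    float         uv[2];
    std::uint8_t  boneIds[4];     // up to four bones; unused slots get weight 0
    float         boneWeights[4]; // blend weight for each bone slot
};

// Stream 1 (dynamic, optional): hypothetical per-vertex morph data, kept in a
// separate struct so meshes without morphs never have to carry it.
struct MorphVertex {
    float offset[3];
};

// Rescale the weights so they sum to 1, which a skinning shader usually
// expects. Returns false if the vertex is not bound to any bone.
bool normalizeWeights(SkinnedVertex& v) {
    float sum = 0.0f;
    for (float w : v.boneWeights) {
        assert(w >= 0.0f);
        sum += w;
    }
    if (sum <= 0.0f) return false;
    for (float& w : v.boneWeights) w /= sum;
    return true;
}
```

Keeping the morph data in its own struct mirrors the two-stream idea from the list: the static part can stay in a hardware buffer while only the small morph part changes.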

There are some changes to the GUI elements as well, mostly static text and button since I deem them most important. The other GUI elements can be brought up when the user needs to enter something and then removed from screen, although it would be good if changes like the ones I made were commonplace in all classes.

Basically, there's a render target in the GUI element base class which is used by the static text and the button when these elements are dirty. Once the render target is filled with the letters and the button graphic, the element is declared clean. In subsequent draw() calls this buffer is drawn instead of the text. This can save several hundred draw calls each frame!

The GUI is a major slowdown due to the fact that each letter or element is drawn to the screen each time. A solution like the one shown here is crucial for its speedup. The dirty mechanism could be much more sophisticated. For example, I use a 32-bit value with only 2 bits used, one for a size change and one for a text change. By making this more detailed, superfluous redraws could be prevented. Like I said, for me the most important thing is to get Irrlicht to a stage where I can use it for my project; I don't have the time to think about and test general solutions.
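
The dirty-flag caching described above can be sketched roughly as follows. This is a stand-alone mock, not the modified Irrlicht GUI code: `CachedStaticText` and the flag names are hypothetical, and draw calls are only counted rather than issued.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Two bits of a 32-bit value, as in the post: one for size, one for text.
enum DirtyFlags : std::uint32_t {
    DIRTY_NONE = 0,
    DIRTY_SIZE = 1u << 0,
    DIRTY_TEXT = 1u << 1,
};

// Hypothetical stand-in for a static text element with a cached render target.
class CachedStaticText {
public:
    explicit CachedStaticText(std::string text) : text_(std::move(text)) {}

    void setText(const std::string& text) {
        if (text != text_) { text_ = text; dirty_ |= DIRTY_TEXT; }
    }

    void setSize(int width, int height) {
        assert(width > 0 && height > 0);
        if (width != width_ || height != height_) {
            width_ = width;
            height_ = height;
            dirty_ |= DIRTY_SIZE;
        }
    }

    // Returns how many draw calls this frame would have cost.
    std::size_t draw() {
        std::size_t calls = 0;
        if (dirty_ != DIRTY_NONE) {
            calls += text_.size(); // one blit per glyph into the cache
            dirty_ = DIRTY_NONE;   // the cache is now up to date
        }
        calls += 1;                // one blit of the cached render target
        return calls;
    }

private:
    std::string   text_;
    int           width_  = 100;
    int           height_ = 20;
    std::uint32_t dirty_  = DIRTY_SIZE | DIRTY_TEXT; // nothing cached yet
};
```

With this scheme a label costs one blit per glyph only on the frame where it changes, and a single blit on every other frame.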

If anybody is interested in the demo I have written with this system (which also contains some shader examples), you can download it here.
playerdark
Posts: 64
Joined: Mon Aug 01, 2005 5:06 am

Post by playerdark »

Oh, I forgot: I made some changes to the matrix class as well. By eliminating the array access to the elements internally, some common functions like matrix multiplication can gain 5-15% speed.
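
The post does not show the matrix changes, so the following is only one possible reading of "eliminating the array access": instead of going through a checked element accessor in a triple loop, each row is loaded into locals once and the products are written out explicitly. `Mat4` and both function names are made up for this sketch, and any speed difference would have to be measured on the actual compiler and hardware.

```cpp
#include <cassert>

// Hypothetical row-major 4x4 matrix; not Irrlicht's matrix class.
struct Mat4 {
    float m[16];

    // Checked element accessor, as a generic loop-based version would use.
    float at(int row, int col) const {
        assert(row >= 0 && row < 4 && col >= 0 && col < 4);
        return m[row * 4 + col];
    }
};

// Baseline: triple loop going through the accessor.
Mat4 multiplyIndexed(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) sum += a.at(i, k) * b.at(k, j);
            r.m[i * 4 + j] = sum;
        }
    }
    return r;
}

// Variant: load one row of `a` into locals, then write out the four results
// for that row explicitly, with no accessor calls and no inner loops.
Mat4 multiplyUnrolled(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    const float* bm = b.m;
    for (int i = 0; i < 4; ++i) {
        const float a0 = a.m[i * 4 + 0];
        const float a1 = a.m[i * 4 + 1];
        const float a2 = a.m[i * 4 + 2];
        const float a3 = a.m[i * 4 + 3];
        r.m[i * 4 + 0] = a0 * bm[0] + a1 * bm[4] + a2 * bm[8]  + a3 * bm[12];
        r.m[i * 4 + 1] = a0 * bm[1] + a1 * bm[5] + a2 * bm[9]  + a3 * bm[13];
        r.m[i * 4 + 2] = a0 * bm[2] + a1 * bm[6] + a2 * bm[10] + a3 * bm[14];
        r.m[i * 4 + 3] = a0 * bm[3] + a1 * bm[7] + a2 * bm[11] + a3 * bm[15];
    }
    return r;
}
```

Both functions compute the same product, so the baseline can serve as a correctness check while profiling the variant.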
IPv6
Posts: 188
Joined: Tue Aug 08, 2006 11:58 am

Post by IPv6 »

Can you tell us more about implementing HW buffers? Did you add support only to your custom nodes, or is it propagated to all scene nodes? If so, how do you handle locking and filling within and across frames? Thank you!
playerdark
Posts: 64
Joined: Mon Aug 01, 2005 5:06 am

Post by playerdark »

I am not using the meshes that come with Irrlicht. Basically I use the scene nodes, derive my own class from them and then usually fill it with my mesh data. In the simplest case, the mesh data now consists of a handle to a vertex buffer and a handle to an index buffer, as created by the new functions in the DX9 device. Then I override the render() function, and there I call the new draw function in the DX9 device.
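
As a rough illustration of that pattern, here is a self-contained mock. `MockDevice`, `BufferHandle`, `createBuffer`, `drawBuffers` and `drawUserPointer` are invented names that only count uploads and draw calls; they are not the functions added to the DX9 driver. In real Irrlicht the node would derive from `ISceneNode` rather than the tiny `SceneNode` interface used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using BufferHandle = std::uint32_t;

// Hypothetical device: tracks how many bytes cross over to the "GPU".
class MockDevice {
public:
    // Upload once and get a handle back.
    BufferHandle createBuffer(const void* data, std::size_t bytes) {
        assert(data != nullptr && bytes > 0);
        bytesUploaded += bytes;
        sizes_.push_back(bytes);
        return static_cast<BufferHandle>(sizes_.size() - 1);
    }

    // Draw from resident buffers: no data is transferred.
    void drawBuffers(BufferHandle vertices, BufferHandle indices) {
        assert(vertices < sizes_.size() && indices < sizes_.size());
        ++drawCalls;
    }

    // "UP"-style draw: the whole mesh is transferred on every call.
    void drawUserPointer(const void*, std::size_t vertexBytes,
                         const void*, std::size_t indexBytes) {
        bytesUploaded += vertexBytes + indexBytes;
        ++drawCalls;
    }

    std::size_t bytesUploaded = 0;
    std::size_t drawCalls = 0;

private:
    std::vector<std::size_t> sizes_;
};

// Minimal stand-in for a scene node interface.
struct SceneNode {
    virtual ~SceneNode() = default;
    virtual void render(MockDevice& device) = 0;
};

// The node owns only two handles; the data itself lives with the device.
class HardwareMeshNode : public SceneNode {
public:
    HardwareMeshNode(MockDevice& device,
                     const std::vector<float>& vertices,
                     const std::vector<std::uint16_t>& indices)
        : vb_(device.createBuffer(vertices.data(),
                                  vertices.size() * sizeof(float))),
          ib_(device.createBuffer(indices.data(),
                                  indices.size() * sizeof(std::uint16_t))) {}

    void render(MockDevice& device) override { device.drawBuffers(vb_, ib_); }

private:
    BufferHandle vb_;
    BufferHandle ib_;
};
```

Rendering the node repeatedly leaves `bytesUploaded` unchanged after construction, whereas the `drawUserPointer` path would add the full mesh size on every frame.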

I should add that this approach of course leaves you with the problem of getting data in, since the usual Irrlicht art pipeline doesn't work. Again, I can only speak for myself here. I have created my own art pipeline, for example to extract skeletal info with the Cal3D exporter, and then created a converter to transfer it into my own binary format. In fact, I have a whole animation and world engine on top of Irrlicht, but with these modifications Irrlicht is a good base library that takes away all the work associated with dealing with 3DX and scene graph management.

Also, what I found convenient is to have separate, specialized scene nodes for certain things. At the application level I think generalization should not be used when it gets in the way of convenience or speed. I have one special scene node for the skybox, one for the water, two for bodies and a special one that is based on a modified Irrlicht terrain node, which now also uses HW buffers instead of meshes. It's relatively easy to change the terrain node code to use hardware buffers. If anybody needs it, I can upload it too. It may not work as-is, since I might have added some dependencies on my own library system, but as a template for doing your own terrain it's OK.
IPv6
Posts: 188
Joined: Tue Aug 08, 2006 11:58 am

Post by IPv6 »

It would be really nice if you would consider uploading examples of your HW-powered scene nodes here, as references and a starting point for further modifications. Besides that, learning from different implementations can help further generalization of this feature, so that it can be properly implemented at the core level (I think).

May I ask some more questions about HW nodes (I'm personally interested in that feature :) )? As I understand it, you are creating a new buffer for every new customized mesh. Are they static or dynamic and, if dynamic, did you add optimizations such as putting several animation stages into one HW buffer so that only the indices need tweaking (the main benefit of HW buffers)? Or are you using a base mesh and doing the whole animation in the shader? (BTW, what is the lowest shader version that can do animation entirely on the GPU?)
playerdark
Posts: 64
Joined: Mon Aug 01, 2005 5:06 am

Post by playerdark »

I can't really upload nodes, because my engine stores and loads objects at a much higher level. For example, the whole terrain is managed as one and then cut down into individual terrain nodes. The body system uses a skeleton and several nodes for body parts like the torso and wings, and the head is a complete morph animation system, so I don't really have separate nodes as such.

The buffers are as static as can be. For example, terrain nodes keep one vertex buffer with all the vertices, but the index buffer changes as the terrain node rebuilds the triangles that are visible. The water node has static vertices and indices and does the water animation in the vertex shader. The nodes for the torso also have static vertices and indices, but during the render process a set of bone matrices is passed into the shader. The shader then uses the bone indices associated with each vertex to look up the bone matrices and transform the vertices according to the actual skeleton position. On a modern graphics card you can have up to about 55-60 bones in one render call, perhaps more, so you can do hardware skinning with a lot of bones in one call. The head, which also needs morph animation for the facial expression, uses a static vertex buffer as stream source 0 but also a dynamic buffer as stream source 1 that contains only the morph information. By using multiple streams as shader input, you can hold the static data on the graphics card and only need to load the dynamic part every frame.
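
For readers new to skinning, the per-vertex work described above can be written out as a CPU reference. This is only a sketch of the general technique, not the shader from the demo; `Vec3`, `BoneMatrix`, `skinVertex` and `maxBonesPerCall` are hypothetical names, and the 3x4 matrix layout is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Affine bone transform: 3 rows, translation in the last column.
struct BoneMatrix { float m[3][4]; };

Vec3 transformPoint(const BoneMatrix& b, const Vec3& p) {
    return Vec3{
        b.m[0][0] * p.x + b.m[0][1] * p.y + b.m[0][2] * p.z + b.m[0][3],
        b.m[1][0] * p.x + b.m[1][1] * p.y + b.m[1][2] * p.z + b.m[1][3],
        b.m[2][0] * p.x + b.m[2][1] * p.y + b.m[2][2] * p.z + b.m[2][3]};
}

// What a skinning vertex shader computes per vertex: look up each referenced
// bone matrix, transform the rest-pose position, and blend by weight.
Vec3 skinVertex(const Vec3& restPosition,
                const std::array<std::uint8_t, 4>& boneIds,
                const std::array<float, 4>& weights,
                const std::vector<BoneMatrix>& palette) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < 4; ++i) {
        if (weights[i] == 0.0f) continue;        // unused bone slot
        assert(boneIds[i] < palette.size());
        const Vec3 t = transformPoint(palette[boneIds[i]], restPosition);
        out.x += weights[i] * t.x;
        out.y += weights[i] * t.y;
        out.z += weights[i] * t.z;
    }
    return out;
}

// Rough bone budget: constant registers left after other shader constants,
// divided by the registers one bone matrix occupies.
constexpr int maxBonesPerCall(int constantRegisters, int reservedRegisters,
                              int registersPerBone) {
    return (constantRegisters - reservedRegisters) / registersPerBone;
}
```

As a plausibility check on the bone count: assuming the 256 float4 constant registers that vs_3_0 guarantees, 4 registers per 4x4 matrix and a guessed 16 registers reserved for everything else, `maxBonesPerCall(256, 16, 4)` gives 60, which is in line with the 55-60 figure above. Packing bones as 3x4 matrices would raise that limit.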

What shaders you use really depends on your demands. I use shader 3.0 as the minimum because our bodies have a great number of bones and our game is scheduled for release in 2 years, so I can safely assume shader 3.0 as the minimum. But I guess you can apply those principles even with shader 1.0; it really depends more on how you write the shader than on what you do in Irrlicht.


By the way, I realize that not everyone who reads this wants to install the demo, so I am including some screenshots from an earlier version here, which I have on the server. The shadow doesn't work right now because the Irrlicht shadow node has problems when used in conjunction with shaders. I haven't figured out why. If somebody who reads this could give me a hint, I would be very grateful :)

[Three screenshots from an earlier version of the demo]
sio2
Competition winner
Posts: 1003
Joined: Thu Sep 21, 2006 5:33 pm
Location: UK

Post by sio2 »

IPv6 wrote:This would be really nice if you consider uploading here examples of your HW powered scene nodes
There are a couple of demos on my website that use the hardware more efficiently than current Irrlicht does. Also, my latest fur demo "KittyCat" uses IrrSpintz. IrrSpintz is a version of Irrlicht that has some support for DX9 Vertex/Index Buffers.
Virion
Competition winner
Posts: 2149
Joined: Mon Dec 18, 2006 5:04 am

Post by Virion »

I have a question: why not add the features of IrrSpintz to the Irrlicht engine? :?:
Spintz
Posts: 1688
Joined: Thu Nov 04, 2004 3:25 pm

Post by Spintz »

sio2 wrote:
IPv6 wrote:This would be really nice if you consider uploading here examples of your HW powered scene nodes
There are a couple of demos on my website that use the hardware more efficiently than current Irrlicht does. Also, my latest fur demo "KittyCat" uses IrrSpintz. IrrSpintz is a version of Irrlicht that has some support for DX9 Vertex/Index Buffers.
OpenGL also has support for Vertex and Index buffers.

You simply load a mesh and then call mesh->UploadMeshData( IVideoDriver* vDriver ). Then you can create as many nodes as you like from that mesh and they will all use the uploaded mesh data. The IAnimatedMesh and IMesh UploadMeshData calls differ only in that the IAnimatedMesh call creates a dynamic vertex buffer, so the data can be updated when animating.
Last edited by Spintz on Sun Mar 04, 2007 2:35 pm, edited 1 time in total.
IPv6
Posts: 188
Joined: Tue Aug 08, 2006 11:58 am

Post by IPv6 »

Thanks for the great details! Personally I am new to shaders (my current work is a casual 2D game, so I don't need them), but my next project will be in real 3D, so I'm looking for info on how they can be used and implemented efficiently. Doing skeletal animation at the shader level is simply amazing; I always wondered how it works in real games. Thanks for the info!!!

BTW: screenshots are not visible :/ 404

P.S. And thanks to Sprintz for sharing his code :) The first place where I started to search for HW-related implementations was irrSprintz. Now I'm looking for other ideas to cover all cases in general. It is a pity that it is not possible to merge it into Irrlicht: too many API breaks in too many areas... BTW, Sprintz, can you leave links to games that are using it? As I understand it, you are using it in professional development?
sio2
Competition winner
Posts: 1003
Joined: Thu Sep 21, 2006 5:33 pm
Location: UK

Post by sio2 »

IPv6 wrote:BTW: screenshots are not visible :/ 404
Do you mean the screenshots on my website? It could be a glitch from my host - give it another try. The screenshots are an essential part of my website. :mrgreen:
Spintz
Posts: 1688
Joined: Thu Nov 04, 2004 3:25 pm

Post by Spintz »

IPv6 wrote:...P.S. And thanks to Sprintz for sharing his code :) The first place where I started to search for HW-related implementations was irrSprintz. Now I'm looking for other ideas to cover all cases in general. It is a pity that it is not possible to merge it into Irrlicht: too many API breaks in too many areas... BTW, Sprintz, can you leave links to games that are using it? As I understand it, you are using it in professional development?
First off, there's no R in Spintz!!!! :P

At my job, we use a custom written 3D engine that was written by a guy that used to work for us a couple years ago. And honestly, I don't do much with it anymore, I'm on other projects at the moment.

My work on IRRspintz is completely hobby work, for fun. The only people that I know of who are using it are Warchief [ Warboard ], and sio2 has done the Kitty Fur Demo with IRRspintz. Oziriz, I believe, will start using it since I've fixed his OpenGL issues (I recently replaced all the extension crap with GLee; COpenGLDriver is SO MUCH CLEANER NOW and the support for extensions is better/easier to maintain as well).
IPv6
Posts: 188
Joined: Tue Aug 08, 2006 11:58 am

Post by IPv6 »

2Spintz: OOPS!!! I'm really sorry!! I just didn't notice it before, and it seems that I mixed up your name with the irr's flying around this forum :) Sorry!!!!!! No offense! :roll:

2playerdark: I see, I'm working through a proxy and it filters your website for some reason (unknown to me)... I will take a look from another location soon (just curious :) )