Optimizing code and managing your project

REDDemon · Post by **REDDemon** » Thu Mar 24, 2011 9:37 am

Here is a list of useful things that I have learned in the last few months:

There are some general rules for optimizing your code (I'm not saying that these rules are always valid):

0) Want to use/try C++11 features? See this:
http://gameprog.it/articles/90/c-11-get ... c4KNKwlGC8
At the moment GCC 4.8.x is the only compiler with full support. Clang has full support on paper, but I still see snippets from people that do not compile even though the code is correct (a few templates). Visual Studio is a big question mark.

1) Switch ON your compiler optimizations. Taking Code::Blocks as an example, you should enable O1 or O2 in debugging mode. Debugging with compiler optimization enabled is more helpful for finding bottlenecks (did you know that the Irrlicht image loaders are twice as fast with O2 or Oexpensive than without any optimization?). In my project (FreeImageMixer) I recompiled Irrlicht with only the image loaders and with expensive optimizations enabled. Images are now loaded very fast.

2) Be const correct:
http://irrlicht.sourceforge.net/phpBB2/ ... hp?t=37573
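
A minimal sketch of what const correctness looks like in practice. The `Counter` class and the `sumCounters` function are made-up names for illustration, they are not part of Irrlicht:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example class, not part of Irrlicht.
class Counter
{
public:
    explicit Counter(int start) : Value(start) {}

    // Does not modify the object, so it is marked const
    // and can be called through a const reference.
    int get() const { return Value; }

    // Modifies the object, so it is NOT const.
    void increment() { ++Value; }

private:
    int Value;
};

// Taking the vector by const reference avoids a copy and promises
// the caller that the counters are not changed.
int sumCounters(const std::vector<Counter>& counters)
{
    int total = 0;
    for (std::size_t i = 0; i < counters.size(); ++i)
        total += counters[i].get(); // OK: get() is a const member
    return total;
}
```

Calling `counters[i].increment()` inside `sumCounters` would be rejected by the compiler, which is exactly the kind of mistake const correctness is meant to catch.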

3) Don't abuse inlining:

There are only a few cases where inlining is useful:

- The assembly of the inlined function is shorter than the code for calling that function (and I think some compilers do that automatically).

- You are calling a small function a few times and from a loop. (If you start calling the function from everywhere you increase execution time, because you are wasting your limited cache memory.) But you never know where you are going to use a certain function.

- Sometimes you cannot inline functions when you are using DLLs/libraries; compilers usually give a warning.

- Remember that if you write the body of a function in the header that declares it, the function will be considered inline (that's why it is preferable to keep things separated in a header and a .cpp file). This also depends on the compiler.

- You are using a function just to call another function.
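
To illustrate the point about function bodies in headers, here is a small sketch. `Vector2` and the two file names in the comments are hypothetical; both parts are shown in one block only to keep the example short:

```cpp
// --- would live in Vector2.h (hypothetical file) ---
class Vector2
{
public:
    Vector2(float x, float y) : X(x), Y(y) {}

    // Defined inside the class body: implicitly inline.
    // Fine for trivial accessors like these.
    float getX() const { return X; }
    float getY() const { return Y; }

    // Only declared here; the body goes in the .cpp file,
    // so users of the header do not get it inlined implicitly.
    float lengthSquared() const;

private:
    float X, Y;
};

// --- would live in Vector2.cpp (hypothetical file) ---
float Vector2::lengthSquared() const
{
    return X * X + Y * Y;
}
```

Whether the compiler actually inlines either function is still its own decision; the keyword and the in-class definition are only hints.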

4) Don't abuse namespaces. Lots of libraries have classes or methods with the same name that are just put in different namespaces. "using namespace" is useful for testing some code in a simple way, but when you write libraries or parts of your game it is often bad, so try to avoid abusing it.
You should also put your own work in its own namespace if you are thinking of reusing it.
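
A small sketch of the kind of name conflict meant here. The namespaces `liba`, `libb` and `mygame` are invented for the example and stand in for two third-party libraries and your own code:

```cpp
#include <string>

// Two hypothetical libraries that both define a class called Timer.
namespace liba
{
    struct Timer { std::string name() const { return "liba::Timer"; } };
}

namespace libb
{
    struct Timer { std::string name() const { return "libb::Timer"; } };
}

// Your own reusable code lives in its own namespace as well.
namespace mygame
{
    // Fully qualified names stay unambiguous. With
    // "using namespace liba; using namespace libb;" at file scope,
    // a plain "Timer" here would be an ambiguous reference.
    std::string describeTimers()
    {
        liba::Timer a;
        libb::Timer b;
        return a.name() + " and " + b.name();
    }
}
```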

5) In a header, if you only use a pointer or a reference to a class, you don't need to include the header that defines that class. A forward declaration is enough.

Example: instead of

Code: Select all

 
#include <ISceneNode.h>
 
class MyLibUsingSceneNodes
{
public:
    void passSceneNodeAsPointer(irr::scene::ISceneNode* node);
};

you can write

Code: Select all

 
namespace irr
{
    namespace scene
    {
        // Forward declaration: only tells the compiler that the class exists,
        // which is enough for declaring pointers/references to ISceneNode.
        class ISceneNode;
    }
}
 
class MyLibUsingSceneNodes
{
public:
    void passSceneNodeAsPointer(irr::scene::ISceneNode* node = 0);
};

and then include <ISceneNode.h> only in the .cpp file

Code: Select all

 
#include "myclass.h"
#include <ISceneNode.h>
 
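void MyLibUsingSceneNodes::passSceneNodeAsPointer(irr::scene::ISceneNode* node)
{
    // The full class definition is visible here, so members can be used.
    // setVisible() is only a placeholder for whatever the function really does.
    if (node)
        node->setVisible(false);
}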

Why is that so good? Simple. Forward declarations are useful whenever a header only refers to a class without using its members. If you need to include lots of headers for each class you are probably doing bad coding. Having lots of stuff included in headers instead of source files increases compile time both for you and for your library's users. Not only that: having most of the code in headers (unless you are using templates) makes the compiler inline a lot of stuff it shouldn't, resulting in a bigger final executable.

6) Profiling your code helps you find where the execution time is spent. If you find that a function takes 80% of the total run time, maybe you should see whether you can improve it. With some compilers you can't profile while you are debugging. There are lots of ways to profile your code.
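
When no profiler is at hand, one of the simplest ways is to time a suspicious piece of code by hand. The following is only a sketch, assuming a C++11 compiler; `measureMilliseconds` and `sumUpTo` are hypothetical helpers written for this example:

```cpp
#include <chrono>

// Hypothetical helper: runs a callable once and returns the elapsed
// wall-clock time in milliseconds.
template <typename Func>
double measureMilliseconds(Func work)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    work();
    const std::chrono::steady_clock::time_point end =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Example workload to be measured: sum of 1..n.
long long sumUpTo(long long n)
{
    long long total = 0;
    for (long long i = 1; i <= n; ++i)
        total += i;
    return total;
}
```

It can be used as `double ms = measureMilliseconds([]() { sumUpTo(1000000); });`. A single run is noisy, so repeat the measurement several times and remember that an optimizing compiler may remove work whose result is never used.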

8) If possible don't use multithreading: it is an advanced feature that causes problems. Multithreading (MT) usually adds overhead, so a multithreaded program on 2 CPUs of 2 GHz each will be slower than the same program without MT on 1 CPU of 4 GHz (of course certain things are possible only with multithreading, so sometimes you must use MT). What you have to know is how MT works at the hardware level. Data is copied from RAM to cache, so a CPU working on some data may not see changes made to that data by another CPU when you expect it to, unless the accesses are synchronized. So in general it is better to avoid reading/writing the same data from multiple threads. And if you do need to access data from multiple threads, you need some synchronization primitive (lock, RW lock, semaphore, etc.). Every thread has its own stack.
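
A minimal sketch of the "lock" case, assuming a C++11 compiler with `std::thread` and `std::mutex` available. `SharedCounter` and `countWithTwoThreads` are hypothetical names used only for this example:

```cpp
#include <mutex>
#include <thread>

// Hypothetical shared counter: every access goes through the mutex,
// so two threads never touch Value at the same time.
class SharedCounter
{
public:
    SharedCounter() : Value(0) {}

    void add(int amount)
    {
        std::lock_guard<std::mutex> lock(Mutex); // released at end of scope
        Value += amount;
    }

    int get()
    {
        std::lock_guard<std::mutex> lock(Mutex);
        return Value;
    }

private:
    std::mutex Mutex;
    int Value;
};

// Starts two threads that each add 1 to the same counter 'perThread'
// times, waits for both to finish and returns the final value.
int countWithTwoThreads(int perThread)
{
    SharedCounter counter;
    auto worker = [&counter, perThread]() {
        for (int i = 0; i < perThread; ++i)
            counter.add(1);
    };
    std::thread first(worker);
    std::thread second(worker);
    first.join();
    second.join();
    return counter.get();
}
```

Without the lock the two threads would race on `Value` and the result would be unpredictable. With the lock the result is correct, but every `add` pays for the synchronization, which is the overhead mentioned above.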

9) Reducing code size is always good. The compiler does that for you, but you should also do it yourself. Commenting code is good. If you optimize a function you must keep the original somewhere (for example leave it commented out near the function). Optimized code is harder to understand, and 90% of the time you don't need to optimize it. Cleaning up and keeping code tidy is good, but don't spend 90% of your time on that or you will stop developing.

10) Keep a good style and stick to it. If you change style all the time you have to spend more time understanding what you wrote. There are good code formatters (Code::Blocks has one built in with several possible styles; I found that AStyle is good, but I usually add some tabs to preprocessor directives). The Irrlicht style is good because it comes from the Irrlicht developers, who have years of experience.

11) Templates are double-edged swords too. They are flexible but increase compiled code size. Just take a look at "irrMath.h" or at the list class to get an idea. Templated code is born over time: you first create some test code, then you improve it. You refactor the code several times, you add new features. When you find you need the same piece of code again and again, maybe it's possible to turn it into a template (and sometimes that is highly advisable).
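
As a sketch of the trade-off, here is a small function template in the spirit of the helpers found in math headers. `clampValue` is a hypothetical name written for this example, not the Irrlicht function:

```cpp
// Hypothetical function template. It works for any type that supports
// operator<, but the compiler generates a separate copy of the code for
// every distinct T it is used with (int, float, ...), which is where the
// extra compiled size comes from.
template <typename T>
T clampValue(const T& value, const T& low, const T& high)
{
    if (value < low)
        return low;
    if (high < value)
        return high;
    return value;
}
```

`clampValue(5, 0, 3)` instantiates the `int` version and `clampValue(-1.5f, 0.0f, 1.0f)` the `float` version, so this one piece of source replaces two hand-written functions.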

12) In C++ a lot of work is done by external multimedia libraries. The most performance is gained when you REALLY KNOW THE LIBRARIES YOU ARE USING. Using a good library in the right way is the best way to make a fast and stable program. Using too many libraries is not very good (not because of performance, but because you are adding dependencies which can introduce new bugs and need specific builds for each platform).

13) A game essentially needs to process some data. If the data is not processed the game can't be played. That's why over-abstracting is not always good (it can simplify certain problems, but you take a performance hit). A good design can help you a lot with that (there are lots of websites and books about the topic).

17) Try to avoid debug-specific code. You probably don't want to debug the debugging code.

18) Use "assert" (#include <cassert>). You need to break your program as soon as possible when an error occurs. This lets you fix bugs early and makes it almost impossible to ignore certain bugs.
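
A minimal sketch of how that looks in code. The `average` function is a hypothetical helper invented for the example:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the average of 'count' values.
float average(const float* values, std::size_t count)
{
    // Break immediately on invalid input instead of returning garbage
    // that shows up as a strange bug somewhere else later.
    assert(values != 0 && "average: null array");
    assert(count > 0 && "average: empty array");

    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
        sum += values[i];
    return sum / static_cast<float>(count);
}
```

Keep in mind that asserts are compiled out when `NDEBUG` is defined (typical for release builds), so they are a debugging aid and not a replacement for real error handling.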

19) Always enable all compiler warnings. You won't believe how many bugs get fixed that way.

20) read C++ FAQs.

21) "So you want to write a game" (I don't know why I didn't link that before! You must read it)
http://irrlicht.sourceforge.net/forum/v ... =5&t=43770

AND IF YOU ARE INTERESTED IN INSANE OPTIMIZATIONS (so.. if you have time to lose):

7) When you release your project you should try different compiler optimization configurations (stripping symbols from the binary is a must-do). Sometimes a particular setting is faster than another one. Unluckily, with most IDEs you can change those optimizations only per project (in C::B you can achieve it per file, but that changes your file to read-only). In Visual Studio it is easier.

Optimizations like O1, O2, O3 are mutually exclusive. In general you should prefer to enable those optimizations in code which has many loops (for, while, etc.). In other code you should prefer Osize, which reduces compiled size and improves caching (indirectly increasing performance).

The ideal solution would be to use different optimization settings for each source file, but the performance gain is unpredictable without profiling, and there are millions (or more) of possible compiler settings. So you should only think about this for bottleneck code (there is a lot of research about that). On more modern machines Osize hurts performance, because they have very big caches and so the cache improvement is secondary compared to platforms like consoles. The most common optimization level is O2, which also reduces the size (though not as much as Osize). Reduced size is also very valuable if your executable/DLL will be downloaded from your server. You should also think about using LZMA-compressed packages. There are also people who study which formats compress best.

14) A single compilation unit is a technique that lets you produce lightweight and fast code and also cuts compile time several times over.
http://en.wikipedia.org/wiki/Single_Compilation_Unit

BE AWARE: a small change to your code requires recompiling the whole unit. On certain compilers using a single compilation unit brings almost no improvement, because those compilers always do that optimization anyway. But that's also the point: a single compilation unit optimizes inter-dependent code through better automatic inlining of functions and better register usage.

As an example: in my engine there are several subsystems. Each subsystem has few dependencies on the other subsystems, so making a single compilation unit for the whole engine does not give all the benefits. What I did was make a single compilation unit for every subsystem of my engine. Then I tried to find the best compiler options for each subsystem. Doing that I got a 30% smaller executable and 20% more speed (of course only the software side is sped up, not draw calls). Since there are only a few subsystems, that was not hard.

15) Use "memcpy" instead of copying arrays with a "for" loop. memcpy is optimized on most compilers and is also faster with aligned memory. Don't believe people who have a faster memcpy hand-coded in C. I tried several solutions and all were slower than the original memcpy. Code faster than memcpy is possible only on certain platforms with old compilers.
Reference for the memcpy function:
http://www.cplusplus.com/reference/clib ... ng/memcpy/

You should note that the behaviour is undefined if the destination overlaps the source bytes (it can crash, it can copy correctly, but it can also corrupt data). For small arrays a loop is probably better, but you never know unless you look at the disassembly and profile your app. You may also find that memcpy is faster for small arrays too.

memcpy is faster. If you profiled and found it slower, that is probably simply because the data for your for-loop was already cached. When your data gets moved from cache back to RAM you will have some overhead (which you should have included in the profile results. Don't worry, bad profiling is pretty common; nanosecond timers alone are not enough). But on some platforms memcpy can also copy data which is not cached using DMA, and that's way better.
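
A short sketch of both cases. `copyInts` and `shiftRightByOne` are hypothetical helpers; the second one uses `std::memmove`, which is the standard function for ranges that overlap:

```cpp
#include <cstddef>
#include <cstring>

// Copies 'count' ints from source to destination.
// The two ranges must NOT overlap, otherwise the behaviour is undefined.
void copyInts(int* destination, const int* source, std::size_t count)
{
    std::memcpy(destination, source, count * sizeof(int));
}

// Shifts the first 'count' elements one slot to the right inside the
// same array (the array must have room for count + 1 elements).
// Source and destination overlap here, so memmove is used instead.
void shiftRightByOne(int* data, std::size_t count)
{
    std::memmove(data + 1, data, count * sizeof(int));
}
```

Note that the size argument is in bytes, not in elements; forgetting the `sizeof` is a classic source of the weird crashes people blame on memcpy. Both functions are only safe for plain data types such as `int`, `float` or simple structs, not for classes with their own constructors or virtual functions.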

16) Memory alignment increases performance in just a few places, but it lets you use SIMD techniques. Memory alignment also increases memcpy speed.
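
A minimal sketch of requesting alignment, assuming a C++11 compiler (`alignas`/`alignof`). `Vec4` and `isAligned` are hypothetical names; 16 bytes is chosen because it is the alignment commonly wanted for 4-float SIMD registers:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 4-float vector whose storage always starts
// on a 16-byte boundary.
struct alignas(16) Vec4
{
    float x, y, z, w;
};

// Returns true if the address is a multiple of 'alignment' bytes.
bool isAligned(const void* pointer, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(pointer) % alignment == 0;
}
```

Older compilers need their own extensions for the same thing, so check your compiler's documentation if `alignas` is not available.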

22) Learn the basics of your IDE: you should always know how to change compiler settings, how to debug, how to profile, and how to look at the disassembly. Shorter assembly code does not mean faster code. Certain instructions take more time and can stall your CPU's pipeline.

Feel free to contribute

Radikalizm · Post by **Radikalizm** » Thu Mar 24, 2011 1:35 pm

A couple of things:

Be careful with forward declarations (which you mentioned in #5). Overusing these can lead to problems while writing code, since you're going to expect the class functionality to be available, which isn't the case with a forward declaration.
I stick with the rule that forward declarations should only be used when needed to avoid possible cyclic references or to hide internal data from users.
Proper header formatting (include guards) guarantees that code found inside headers will not be included twice in the same translation unit, if the concern is purely about executable size and compilation time:

Code: Select all

#ifndef HEADER_IDENTIFIER // HEADER_IDENTIFIER should be replaced with a unique identifier for this header file
#define HEADER_IDENTIFIER

/*Header body goes here*/

#endif

If an include in a header file creates such an overhead during compile-time, you should think about refactoring some code

I don't know which experiences you have had with multithreading, but looking at your post it seems that they weren't all that great
Multithreading is an intimidating concept to understand and implement, but I wouldn't advise people not to use it, I would advise people to look into it and to get a good understanding of the concepts
The important thing is that people should understand what kind of multithreading models there are, which situations they can be used in and which applications could benefit from a multithreaded implementation
It is true that multithreading can and will create an overhead in simple scenes, but when it comes to complex scenes with lots of good candidates for parallelization it can definitely give a nice performance boost
The performance gain of multithreaded systems lies with the system implementation, its elegance (eg. the system being able to switch to single-threaded mode for simple scenes) and an optimal parallelization of the target code (see Amdahl's law: http://en.wikipedia.org/wiki/Amdahl's_law)

I think I can agree with you on most of the other points you made, although I wouldn't say namespaces are bad as part of a library or a game as long as they are not over-used

REDDemon · Post by **REDDemon** » Tue Mar 29, 2011 6:15 am

Include guards! I forgot the most important thing!

As you said, multithreading is not a simple topic.

About my experience: I don't use multithreading in video games (not for now).

At first I used MT for image processing with multiple CPUs (a very simple way of using multithreading where you really get a nice performance boost). Now I have also switched to the GPU, which is 1000 times faster at image processing and texture generation.

By default in my library small images are processed in parallel CPU threads, while for bigger images the GPU is used. I wanted to use OpenCL but my laptop seems unable to support the latest drivers, so I can only play with old shaders.
You are probably experienced enough in MT, but I don't think I will use it in a game; most of the code I use is sequential and can't be parallelized very well. Of course certain specific tasks are possible only using MT (parallel file loading, optimized A*, certain AI algorithms, most audio libraries play sounds in threads, etc.).

Of course I think you will use MT in your game engine, and that's good if you provide some parallel functions. Maybe your innovative approach can be good for new developers too; in general new developers have to think about whether doing MT is really necessary.

I never said that namespaces are always bad. I was referring to the "using namespace" directive, which must not be abused since it can sometimes lead to name conflicts (updated the first post).

Radikalizm · Post by **Radikalizm** » Tue Mar 29, 2011 11:51 am

REDDemon wrote: I never said that namespaces are always bad. I was referring to the "using namespace" directive, which must not be abused since it can sometimes lead to name conflicts (updated the first post).

Ah yes, I misunderstood you there

But you are right, the 'using' directive should not be abused

Brainsaw · Post by **Brainsaw** » Tue Mar 29, 2011 1:50 pm

Furthermore: on my system the Code::Blocks code completion doesn't work if I use too many namespaces with "using <namespace>".

ChaiRuiPeng · Post by **ChaiRuiPeng** » Tue Mar 29, 2011 3:37 pm

a simple solution i am phasing into my codebase is to use unique prefixes like the Bullet library does.

i always do this

trGameObject

trAnotherClass

although only for classes and structs that are in the global scope

i still have a namespace, but i am gradually phasing that out in my files.

bitplane · Post by **bitplane** » Tue Mar 29, 2011 4:04 pm

ChaiRuiPeng wrote:i still have a namespace, but i am gradually phasing that out in my files.

Why? Namespaces make your code less ugly and easier to read, they basically just add your "tr" prefix but allow you to use a longer, more specific prefix instead. Using prefixes like that is an old-hat C thing that we had to do in the old days, but nowadays it's an ugly way to do things. Dumping all your classes in the root namespace isn't a very tidy thing to do.

REDDemon · Post by **REDDemon** » Wed Mar 30, 2011 7:45 am

Yup, there are also coding style guides that forbid that. Maybe I can update and add some links to them.

@Brainsaw:

I know that, C::B has bad code completion. It is very fast when you start typing with a namespace like:

Code: Select all

irr::scene::...

(so that's another reason to put everything in a namespace)
Of course you can avoid that by improving the code completion script, but this will slow it down a lot.

Another issue is that C::B is missing method descriptions when you are doing code completion.

So it is not like Visual Studio, where every function shows its description.

Besides that, C::B is open source, cross-platform and free.

pippy3 · Post by **pippy3** » Wed Mar 30, 2011 7:57 am

bitplane wrote:
ChaiRuiPeng wrote:i still have a namespace, but i am gradually phasing that out in my files.
Why? Namespaces make your code less ugly and easier to read, they basically just add your "tr" prefix but allow you to use a longer, more specific prefix instead. Using prefixes like that is an old-hat C thing that we had to do in the old days, but nowadays it's an ugly way to do things. Dumping all your classes in the root namespace isn't a very tidy thing to do.

namespaces are handy for libraries, but for local code it's a bit too much imo.

REDDemon · Post by **REDDemon** » Wed May 18, 2011 9:19 am

added points 12 and 13 and edited point 11.

REDDemon · Post by **REDDemon** » Sat Apr 14, 2012 9:10 am

I continued to edit the main post for several months. Now some bump. I reviewed almost all points and added several more. I didn't keep track of the changes so I can't mark in red what is new.

serengeor · Post by **serengeor** » Sat Apr 14, 2012 12:06 pm

Typo in the main post "3) don't abuse iniling:"

REDDemon · Post by **REDDemon** » Sun Apr 15, 2012 9:03 am

ah ok Thx

fixed

Valhalas · Post by **Valhalas** » Tue Apr 17, 2012 4:43 pm

15) Use "memcpy" instead of copying arrays with a "for" loop. memcpy is optimized on most compilers and is also faster with aligned memory. Don't believe people who have a faster memcpy hand-coded in C. I tried several solutions and all were slower than the original memcpy. Code faster than memcpy is possible only on certain platforms with old compilers.

I would like to add that memcpy should be used with caution: with small arrays the performance gain is really insignificant and you may be better off just using a good old for loop. If one is not sure what he's doing, memcpy can cause a lot of pain. I can't even count the times I had to track down weird crashes caused by someone's poorly implemented memcpy function...

REDDemon · Post by **REDDemon** » Wed Apr 18, 2012 5:57 pm

Thanks for the suggestion. Also added 1 point.
