Here are some general rules for optimizing your code (I'm not saying that these rules are always valid).
0) Want to use/try C++11 features? See this:
http://gameprog.it/articles/90/c-11-get ... c4KNKwlGC8
At the moment GCC 4.8.x is the only compiler where I'd count on full support. Clang has full support on paper, but I still see snippets from people that do not compile even though the code is correct (a few template cases). Visual Studio is a big question mark.
1) Switch ON your compiler optimizations. Taking Code::Blocks as an example, you should enable O1 or O2 even in debug mode.
Debugging with compiler optimizations enabled is more helpful for finding bottlenecks (did you know that Irrlicht's image loaders are twice as fast with O2 or Oexpensive as they are without any optimization?). In my project (FreeImageMixer) I recompiled Irrlicht with only the image loaders and with expensive optimizations enabled. Images now load very fast.
2) Be const correct:
http://irrlicht.sourceforge.net/phpBB2/ ... hp?t=37573
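A minimal sketch of what const correctness looks like in practice. The Inventory class and the isEmpty function are made up for illustration; they are not taken from the linked thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class, used only to illustrate const correctness.
class Inventory
{
public:
    // Non-const member function: it modifies the object.
    void add(int itemId) { items_.push_back(itemId); }

    // Const member function: promises not to modify the object,
    // so it can be called through a const reference or pointer.
    std::size_t count() const { return items_.size(); }

private:
    std::vector<int> items_;
};

// Taking a const reference avoids a copy and documents that this
// function only reads the inventory.
inline bool isEmpty(const Inventory& inv)
{
    return inv.count() == 0;
}
```

If count() were not marked const, the call inside isEmpty would not compile, which is exactly the kind of mistake you want the compiler to catch for you.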
3) Don't abuse inlining:
There are only a few cases where inlining is useful, plus a few things to keep in mind:
-The assembly of the inlined function is shorter than the code needed to call that function (and I think some compilers do that automatically).
-You are calling a small function a few times from inside a loop. (If you start calling an inlined function from everywhere you increase execution time, because the duplicated code eats into your limited cache memory.) But you never know where you are going to use certain functions.
-Sometimes you cannot inline functions when you are using DLLs/libraries; compilers usually give a warning.
-Remember that if you write the body of a member function inside the class definition in the header, the function is considered inline (that's one reason why it is preferable to keep things separated in a header and a .cpp file). Whether it really gets inlined depends on the compiler.
-You are using a function just to call another function.
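A small sketch of the cases from the list above. Vec2 and the file names in the comments are hypothetical, just to show where a function body ends up.

```cpp
// Hypothetical header content (think "vec2.h"); all names are made up.
struct Vec2
{
    float x, y;

    // Body written inside the class definition: implicitly inline.
    float dot(const Vec2& o) const { return x * o.x + y * o.y; }

    // Only declared here; the body normally lives in the .cpp file,
    // so the compiler decides about inlining on its own.
    float lengthSquared() const;
};

// A free function defined in a header must be marked inline, otherwise
// including the header from two .cpp files gives duplicate definitions.
// It only forwards to another function, a typical inlining candidate.
inline float dot(const Vec2& a, const Vec2& b) { return a.dot(b); }

// In a real project this definition would sit in "vec2.cpp".
float Vec2::lengthSquared() const { return dot(*this); }
```

Note that inline is only a hint for the optimizer; what it really changes is that the definition is allowed to appear in several translation units.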
4) Don't abuse namespaces. There are lots of libraries that have classes or methods with the same name that are just put in different namespaces. "using namespace" is useful for testing some code quickly, but when you write libraries or parts of your game it is often a bad idea; try to avoid abusing it.
For the same reason, put your own work in its own namespace if you are thinking of reusing it.
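To make the name-clash problem concrete, here is a minimal sketch. The two libraries (physicslib, audiolib) and the mygame namespace are invented for the example.

```cpp
#include <string>

// Two hypothetical libraries that both define a class called Timer.
namespace physicslib
{
    struct Timer { std::string name() const { return "physics"; } };
}
namespace audiolib
{
    struct Timer { std::string name() const { return "audio"; } };
}

// Your own reusable code goes in its own namespace too.
namespace mygame
{
    // Fully qualified names make it obvious which Timer is meant.
    // With "using namespace physicslib; using namespace audiolib;"
    // a plain "Timer" here would be ambiguous and fail to compile.
    inline std::string describeTimers()
    {
        physicslib::Timer p;
        audiolib::Timer a;
        return p.name() + "+" + a.name();
    }
}

// A using-declaration pulls in exactly one name, which is usually
// safer than opening a whole namespace.
using mygame::describeTimers;
```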
5) In a header, if you only use a pointer or a reference to a class, you don't need to include the header that defines that class; a forward declaration is enough.
Example. Instead of this header:
#include <ISceneNode.h> // pulls in the full class definition just for a pointer
class MyLibUsingSceneNodes
{
public:
    void passSceneNodeAsPointer(irr::scene::ISceneNode*);
};
you can write the header with a forward declaration and no include:
namespace irr
{
namespace scene
{
    class ISceneNode; // forward declaration: just tells the compiler that later we will have a pointer/reference to ISceneNode
}
}
class MyLibUsingSceneNodes
{
public:
    void passSceneNodeAsPointer(irr::scene::ISceneNode* = 0);
};
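and include the real header only in the source file:
#include "myclass.h"
#include <ISceneNode.h> // the full definition is needed only here, where the node is actually used
void MyLibUsingSceneNodes::passSceneNodeAsPointer(irr::scene::ISceneNode* node)
{
    if (node) // the parameter defaults to 0, so check it first
        node->setVisible(true); // placeholder for whatever you do with the node
}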
If your headers need to include lots of other headers, you are probably doing bad coding. Having lots of stuff included in headers instead of source files increases compile time both for you and for your library's users. Not only that: having most of the code in headers (unless you are writing templates) makes the compiler inline a lot of stuff it shouldn't, resulting in a bigger final executable.
6) Profiling your code helps you find where the execution time is spent. If you find that one function takes 80% of the total run time, you should look at whether you can improve it. With some compilers you can't profile while you are debugging. There are lots of ways to profile your code.
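The crudest way to profile is to time a piece of code by hand. Below is a minimal sketch using std::chrono (so it assumes a C++11 compiler); timeMs and sumTo are hypothetical helpers, not part of any library. A real profiler gives you a per-function breakdown and is usually the better tool.

```cpp
#include <chrono>

// Hypothetical helper: runs a callable once and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double timeMs(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Hypothetical workload to measure: sum of 1..n with a plain loop.
inline long long sumTo(long long n)
{
    long long sum = 0;
    for (long long i = 1; i <= n; ++i)
        sum += i;
    return sum;
}
```

You would call it like timeMs([&] { result = sumTo(1000000); }) and compare the numbers between builds. Run the measurement several times, because a single run is very noisy.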
8) If possible don't use multithreading. It's an advanced feature and it causes problems. Multithreading usually adds some overhead, so a program that runs multithreaded on 2 CPUs of 2 GHz each would run faster without MT on 1 CPU of 4 GHz (of course certain things are possible only with multithreading, so sometimes you must use MT). What you have to know is how MT works at the hardware level. Data is copied from RAM to cache and registers, so a CPU can be working on its own copy of some data and, without synchronization, it is not guaranteed to see changes made by another CPU. So in general it is better to avoid reading/writing the same data from multiple threads. And if you do need to access data from multiple threads, you need some synchronization primitive (lock, RW lock, semaphore, etc.). Every thread has its own stack.
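If you really need shared data, this is roughly what the lock version looks like. It is a sketch assuming C++11's std::thread and std::mutex; SharedCounter and countInParallel are names invented for the example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared counter protected by a lock.
class SharedCounter
{
public:
    void increment()
    {
        std::lock_guard<std::mutex> lock(mutex_); // unlocked automatically at scope exit
        ++value_;
    }

    int value() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_; // mutable so that value() can stay const
    int value_ = 0;
};

// Starts `threads` workers that each increment the same counter
// `perThread` times, waits for all of them and returns the total.
inline int countInParallel(int threads, int perThread)
{
    SharedCounter counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
        workers.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i)
                counter.increment();
        });
    for (auto& w : workers)
        w.join();
    return counter.value();
}
```

Without the lock the increments from different threads can be lost. With the lock the result is correct, but every increment pays for the synchronization, which is the overhead mentioned above.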
9) Reducing code size is always good. The compiler does that for you, but you also want to do it yourself. Commenting code is good. If you optimize a function you must keep the original somewhere (for example, leave it commented out near the function): optimized code is harder to understand and 90% of the time you don't need to optimize it. Cleaning up and keeping code tidy is good, but don't spend 90% of your time on that or you will stop developing.
10) Keep a good style and stick to it. If you change style all the time you have to spend more time understanding what you wrote. There are good code formatters (Code::Blocks has one built in with several possible styles. I found that AStyle is good, but I usually add some tabs to preprocessor directives). Irrlicht's style is good because it comes from the Irrlicht developers, who have years of experience.
11) Templates are a double-edged sword too. They are flexible but increase compiled code size. Just take a look at "irrMath.h" or at the List class to get an idea. Templated code grows over time. You first write some test code, then you improve it. You refactor the code several times, you add new features. When you find yourself using the same piece of code again and again, maybe it's possible to turn it into a template (and sometimes that is highly advisable).
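As a sketch of how that usually goes: you start with the same function written twice, then you merge the copies into a template. All three function names below are hypothetical.

```cpp
// Before: the same logic copy-pasted for two types.
inline int clampInt(int v, int lo, int hi)
{
    return v < lo ? lo : (hi < v ? hi : v);
}

inline float clampFloat(float v, float lo, float hi)
{
    return v < lo ? lo : (hi < v ? hi : v);
}

// After: one template that works for any type with operator<.
// The compiler still generates one copy per type you actually use,
// which is where the extra code size comes from.
template <typename T>
inline T clampValue(const T& v, const T& lo, const T& hi)
{
    return v < lo ? lo : (hi < v ? hi : v);
}
```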
12) In C++ a lot of the work is done by external multimedia libraries. You gain the most performance when you REALLY KNOW THE LIBRARIES YOU ARE USING. Using a good library in the right way is the best way to make a fast and stable program. Using too many libraries is not very good (not for performance, but because you are adding dependencies which can introduce new bugs and which need specific builds for each platform).
13) A game essentially needs to process some data. If the data is not processed the game can't be played. That's why over-abstracting is not always good (it can simplify certain problems, but you take a performance hit). A good design helps a lot here (there are lots of websites and books about that topic).
17) Try to avoid debug-specific code. You probably don't want to debug the debugging code.
18) Use "assert" (#include <cassert>). You need to break your program as soon as possible when an error occurs. This lets you fix bugs early and makes certain bugs almost impossible to ignore.
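For example, something like this (the average function is hypothetical, it only shows where the asserts go):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: average of an array of floats.
inline float average(const float* values, std::size_t count)
{
    // Check the preconditions first, so a bad call breaks right here
    // instead of producing garbage somewhere else later.
    assert(values != 0 && "average: null array");
    assert(count > 0 && "average: empty array");

    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
        sum += values[i];
    return sum / static_cast<float>(count);
}
```

Keep in mind that assert is compiled out when NDEBUG is defined (typical for release builds), so never put code with side effects inside it.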
19) Always enable all compiler warnings. You won't believe how many bugs that will catch.
20) Read the C++ FAQs.
21) "So you want to write a game" (I don't know why I didn't link that before! You must read it)
http://irrlicht.sourceforge.net/forum/v ... =5&t=43770
AND IF YOU ARE INTERESTED IN INSANE OPTIMIZATIONS (so.. if you have time to lose):
7) When you release your project you should try different compiler optimization configurations (stripping symbols from the binary is a must-do). Sometimes a particular setting is faster than another one. Unluckily, with most IDEs you can change those optimizations only per project (in C::B you can do it per file, but that changes your file to read-only). In Visual Studio it is easier.
Optimization levels like O1, O2, O3 are mutually exclusive. In general you should prefer those levels for code which has many loops (for, while, etc.). In other code you should prefer Osize (-Os in GCC), which reduces compiled size and improves caching (and so indirectly increases performance).
The ideal solution would be to use a different optimization level for each source file, but the performance gain is unpredictable without profiling, and there are millions (or more) of possible compiler settings. So you should think about that only for bottleneck code (there is a lot of research about that). On more modern machines Osize hurts performance, because they have very big caches and so the cache improvement matters less than it does on other platforms like consoles. The most common optimization is O2, which also reduces the size (not as much as Osize). A smaller binary is also very valuable if your executable/DLL will be downloaded from your server. You should also think about using LZMA compressed packages. There are also people who study which file formats compress better.
14) Single compilation unit is a technique that produces lightweight and fast code and also cuts compile time several times over.
http://en.wikipedia.org/wiki/Single_Compilation_Unit
BE AWARE: a small change to your code requires recompiling the whole unit. On certain compilers using a single compilation unit gives almost no improvement, because those compilers already do that kind of optimization across files. And that optimization is the whole point: a single compilation unit optimizes inter-dependent code by allowing better automatic inlining of functions and better register usage.
As an example: in my engine there are several subsystems. Each subsystem has few dependencies on the other subsystems, so making a single compilation unit for the whole engine does not give all the benefits. What I did was make one single compilation unit for every subsystem of my engine. Then I tried to find the best compiler options for each subsystem. Doing that I got a 30% smaller executable and 20% more speed (of course only the CPU-side code got faster, not the draw calls). Since there are just a few subsystems, that was not hard.
15) Use "memcpy" instead of copying arrays with a "for" loop. memcpy is optimized on most compilers and is even faster with aligned memory. Don't believe people who have a faster memcpy hand-coded in C. I tried several of those and all of them were slower than the original memcpy. Code faster than memcpy is possible only on certain platforms with old compilers.
Reference for the memcpy function:
http://www.cplusplus.com/reference/clib ... ng/memcpy/
Note that the behaviour is undefined if the destination overlaps the source bytes (it can crash, it can copy correctly, but it can also corrupt data). For small arrays a loop is probably better, but you never know unless you look at the disassembly and profile your app. You may also find that memcpy is faster for small arrays too.
memcpy is faster. If you profiled and found it slower, that is probably just because the data of your for-loop was already cached. When your data gets moved between RAM and cache you have some overhead (which you should include in the profile results.. don't worry, bad profiling is pretty common. A nanosecond timer is not all you need). But a good memcpy implementation can also copy big blocks of data which are not cached using cache-bypassing stores, and that's way better than a plain loop.
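A minimal sketch of both situations: a plain copy between separate arrays, and an overlapping copy where the standard library's memmove has to be used instead of memcpy. The helper names copyInts and shiftRight are made up for the example.

```cpp
#include <cstddef>
#include <cstring>

// Copies `count` ints between two arrays that do NOT overlap.
inline void copyInts(int* dst, const int* src, std::size_t count)
{
    std::memcpy(dst, src, count * sizeof(int)); // the size is in bytes, not elements
}

// Moves the contents of one array `shift` elements to the right.
// Source and destination overlap here, so memcpy would be undefined
// behaviour; memmove is the standard function that handles overlap.
inline void shiftRight(int* data, std::size_t count, std::size_t shift)
{
    if (shift >= count)
        return; // nothing would remain inside the array
    std::memmove(data + shift, data, (count - shift) * sizeof(int));
}
```

Both functions copy raw bytes, so use them only on plain data (ints, floats, simple structs), never on objects with constructors or virtual functions.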
16) Memory alignment increases performance in just a few places, but it lets you use SIMD techniques. Memory alignment also increases memcpy speed.
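A sketch of how you can ask for alignment in C++11 with alignas. Vec4 and isAligned are hypothetical names; 16 bytes is used because that is what 128-bit SIMD loads (SSE, for example) expect.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 4-float vector forced to a 16-byte boundary.
// alignas needs a C++11 compiler.
struct alignas(16) Vec4
{
    float x, y, z, w;
};

// Returns true if the pointer is a multiple of `alignment` bytes.
inline bool isAligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A check like isAligned(ptr, 16) is handy inside an assert before you hand a buffer to SIMD code. Be careful with heap memory: plain new/malloc do not always honour over-aligned types on older compilers.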
22) Learn the basics of your IDE: you should always know how to change compiler settings, how to debug, how to profile, and how to look at the disassembly. Shorter assembly code does not mean faster code: certain instructions take more time and can stall your CPU's pipeline.
Feel free to contribute