If your CPU supports MMX and/or SSE, you can get the full benefit of your hardware by adding a couple of flags to your compiler.
Important Notes :
the following instructions only work with GCC (and its ports). Depending on which version of GCC you are using, some instruction sets may or may not be supported (ssse3, sse4 and sse5 are only available since GCC 4.3.0)
make sure your CPU supports these instruction sets (refer to online docs or, under Linux, read the output of cat /proc/cpuinfo)
binaries compiled with these instruction sets will CRASH (typically with an "illegal instruction" error) if you attempt to run them on a CPU that does not support them
This should be the LAST optimization to turn on. Always start by optimizing C++ code with your brains and a profiler BEFORE playing with compiler optimizations.
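For the /proc/cpuinfo check mentioned in the notes above, here is a rough sketch of how the flags line could be read from C++ instead of by eye. The function name parse_cpu_flags is made up for this post, and the code assumes the Linux cpuinfo layout (a line starting with "flags", then a colon, then space-separated tokens). Keep in mind that cpuinfo lists SSE3 as "pni", and SSE4 as "sse4_1" / "sse4_2".

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not part of any library): collect the tokens of the
// first "flags" line found in cpuinfo-style text.
std::set<std::string> parse_cpu_flags(std::istream& in) {
    std::set<std::string> flags;
    std::string line;
    while (std::getline(in, line)) {
        // Skip every line that does not start with "flags".
        if (line.compare(0, 5, "flags") != 0) continue;
        std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;
        // Everything after the colon is a space-separated list of flags.
        std::istringstream tokens(line.substr(colon + 1));
        std::string flag;
        while (tokens >> flag) flags.insert(flag);
        break;  // assumption: all cores report the same flags
    }
    return flags;
}
```

To use it on a Linux box, open /proc/cpuinfo with a std::ifstream, pass the stream to parse_cpu_flags, and test the result with something like flags.count("ssse3") before deciding which -m options are safe for the machines you target.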
The trick : add some of the following options to your CFLAGS / CXXFLAGS (depends on your build system) :
-mmmx : enables use of MMX instructions
-msse : enables use of SSE instructions
-msse2 : enables use of SSE2 instructions
-msse3 : enables use of SSE3 instructions
-mssse3 : enables use of SSSE3 instructions
-msse4 : enables use of SSE4 instructions
-msse5 : enables use of SSE5 instructions
-mfpmath=sse : use SSE registers and instructions for floating-point math instead of 'normal' floating point (also known as x87); much better than any fast-math switch in terms of both speed and precision
It is highly recommended to use the same value for all files across the whole project (and external libs as well if possible, though not required).
-march=? : asks the compiler to take advantage of a given CPU architecture; refer to the GCC manual for valid values
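If you want to double-check that the options above really reached the compiler, one possible approach is to look at the macros GCC predefines for each instruction set (__MMX__, __SSE__, __SSE2__ and so on, plus __SSE_MATH__ when SSE floating-point math is in effect). The sketch below is only an illustration; the function name enabled_instruction_sets is invented for this post, and the result describes the file it is compiled in, not the CPU it runs on.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: list the instruction sets this translation unit was
// compiled for, based on GCC's predefined macros. A -march value that
// implies an instruction set defines the same macros as the explicit flag.
std::vector<std::string> enabled_instruction_sets() {
    std::vector<std::string> sets;
#ifdef __MMX__
    sets.push_back("MMX");       // -mmmx
#endif
#ifdef __SSE__
    sets.push_back("SSE");       // -msse
#endif
#ifdef __SSE2__
    sets.push_back("SSE2");      // -msse2
#endif
#ifdef __SSE3__
    sets.push_back("SSE3");      // -msse3
#endif
#ifdef __SSSE3__
    sets.push_back("SSSE3");     // -mssse3
#endif
#ifdef __SSE4_1__
    sets.push_back("SSE4.1");    // part of -msse4
#endif
#ifdef __SSE4_2__
    sets.push_back("SSE4.2");    // part of -msse4
#endif
#ifdef __SSE_MATH__
    sets.push_back("fpmath=sse"); // SSE is used for float math
#endif
    return sets;
}
```

Printing the returned names once at startup of a debug build is a cheap way to notice that one library or subdirectory was built without the flags the rest of the project uses.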
Another important trick : do NOT use -O3 if you are using GCC 4.x. Even the manual confirms that the speed gain is negligible in most cases, while the size increase and the instability induced might not be...
A few examples :
Irrlicht, src/Irrlicht/Makefile :
ifndef NDEBUG
CXXFLAGS += -g -D_DEBUG
else
CXXFLAGS += -fexpensive-optimizations -O2 -march=prescott -mmmx -msse -msse2 -msse3 -mfpmath=sse
endif
ifdef PROFILE
CXXFLAGS += -pg
endif
CFLAGS := -fexpensive-optimizations -O2 -DPNG_THREAD_UNSAFE_OK -DPNG_NO_MNG_FEATURES -march=prescott -mmmx -msse -msse2 -msse3 -mfpmath=sse
A second example, from a Jam-style build file (note the trailing " ;") :
COMPILER.CFLAGS.optimize += -O2 -fomit-frame-pointer -march=prescott -mmmx -msse -msse2 -msse3 -mfpmath=sse ;