If you really made the test case as above, I guess you should update your compiler. The loop should be resolved at compile time. Anyway, portability is near zero, because the asm code depends on the processor and on the specific compiler.
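To illustrate the point about the loop being resolved at compile time: if the result of each call is never used, an optimizing compiler is free to drop the loop, and the timing then measures nothing. A minimal sketch of one way around that, assuming std::sqrt as the function under test. The names BenchResult and bench_sqrt are made up for this example and are not part of Irrlicht.

```cpp
#include <cmath>
#include <ctime>

// Result of one timing run: the accumulated value and the elapsed clock ticks.
struct BenchResult {
    float sum;
    std::clock_t ticks;
};

// Hypothetical helper: calls std::sqrt `iterations` times and accumulates the
// results. Returning the sum makes the loop's work observable to the caller.
BenchResult bench_sqrt(float n, int iterations)
{
    float sum = 0.0f;
    std::clock_t start = std::clock();
    for (int i = 0; i < iterations; ++i)
        sum += std::sqrt(n + static_cast<float>(i));  // input changes every iteration
    std::clock_t end = std::clock();
    return BenchResult{sum, end - start};
}
```

The caller should still do something with the returned sum (print it, for instance) and read n at run time, otherwise the compiler may again be able to fold the whole thing away.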
#include <cstdio>   // scanf
#include <ctime>    // clock
#include <irrlicht.h>

// nlog / nlendl are not defined in this snippet (presumably the poster's own logging helpers).

float InvSqrt(float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x; // get bits for floating value (assumes 32-bit int and IEEE 754 float)
    i = 0x5f375a86- (i>>1); // gives initial guess y0
    x = *(float*)&i; // convert bits back to float
    x = x*(1.5f-xhalf*x*x); // Newton step, repeating increases accuracy
    return x;
}

float Sqrt(float x)
{
    return 1.f/InvSqrt(x);
}

int main(void)
{
    irr::f32 n = 0.0f;
    scanf("%f", &n); // read the input at run time so it is not a compile-time constant

    // Time Irrlicht's square root. clock() returns ticks, not milliseconds.
    irr::u32 start = clock();
    for(int i=0;i<10000000;i++)
        irr::core::squareroot(n);
    irr::u32 end = clock();
    irr::u32 irrT = end - start;
    nlog<<"IRRLICHT' SQUARE ROOT TIME: "<<irrT<<nlendl;
    nlog<<"IRRLICHT' SQUARE ROOT RES: "<<irr::core::squareroot(n)<<nlendl;

    // Time the Quake 3 style square root.
    start = clock();
    for(int i=0;i<10000000;i++)
        Sqrt(n);
    end = clock();
    irrT = end - start;
    nlog<<"QUAKE3' SQUARE ROOT TIME: "<<irrT<<nlendl;
    nlog<<"QUAKE3' SQUARE ROOT RES: "<<Sqrt(n)<<nlendl;
    return 0;
}
Last edited by sudi on Sat Sep 04, 2010 9:58 pm, edited 1 time in total.
The trick mostly stems from back in the day when vector calculations were done on the CPU instead of on the GPU via a shader. Also, like Hybrid said, it makes a lot of assumptions about the hardware.
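Some of those hardware assumptions can at least be written down so the build fails where they do not hold. A rough sketch, not a drop-in replacement: it uses the same algorithm and magic constant as the InvSqrt posted above, but copies the bits with std::memcpy instead of casting pointers, and checks the float format at compile time. The name InvSqrtPortable is made up for this example.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// The bit trick only makes sense for 32-bit IEEE 754 floats, so refuse to
// compile on platforms where that does not hold.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE 754");

// Same steps as the original InvSqrt; only the pointer casts are replaced
// by std::memcpy, which is well-defined C++.
float InvSqrtPortable(float x)
{
    const float xhalf = 0.5f * x;
    std::uint32_t i;
    std::memcpy(&i, &x, sizeof i);      // copy the float's bits into an integer
    i = 0x5f375a86u - (i >> 1);         // initial guess y0
    std::memcpy(&x, &i, sizeof x);      // copy the bits back into a float
    return x * (1.5f - xhalf * x * x);  // one Newton step
}
```

This only removes the dependence on pointer-cast behaviour. Whether it is actually faster than the hardware square root still depends on the processor and the compiler, as said above.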