[fixed] vector2d.getAngle() = NaN for small FP components

Rocko Bonaparte · Post by **Rocko Bonaparte** » Sun May 22, 2011 3:44 pm

I encountered this today when trying to work with a camera that sometimes had a near-zero component in a rotation vector. The camera's geometry would go completely ga-ga and turn into NaNs at that point. My code is based off the RTSCamera, which wasn't modified for the problem that happens here. But anyways the real problem I think is in getAngle for vector2d instantiations that use f32 or f64. The zero checks are coded to hard zeroes AFAIK:

Code:

	inline f64 getAngle() const
	{
		if (Y == 0) // corrected thanks to a suggestion by Jox
			return X < 0 ? 180 : 0;
		else if (X == 0)
			return Y < 0 ? 90 : 270;

		// don't use getLength here to avoid precision loss with s32 vectors
		f64 tmp = Y / sqrt((f64)(X*X + Y*Y));
		tmp = atan( core::squareroot(1 - tmp*tmp) / tmp) * RADTODEG64;

		if (X>0 && Y>0)
			return tmp + 270;
		else
		if (X>0 && Y<0)
			return tmp + 90;
		else
		if (X<0 && Y<0)
			return 90 - tmp;
		else
		if (X<0 && Y>0)
			return 270 - tmp;

		return tmp;
	}

What I just saw was X == really, really, really small (-1.53080559e-16) while Y was much larger (2.49999523). I expect I can work around this in some fashion, whether coding my own version for f64 vectors or changing the situation that was causing it. For the vector2d template though, I don't have any good recommendations since it's a template class dealing with both fixed and floating point types.
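For the floating-point instantiations, one possible workaround is to let `atan2` sort out the quadrants, so nothing depends on an exact-zero comparison. The following is a minimal sketch, assuming the angle convention visible in the function above (+X gives 0, -Y gives 90, -X gives 180, +Y gives 270). `angleDegrees` is a hypothetical standalone helper, not part of Irrlicht, and it does nothing for the integer instantiations.

```cpp
#include <cmath>

// Hypothetical helper, not Irrlicht API: angle in degrees in [0, 360),
// using the same convention as the getAngle() quoted above
// (+X -> 0, -Y -> 90, -X -> 180, +Y -> 270).
inline double angleDegrees(double x, double y)
{
	const double radToDeg = 180.0 / 3.14159265358979323846;

	// std::atan2 is defined for every finite (x, y), including zeros,
	// so no division or square root can produce a NaN here.
	double deg = std::atan2(-y, x) * radToDeg;

	if (deg < 0.0)
		deg += 360.0;  // map the negative half-turn into (180, 360)
	if (deg >= 360.0)
		deg = 0.0;     // guard against rounding up to exactly 360
	return deg;
}
```

Called with the components from this post, `angleDegrees(-1.53080559e-16, 2.49999523)` should come out at roughly 270 rather than NaN.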

hybrid · Post by **hybrid** » Sun May 22, 2011 4:13 pm

These problems should all be fixed already. Please provide test numbers (vectors which return incorrect values) and make sure it's not something like the inherent gimbal lock.

CuteAlien · Post by **CuteAlien** » Sun May 22, 2011 5:56 pm

I remember I added a patch for that a few weeks ago in svn.

And it had turned out to be a problem which I could not reproduce without a larger context. I had stepped through identical numbers there in different contexts and even with identical register values and asm code I got different results depending on where it was called. There was obviously some rounding going on, but depending on context it seemed to happen in different calculations. I even compiled it a few times with different floating point compile settings, but that had not changed anything. So... I also can't deliver a test - but it's fixed in svn now.

Rocko Bonaparte · Post by **Rocko Bonaparte** » Sun May 22, 2011 6:37 pm

The numbers I was referencing in the post were

(-1.53080559e-16, 2.49999523)

I think technically the numbers were supposed to be (0, 2.5) so I might entertain myself wondering how they got so slightly frazzled. I haven't messed with the subversion trunk so I will try that then.

Something I should have said is that this was with the 1.7.2 release so I haven't tried any updates. In this case I wouldn't want to switch completely over to an SVN revision, but I assume it'll be easy enough to extract the getAngle code and write my own helper to call and see if it clears up the mess.

Rocko Bonaparte · Post by **Rocko Bonaparte** » Sun May 22, 2011 6:57 pm

I pulled out getAngle from the subversion trunk:

https://irrlicht.svn.sourceforge.net/sv ... vector2d.h

Didn't do a diff but just tried it, and that had the problem too. I got the hex representations of the floats I'm using. I'm running on 64-bit x86; I forget exactly which floating point specification that follows for doubles. These are the vector components I'm using in the calculation just preceding the problem:
X=0x0
Y=0x2

What is happening is that I have a camera pointing down on my game world, and the problem happens when I tell the camera to initially move north. It looks like this screws with the Y component just a smidgeon and all hell breaks loose. The hex representation is different:
X=0x0
Y=0xfffffffe -- don't have the spec on what is what here so I can't hand scribble this out, but Eclipse claims it's roughly 2.49999523.

getAngle() spits out a NaN on that one.

CuteAlien · Post by **CuteAlien** » Sun May 22, 2011 8:35 pm

So far I have no success reproducing it. Tested with:

Code:

#include <irrlicht.h>
#include <iostream>

using namespace irr;
using namespace core;

int main()
{
	video::E_DRIVER_TYPE driverType = video::EDT_OPENGL;
	IrrlichtDevice * device = createDevice(driverType, core::dimension2d<u32>(640, 480));
	if (device == 0)
		return 1; // could not create selected driver.

	vector2d<f32> v1;
	vector2d<f64> v2;
	v1 = vector2d<f32>(-1.53080559e-16, 2.49999523);
	std::cout << v1.getAngle() << std::endl;
	
	v1 = vector2d<f32>(0x0, 0x2);
	std::cout << v1.getAngle() << std::endl;
	
	v1 = vector2d<f32>(0x0, 0xfffffffe);
	std::cout << v1.getAngle() << std::endl;
	
	v1 = vector2d<f32>(0x0, 2.49999523);
	std::cout << v1.getAngle() << std::endl;
	
	v2 = vector2d<f64>(-1.53080559e-16, 2.49999523);
	std::cout << v2.getAngle() << std::endl;
	
	v2 = vector2d<f64>(0x0, 0x2);
	std::cout << v2.getAngle() << std::endl;
	
	v2 = vector2d<f64>(0x0, 0xfffffffe);
	std::cout << v2.getAngle() << std::endl;
	
	v2 = vector2d<f64>(0x0, 2.49999523);
	std::cout << v2.getAngle() << std::endl;
	
	device->drop();

	return 0;
}

I tried it only on Linux-32bit so far in case that makes a difference.

Can you try to find out where exactly you get the NaN by stepping through the function in a debugger? Previously the problem had been that tmp could be larger than 1, which then led to squareroot receiving a negative number, but that is checked in newer headers.

hybrid · Post by **hybrid** » Sun May 22, 2011 8:41 pm

I've also just tested it here locally, and the function returns 270 for me. Added to regression test, but don't see how I can fix this.

Rocko Bonaparte · Post by **Rocko Bonaparte** » Mon May 23, 2011 2:43 am

I am not convinced stuffing the hex values straight in will work as expected. I think the compiler translates that number as an integer value represented by the hex string, and then creates an appropriate float representation. So more specifically:

f64 foo = 0x11 will set foo to 17.0 (the integer value of hex 11), not to the double whose IEEE bit pattern is 0x11.

At least that's what I think happened when I tried to do it.
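A quick way to check what a hex literal turns into is to return it from a function with a floating-point return type. This is a small sketch with hypothetical helper names. It assumes a 32-bit `unsigned int` and the usual round-to-nearest conversion, so the exact float in the second case may differ on other platforms.

```cpp
// Hex literals are integer literals. Initialising a floating-point value
// from one converts the integer VALUE; it does not copy the bit pattern.

// Hypothetical helper: 0x11 is the integer 17, so this yields 17.0.
inline double literalAsDouble()
{
	return static_cast<double>(0x11);
}

// Hypothetical helper: 0xfffffffe is the integer 4294967294, which is
// converted to a nearby float of about 4.29e9, nowhere near 2.49999523.
inline float literalAsFloat()
{
	return static_cast<float>(0xfffffffe);
}
```

So a test vector built as `vector2d<f32>(0x0, 0xfffffffe)` would be roughly (0, 4.29e9) rather than the vector seen in the debugger, which may be why the earlier test could not reproduce the NaN.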

Here I took the function out of the trunk and made it into a helper, so we could both be sure to be on the trunk. I could reproduce with this:

Code:

f64 GetAngleIrrlichtSVN(vector2df const &v)
{
	// v.X = 0x0, v.Y = 0xfffffffe
	if (v.Y == 0) // corrected thanks to a suggestion by Jox
		return v.X < 0 ? 180 : 0;
	else if (v.X == 0)
		return v.Y < 0 ? 90 : 270;

	// don't use getLength here to avoid precision loss with s32 vectors
	f64 tmp = v.Y / sqrt((f64)(v.X*v.X + v.Y*v.Y));    // = -1.0
	if ( tmp > 1.0 ) //   avoid floating-point trouble as sqrt(y*y) is occasionally larger y
		tmp = 1.0;
	tmp = atan( core::squareroot(1 - tmp*tmp) / tmp) * RADTODEG64;
	// tmp = -nan a.k.a 0x8000000000000000

....
}

Looks like the core::squareroot(1 - tmp*tmp) in particular is causing the ruckus. Digging in, that looks like one's local, friendly math.h sqrt(). sqrt(0) should succeed so I am confused.

I actually tried to isolate that code and it appears to work. From what I can see in my own debugger, the hex representation of the floating-point numbers is the same.

On a whim, I added a -1.0 clamp and it appeared to work!

Code:

	if (v.Y == 0) // corrected thanks to a suggestion by Jox
		return v.X < 0 ? 180 : 0;
	else if (v.X == 0)
		return v.Y < 0 ? 90 : 270;

	// don't use getLength here to avoid precision loss with s32 vectors
	f64 tmp = v.Y / sqrt((f64)(v.X*v.X + v.Y*v.Y));    // = -1.0
	if ( tmp > 1.0 ) //   avoid floating-point trouble as sqrt(y*y) is occasionally larger y
		tmp = 1.0;
	if(tmp < -1.0)	
		tmp = -1.0;
	tmp = atan( core::squareroot(1 - tmp*tmp) / tmp) * RADTODEG64;
...

So I suspect my ability to get an accurate view of the raw floating point numbers was getting hampered too. Does that negative clamp make mathematical sense?

hybrid · Post by **hybrid** » Mon May 23, 2011 7:26 am

Yes, I think this clamp makes perfect sense. The problem occurs in both directions, because Y is an arbitrary coordinate value and can be positive or negative. I'll add this to trunk then.
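To spell out the reasoning: mathematically Y / sqrt(X*X + Y*Y) always lies in [-1, 1], but rounding can push it a hair outside that range in either direction. Then 1 - tmp*tmp is slightly negative and the square root returns NaN. The sketch below isolates just that term; `unclampedTerm` and `clampedTerm` are hypothetical names used for illustration, not Irrlicht functions.

```cpp
#include <cmath>

// Hypothetical stand-in for the square-root term inside getAngle();
// tmp plays the role of Y / sqrt(X*X + Y*Y).
inline double unclampedTerm(double tmp)
{
	// Becomes NaN as soon as |tmp| exceeds 1, because the argument
	// of the square root turns negative.
	return std::sqrt(1.0 - tmp * tmp);
}

// Same term with the clamp applied in both directions.
inline double clampedTerm(double tmp)
{
	if (tmp > 1.0)
		tmp = 1.0;
	if (tmp < -1.0)
		tmp = -1.0;
	// The argument of the square root is now always in [0, 1].
	return std::sqrt(1.0 - tmp * tmp);
}
```

Feeding in a value one step below -1, for example `std::nextafter(-1.0, -2.0)`, makes the unclamped version return NaN while the clamped version returns 0.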

Rocko Bonaparte · Post by **Rocko Bonaparte** » Mon May 23, 2011 2:21 pm

I am going to sniff around about the floating point representation problem then for the sake of the regression test discussion that came up earlier in the thread. I am figuring there's probably some voodoo one could do setting a long pointer to the hex value and then getting a double pointing at the same address, but I'd think there's something better than that. Of course this will only work on systems with a compatible floating point representation . . .

hybrid · Post by **hybrid** » Mon May 23, 2011 2:46 pm

Well, this should work

Code:

union
{
  u8 iVal[8];
  double dVal;
} v;

v.iVal[...]=...;
func(v.dVal);

Or use long long instead. But it's also a question of internal double representation, floating point arithmetic settings (also check the FPU precision flag in device creation) etc.
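As an alternative to the union, the bytes can be copied with `memcpy`. Strictly speaking, reading a union member other than the one last written is undefined in C++, although the common compilers support it. A minimal sketch, assuming `double` is a 64-bit IEEE 754 type; `doubleFromBits` and `bitsFromDouble` are hypothetical helper names for a regression test, not Irrlicht functions.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
	"this sketch assumes a 64-bit double");

// Hypothetical helper: build a double from its raw 64-bit pattern.
inline double doubleFromBits(std::uint64_t bits)
{
	double d;
	std::memcpy(&d, &bits, sizeof d);  // copy the bytes, no value conversion
	return d;
}

// Hypothetical helper: dump the raw 64-bit pattern of a double, for example
// to record the exact input that triggered a NaN.
inline std::uint64_t bitsFromDouble(double d)
{
	std::uint64_t bits;
	std::memcpy(&bits, &d, sizeof bits);
	return bits;
}
```

On an IEEE 754 platform `doubleFromBits(0x4004000000000000ULL)` is exactly 2.5, and a value captured with `bitsFromDouble` in the debugger session can be replayed bit for bit in a test.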