Patch to improve irrXML's fast_atof

Discuss about anything related to the Irrlicht Engine, or read announcements about any significant features or usage changes.
Post Reply
spotrh
Posts: 5
Joined: Thu Feb 03, 2011 9:08 pm

Patch to improve irrXML's fast_atof

Post by spotrh »

Hi! I'm the Fedora maintainer for the Irrlicht Engine (and IrrXML). Recently, another Fedora contributor wanted to package up and include the Open Asset Import Library, aka assimp (http://assimp.sourceforge.net/). During the review process, it was discovered that assimp was using a modified copy of IrrXML, which is almost never permitted in Fedora. Further investigation revealed that the only changes were to improve fast_atof.

I know that there is a comment in the IrrXML code that says:

Code: Select all

// Please run this regression test when making any modifications to this function:
// https://sourceforge.net/tracker/download.php?group_id=74339&atid=540676&file_id=298968&aid=1865300
However, that code doesn't run on Linux, only Windows. So I fixed it to run on Linux. :) A copy of the updated regression test code with Linux support can be found here:

https://bugzilla.redhat.com/attachment.cgi?id=461140

Once I'd fixed up the test code (and built/installed a local 1.7.2 package), I
compiled it like this:

gcc -m64 -I/usr/include/irrlicht fast_atof.cpp -o fast_atof-172 -lIrrlicht
-lstdc++ -lm

Without any changes to IrrXML, this is a sample result on Fedora 14 x86_64:

Code: Select all

 String '340282346638528859811704183484516925440.000000'
 New fast 340282346638528859811704183484516925440.0000000000000000000000000000000000000000
 Old fast 0.0000000000000000000000000000000000000000
     atof 340282346638528859811704183484516925440.0000000000000000000000000000000000000000

 String '3.402823466e+38F'
 New fast 340282326356119256160033759537265639424.0000000000000000000000000000000000000000
 Old fast 3.4028234481811523437500000000000000000000
     atof 340282346638528859811704183484516925440.0000000000000000000000000000000000000000

 String '3402823466e+29F'
 New fast 340282346638528859811704183484516925440.0000000000000000000000000000000000000000
 Old fast 3402823424.0000000000000000000000000000000000000000
     atof 340282346638528859811704183484516925440.0000000000000000000000000000000000000000

 String '-340282346638528859811704183484516925440.000000'
 New fast -340282346638528859811704183484516925440.0000000000000000000000000000000000000000
 Old fast -0.0000000000000000000000000000000000000000
     atof -340282346638528859811704183484516925440.0000000000000000000000000000000000000000

 String '-3.402823466e+38F'
 New fast -340282326356119256160033759537265639424.0000000000000000000000000000000000000000
 Old fast -3.4028234481811523437500000000000000000000
     atof -340282346638528859811704183484516925440.0000000000000000000000000000000000000000

 String '-3402823466e+29F'
 New fast -340282346638528859811704183484516925440.0000000000000000000000000000000000000000
 Old fast -3402823424.0000000000000000000000000000000000000000
     atof -340282346638528859811704183484516925440.0000000000000000000000000000000000000000

 String '34028234663852885981170418348451692544.000000'
 New fast 34028234663852885981170418348451692544.0000000000000000000000000000000000000000
 Old fast 0.0000000000000000000000000000000000000000
     atof 34028234663852885981170418348451692544.0000000000000000000000000000000000000000

 String '3.402823466e+37F'
 New fast 34028234663852885981170418348451692544.0000000000000000000000000000000000000000
 Old fast 3.4028234481811523437500000000000000000000
     atof 34028234663852885981170418348451692544.0000000000000000000000000000000000000000

 String '3402823466e+28F'
 New fast 34028232128551685524711615355045281792.0000000000000000000000000000000000000000
 Old fast 3402823424.0000000000000000000000000000000000000000
     atof 34028234663852885981170418348451692544.0000000000000000000000000000000000000000

 String '-34028234663852885981170418348451692544.000000'
 New fast -34028234663852885981170418348451692544.0000000000000000000000000000000000000000
 Old fast -0.0000000000000000000000000000000000000000
     atof -34028234663852885981170418348451692544.0000000000000000000000000000000000000000

 String '-3.402823466e+37F'
 New fast -34028234663852885981170418348451692544.0000000000000000000000000000000000000000
 Old fast -3.4028234481811523437500000000000000000000
     atof -34028234663852885981170418348451692544.0000000000000000000000000000000000000000

 String '-3402823466e+28F'
 New fast -34028232128551685524711615355045281792.0000000000000000000000000000000000000000
 Old fast -3402823424.0000000000000000000000000000000000000000
     atof -34028234663852885981170418348451692544.0000000000000000000000000000000000000000

 String '.00234567'
 New fast 0.0023456700146198272705078125000000000000
 Old fast 0.0023456700146198272705078125000000000000
     atof 0.0023456700146198272705078125000000000000

 String '-.00234567'
 New fast -0.0023456700146198272705078125000000000000
 Old fast -0.0023456700146198272705078125000000000000
     atof -0.0023456700146198272705078125000000000000

 String '0.00234567'
 New fast 0.0023456700146198272705078125000000000000
 Old fast 0.0023456700146198272705078125000000000000
     atof 0.0023456700146198272705078125000000000000

 String '-0.00234567'
 New fast -0.0023456700146198272705078125000000000000
 Old fast -0.0023456700146198272705078125000000000000
     atof -0.0023456700146198272705078125000000000000

 String '1.175494351e-38F'
 New fast 0.0000000000000000000000000000000000000118
 Old fast 0.0000000000000000000000000000000000000118
     atof 0.0000000000000000000000000000000000000118

 String '1175494351e-47F'
 New fast 0.0000000000000000000000000000000000000000
 Old fast 1175494400.0000000000000000000000000000000000000000
     atof 0.0000000000000000000000000000000000000118

 String '1.175494351e-37F'
 New fast 0.0000000000000000000000000000000000001175
 Old fast 0.0000000000000000000000000000000000001175
     atof 0.0000000000000000000000000000000000001175

 String '1.175494351e-36F'
 New fast 0.0000000000000000000000000000000000011755
 Old fast 0.0000000000000000000000000000000000011755
     atof 0.0000000000000000000000000000000000011755

 String '-1.175494351e-36F'
 New fast -0.0000000000000000000000000000000000011755
 Old fast -0.0000000000000000000000000000000000011755
     atof -0.0000000000000000000000000000000000011755

 String '123456.789'
 New fast 123456.7890625000000000000000000000000000000000
 Old fast 123456.7890625000000000000000000000000000000000
     atof 123456.7890625000000000000000000000000000000000

 String '-123456.789'
 New fast -123456.7890625000000000000000000000000000000000
 Old fast -123456.7890625000000000000000000000000000000000
     atof -123456.7890625000000000000000000000000000000000

 String '0000123456.789'
 New fast 123456.7890625000000000000000000000000000000000
 Old fast 123456.7890625000000000000000000000000000000000
     atof 123456.7890625000000000000000000000000000000000

 String '-0000123456.789'
 New fast -123456.7890625000000000000000000000000000000000
 Old fast -123456.7890625000000000000000000000000000000000
     atof -123456.7890625000000000000000000000000000000000
         atof time = 401
    fast_atof Time = 543
old fast_atof time = 439
The fast method isn't at least three times as fast as atof()
This means that the test is showing that the atof() in Fedora 14 glibc is faster than the fast_atof function, but what I really wanted to see was what difference (if any) the assimp changes to fast_atof had.

So, I applied the assimp changes to IrrXML and recompiled Irrlicht. A copy of the patch is here: http://spot.fedorapeople.org/irrlicht-1 ... ents.patch

One of the nice things about this patch is that there is a tunable variable that can be adjusted at compile time, to specify the number of relevant decimals for floating-point parsing. Initially assimp had AI_FAST_ATOF_RELAVANT_DECIMALS set to 6, but when I tested that, I got a lot of results like this:

Code: Select all

 String '-.00234567'
 New fast -0.0023449999280273914337158203125000000000
 Old fast -0.0023456700146198272705078125000000000000
     atof -0.0023456700146198272705078125000000000000
*** ERROR - less accurate than old method ***
So, I bumped it up to 15, and did three runs. Here are the final results from the three runs:

Code: Select all

Run 1:
         atof time = 478
    fast_atof Time = 268
old fast_atof time = 486
The fast method isn't at least three times as fast as atof()

Run 2:
        atof time = 446
    fast_atof Time = 267
old fast_atof time = 461
The fast method isn't at least three times as fast as atof()

Run 3:
         atof time = 399
    fast_atof Time = 295
old fast_atof time = 462
The fast method isn't at least three times as fast as atof()
When you average those results, you get:

Code: Select all

Averaged:

        atof time = 441
    fast_atof Time = 276
old fast_atof time = 469
The fast_atof time with the assimp patch and AI_FAST_ATOF_RELAVANT_DECIMALS 15 is approx 1.6x faster than glibc atof.

Then, I tried setting the field to 10, and did three runs again:

Code: Select all

         atof time = 451
    fast_atof Time = 273
old fast_atof time = 476
The fast method isn't at least three times as fast as atof()

         atof time = 428
    fast_atof Time = 310
old fast_atof time = 531
The fast method isn't at least three times as fast as atof()

         atof time = 406
    fast_atof Time = 309
old fast_atof time = 490
The fast method isn't at least three times as fast as atof()

Averaged:
         atof time = 428.3
    fast_atof Time = 297.3
old fast_atof time = 499
The fast_atof time with the assimp patch and AI_FAST_ATOF_RELAVANT_DECIMALS 10 is approx 1.4x faster than glibc atof. Approx 1.7x faster than stock 1.7.2 fast_atof.

*****

Phew! If you are still reading, thanks. We would like for this patch to be considered for IrrXML (and Irrlicht), because it would mean that assimp could use the system copies of IrrXML on Fedora instead of bundling its own copy.

In case you missed it before, the patch (against 1.7.2) is here:
http://spot.fedorapeople.org/irrlicht-1 ... ents.patch

It only touches include/fast_atof.h and include/irrTypes.h (to fix up and conditionalize some additional types for Windows and Linux).[/code]
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany
Contact:

Post by hybrid »

Ok, these are pretty huge changes. Some of them cannot be applied, though. This only applies to the renaming of the methods. strtol10 will keep the signedness, we can add some unsigned versions maybe named strtoul10 or strtol10u.
The 64bit types are not correctly defined. This would only work correctly under msvc or 64bit systems. We need a way to properly define the 64bit types under 32bit non-msvc systems, though.
The restriction of precision seems to be a very good thing. But I need to work out the proper fixes for the rest first.
I've added a patch ticket here https://sourceforge.net/tracker/?func=d ... tid=540678
Nalin
Posts: 194
Joined: Thu Mar 30, 2006 12:34 am
Location: Lacey, WA, USA
Contact:

Post by Nalin »

hybrid wrote:The 64bit types are not correctly defined. This would only work correctly under msvc or 64bit systems. We need a way to properly define the 64bit types under 32bit non-msvc systems, though.
The type long long int is a 64-bit data type added to C99. Some compilers support it in C++ as an extension (ie, GCC). The type is officially added to C++ as part of the C++0x standard. We could support it if we add in special defines for Visual C++, GCC with extensions, and finally for compilers with C++0x support.
Last edited by Nalin on Mon Feb 07, 2011 7:00 pm, edited 1 time in total.
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany
Contact:

Post by hybrid »

Yeah, that's my point. We need support for gcc3 and gcc4, on 32bit and 64bit systems.
spotrh
Posts: 5
Joined: Thu Feb 03, 2011 9:08 pm

Post by spotrh »

Sorry for the delay on responding (I thought the forum would send me an email upon reply, but it doesn't seem to do that). The missing typedefs is a valid concern, although, because the rest of the typedefs in that header didn't seem to be completely portable, I didn't think it was a problem initially. I've got a reasonably easy fix for that though, which works on Linux i686 and x86_64 (gcc 4.6). Other targets are outside of my capacity to test right now.

I'm not an expert here, but I think some of the performance gain comes the fact that strtol10 is used all over asymptote and it does not need to be signed. It is possible to make a strtoul10u, but this is a notably more intrusive change (all the calls to strtoul10 would need to move to the new strtoul10u, and the assimp code that uses this function would also need to change). I'm willing to rework the patch to do this, if it is what we need to do to get it merged.
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany
Contact:

Post by hybrid »

Well, yes, this would at least help to get the rest of the patch merged, i.e. all those new functions which have been added and need to be cleaned up. The performance patch of the existing functions just need to go through testing. Well, I first have to extract them from the patch, and then apply and test them of course.
The type is a completely independent thing, so we can add them once we have figured out all possible ways to get a 64bit wide integer on most common architectures and OSes.
BTW: The forum now sends out emails if you get a PM, the forum post alarm seems to still have problems.
spotrh
Posts: 5
Joined: Thu Feb 03, 2011 9:08 pm

Post by spotrh »

Okay, here's a new patch (sorry that it isn't split out):
http://spot.fedorapeople.org/irrlicht-1 ... ixes.patch

It adds:

* Portability to irrTypes.h for u16, s16, u32, s32, u64, s64 for at least MSC, Linux i386 and Linux x86_64. Should work everythere though.

* strtol10 returns signed.
* New "unsigned" functions:
- strtoul8
- strtoul10
- strtoul10_64
- strtoul16
- strtoul_cppstyle
* New helper functions:
- HexDigitToDecimal
- HexOctetToDecimal

* Faster fast_atof

* Calls to "strtol10" in code switched to call "strtoul10".
spotrh
Posts: 5
Joined: Thu Feb 03, 2011 9:08 pm

Re: Patch to improve irrXML's fast_atof

Post by spotrh »

Any update here?
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Patch to improve irrXML's fast_atof

Post by CuteAlien »

spotrh wrote:Any update here?
I remember fixing something in those functions for 64-bit recently. But didn't see (or forgot about) this patch :-(
The strtol was always a bad name in Irrlicht - the l stands for long usually so using that name and not returning a long is extremely irritating to me.

I won't add it for Irrlicht 1.8 as I just do stuff now to get that out of the door. Afterwards I can look at it maybe. Really sorry for not looking at this earlier - I wish I had when working on it last time.

<offtopic rant>
spotrh wrote: During the review process, it was discovered that assimp was using a modified copy of IrrXML, which is almost never permitted in Fedora.
Always confuses me as hell that linux distros make modifying software libs so hard. I mean isn't that the whole point of free software? Why is it easier distributing applications with modified libs on Windows than on systems from people who promote the right to modify software the rest of the time? I nearly always work with modified Irrlicht libs in projects - the ease of doing that is why I love using a library with open source and a liberal license...
</offtopic rant>
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
spotrh
Posts: 5
Joined: Thu Feb 03, 2011 9:08 pm

Re: Patch to improve irrXML's fast_atof

Post by spotrh »

It's a maintenance thing. When (and not if) you discover some silly security vulnerability in IrrXML, we want to be able to patch it once and push it out, not hunt down every forked and embedded copy of irrXML that exists in our very large package universe.

It's also a useful thing for picking up improvements. Like this fast_atof improvement, which you never would have seen if we'd just let assimp bundle their fork of irrXML.
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Patch to improve irrXML's fast_atof

Post by CuteAlien »

edit: Ok, was getting too much offtopic.
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany
Contact:

Re: Patch to improve irrXML's fast_atof

Post by hybrid »

A few things have been added from the improvements as well. But I didn't finish the tests for the extra functions, yet. So won't go into 1.8, it's simply very low priority.
Post Reply