[fixed]small bug in CXMLReaderImp.h

IPv6 · Post by **IPv6** » Wed Mar 11, 2009 11:23 am

There is a small bug in detecting the UTF-8 encoding of an XML source. Because of this bug, UTF-8 XML files containing Unicode characters are loaded improperly by IrrXmlReader.
---
in function
bool readFile(IFileReadCallBack* callback)

we have
const unsigned char UTF8[] = {0xEF, 0xBB, 0xBF};

and later we compare the BOM with the source header
if (size >= 3 && data8[0] == UTF8[0] && data8[1] == UTF8[1] && data8[2] == UTF8[2]) // Bug!

On my MSVC Express this "if" is never triggered, even if the source has this BOM in the proper place.

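the reason is integer promotion combined with the signedness of char:
- UTF8[] is "unsigned char"
- data8[] is "char" (char* data8 = new char[size]), which is signed by default on MSVC
=> both operands are promoted to int before the comparison, so (UTF8[2] = 0xBF = 191) and (data8[2] = -65)!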
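
To make the mismatch concrete, here is a minimal sketch of the two comparisons. The function names naiveMatch and castMatch are made up for illustration and do not exist in irrXML; "signed char" stands in for plain "char" on the assumption that char is signed, as it is by default on MSVC.

```cpp
// Hypothetical helpers for illustration only; not part of irrXML.
// `signed char` models plain `char` on a compiler where char is signed.

// Mirrors the original comparison: both operands are promoted to int
// before ==, so the file byte keeps its negative value.
bool naiveMatch(signed char fileByte, unsigned char bomByte)
{
    return fileByte == bomByte;
}

// Converting the file byte to unsigned char first compares the raw
// byte values instead.
bool castMatch(signed char fileByte, unsigned char bomByte)
{
    return static_cast<unsigned char>(fileByte) == bomByte;
}
```

With the byte 0xBF stored in a signed char, naiveMatch compares a negative int against 191 and reports no match, while castMatch compares 191 against 191.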

I would suggest adding an explicit type conversion on both sides to avoid any misinterpretation. Obviously there is no overhead, it is just a compile-time hint

if (size >= 3 && char(data8[0]) == char(UTF8[0]) && char(data8[1]) == char(UTF8[1]) && char(data8[2]) == char(UTF8[2]))
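
As an alternative sketch (not what was committed to Irrlicht), the check could be isolated in a small helper that uses std::memcmp, which compares bytes as unsigned char regardless of how the buffers are declared. The helper name hasUtf8Bom is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of irrXML: returns true when the buffer
// starts with the UTF-8 byte order mark EF BB BF.
inline bool hasUtf8Bom(const char* data, std::size_t size)
{
    static const unsigned char UTF8[] = {0xEF, 0xBB, 0xBF};
    // memcmp interprets each byte as unsigned char, so the signedness
    // of plain char does not affect the result.
    return size >= sizeof(UTF8) && std::memcmp(data, UTF8, sizeof(UTF8)) == 0;
}
```

Under that assumption the condition in readFile would reduce to a single call such as hasUtf8Bom(data8, size).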

that is all

hybrid · Post by **hybrid** » Wed Mar 11, 2009 1:41 pm

Ok, if this fixes the problem I also don't see why we shouldn't add it. I'll also add a comment so no one "optimizes" it away later on

IPv6 · Post by **IPv6** » Wed Mar 11, 2009 2:53 pm

Thanks!!!