CZipReader bug fix, plus added bzip2 decompression

Anton1 · Post by **Anton1** » Tue Nov 25, 2008 12:35 pm

Hey y'all...

The problem...

While I was busy writing a class to write zip archives for Irrlicht, I noticed that although 7-Zip could open the zip file, Irrlicht gave back an IReadFile* object that had an infinitely large buffer size...

hmm... I changed my zip writer a bit to seek back to the local header and write the data descriptor there, and bam... Irrlicht could open that...

however, if the IWriteFile* you are writing to doesn't support seeking, that would be a problem... plus, why wouldn't Irrlicht load the extended local header to provide compatibility with all zip files...

Summary:
Irrlicht doesn't support extended data descriptors properly, yet...
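
To make the problem a bit more concrete, here is a minimal standalone sketch (plain C++, not Irrlicht code) that inspects a local file header held in memory and reports whether general purpose flag bit 3 is set. The names `LocalHeaderInfo` and `parseLocalHeader` are made up for illustration; the field offsets and the 0x04034b50 signature follow the published ZIP format. When bit 3 is set, the CRC and size fields in the local header are zeroed and the real values only appear in the data descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for illustration; not part of Irrlicht.
struct LocalHeaderInfo
{
    bool valid;                      // header present and signature matched
    bool sizesInDescriptor;          // general purpose flag bit 3
    std::uint16_t method;            // 0 = stored, 8 = deflate, 12 = bzip2
    std::uint32_t compressedSize;    // zero in the header when bit 3 is set
    std::uint32_t uncompressedSize;  // zero in the header when bit 3 is set
    std::size_t dataOffset;          // where the compressed data starts
};

// ZIP stores all multi-byte fields little-endian.
static std::uint16_t readLE16(const std::vector<std::uint8_t>& b, std::size_t at)
{
    assert(at + 2 <= b.size());
    return static_cast<std::uint16_t>(b[at] | (b[at + 1] << 8));
}

static std::uint32_t readLE32(const std::vector<std::uint8_t>& b, std::size_t at)
{
    assert(at + 4 <= b.size());
    return static_cast<std::uint32_t>(b[at])
         | (static_cast<std::uint32_t>(b[at + 1]) << 8)
         | (static_cast<std::uint32_t>(b[at + 2]) << 16)
         | (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

LocalHeaderInfo parseLocalHeader(const std::vector<std::uint8_t>& buf, std::size_t at)
{
    LocalHeaderInfo info = {};
    const std::size_t kFixedSize = 30;  // fixed part of the local file header
    if (at > buf.size() || buf.size() - at < kFixedSize)
        return info;
    if (readLE32(buf, at) != 0x04034b50u)  // "PK\x03\x04"
        return info;

    const std::uint16_t flags = readLE16(buf, at + 6);
    info.sizesInDescriptor = (flags & 0x0008) != 0;
    info.method = readLE16(buf, at + 8);
    info.compressedSize = readLE32(buf, at + 18);
    info.uncompressedSize = readLE32(buf, at + 22);

    const std::size_t nameLen = readLE16(buf, at + 26);
    const std::size_t extraLen = readLE16(buf, at + 28);
    info.dataOffset = at + kFixedSize + nameLen + extraLen;
    info.valid = info.dataOffset <= buf.size();
    return info;
}
```

If `sizesInDescriptor` comes back true, a reader cannot trust `compressedSize` from the header and has to find the descriptor some other way.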

Solution...

I looked through CZipReader and easily enough I could see that Irrlicht expects the data descriptor to come straight after the local header...

but in fact it comes after the compressed data...

so all you have to do is read ahead till you get to the extended header's signature, then seek back (all IReadFile* objects should supply seek functionality), then read the data descriptor

easy enough...
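
As a rough illustration of the read-ahead idea, here is a standalone sketch over an in-memory buffer rather than an IReadFile*. `DataDescriptor` and `findDataDescriptor` are hypothetical names, and the extra check that the recorded compressed size equals the number of bytes skipped is my own assumption for weeding out signature bytes that happen to occur inside compressed data. The 0x08074b50 signature is the one the ZIP format defines for data descriptors, though writers are allowed to omit it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; not part of Irrlicht.
struct DataDescriptor
{
    std::uint32_t crc32;
    std::uint32_t compressedSize;
    std::uint32_t uncompressedSize;
};

// ZIP stores all multi-byte fields little-endian.
static std::uint32_t readLE32(const std::vector<std::uint8_t>& b, std::size_t at)
{
    assert(at + 4 <= b.size());
    return static_cast<std::uint32_t>(b[at])
         | (static_cast<std::uint32_t>(b[at + 1]) << 8)
         | (static_cast<std::uint32_t>(b[at + 2]) << 16)
         | (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

// Scans forward from dataStart for "PK\x07\x08" followed by three 32-bit fields.
// Returns true and fills `out` only if the compressed size stored in the
// descriptor matches the distance scanned.
bool findDataDescriptor(const std::vector<std::uint8_t>& buf,
                        std::size_t dataStart,
                        DataDescriptor& out)
{
    const std::size_t kDescriptorSize = 16;  // signature + crc + two sizes
    if (dataStart > buf.size())
        return false;

    for (std::size_t pos = dataStart; buf.size() - pos >= kDescriptorSize; ++pos)
    {
        if (readLE32(buf, pos) != 0x08074b50u)
            continue;

        DataDescriptor d;
        d.crc32 = readLE32(buf, pos + 4);
        d.compressedSize = readLE32(buf, pos + 8);
        d.uncompressedSize = readLE32(buf, pos + 12);

        // Sanity check (assumption): reject hits that sit inside the data.
        if (d.compressedSize != pos - dataStart)
            continue;

        out = d;
        return true;
    }
    return false;
}
```

The size check is cheap and catches most false hits, but it is still a heuristic; the central directory at the end of the archive is the authoritative source for sizes.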

here's the patch... in CZipReader::scanLocalHeader(), just after the extra information is skipped, add:


    // this is the data block
    entry.fileDataPosition = File->getPos();

    // if bit 3 was set, the DataDescriptor follows after the compressed data
    if (entry.header.GeneralBitFlag & ZIP_INFO_IN_DATA_DESCRITOR)
    {
        // Scan forward for the full data descriptor signature "PK\x07\x08",
        // keeping the last four bytes read in a sliding window.
        // Note: the signature is optional in the ZIP format and the same bytes
        // can turn up inside compressed data, so this remains a heuristic.
        u32 window = 0;
        bool found_ext = false;
        while (!found_ext && File->getPos() < File->getSize())
        {
            u8 byte = 0;
            if (File->read(&byte, 1) != 1)
                break;
            window = (window << 8) | byte;
            found_ext = (window == 0x504B0708);
        }

        if (found_ext)
        {
            // read data descriptor
            File->read(&entry.header.DataDescriptor, sizeof(entry.header.DataDescriptor));
#ifdef __BIG_ENDIAN__
            entry.header.DataDescriptor.CRC32 = os::Byteswap::byteswap(entry.header.DataDescriptor.CRC32);
            entry.header.DataDescriptor.CompressedSize = os::Byteswap::byteswap(entry.header.DataDescriptor.CompressedSize);
            entry.header.DataDescriptor.UncompressedSize = os::Byteswap::byteswap(entry.header.DataDescriptor.UncompressedSize);
#endif
        }
        else
        {
            os::Printer::log("Zip file data descriptor not found", entry.simpleFileName.c_str(), ELL_ERROR);
            return 0;
        }
    }
    else
    {
        // we have the info, so move forward length of data
        File->seek(entry.header.DataDescriptor.CompressedSize, true);
    }

now Irrlicht should support all 7-Zip .zip files on any supported platform...

hopefully niko will edit the code in future versions of Irrlicht so I won't have to keep patching it

Anton1 · Post by **Anton1** » Wed Nov 26, 2008 9:32 pm

hmmm... I fear my last post was a bit presumptuous... it does not support bzip2 compression...

have no fear... check out http://www.bzip.org and download the package... pay attention to the licensing and compile it into Irrlicht...

in CZipReader::openFile(s32), straight after the case for compression method 8, add


case 12:    //  bzip2
            {
                const u32 uncompressedSize = entry.header.DataDescriptor.UncompressedSize;
                const u32 compressedSize = entry.header.DataDescriptor.CompressedSize;

                c8* pBuf = new c8[ uncompressedSize ];
                if (!pBuf)
                {
                    os::Printer::log("Not enough memory for decompressing", entry.zipFileName.c_str(), ELL_ERROR);
                    return 0;
                }

                c8 *pcData = new c8[ compressedSize ];
                if (!pcData)
                {
                    os::Printer::log("Not enough memory for decompressing", entry.zipFileName.c_str(), ELL_ERROR);
                    delete [] pBuf; // don't leak the output buffer
                    return 0;
                }

                Complete->read(pcData, compressedSize );

                // Set up the bzip2 decompression stream.
                bz_stream stream;
                s32 err;

                stream.next_in = pcData;
                stream.avail_in = compressedSize;
                stream.next_out = pBuf;
                stream.avail_out = uncompressedSize;
                // null allocator hooks make libbzip2 fall back to malloc/free
                stream.bzalloc = 0;
                stream.bzfree = 0;
                stream.opaque = 0;

                // Decompress in one call: verbosity 0, small 0 (the faster algorithm).
                err = BZ2_bzDecompressInit(&stream, 0, 0);
                if (err == BZ_OK)
                {
                    err = BZ2_bzDecompress(&stream);
                    BZ2_bzDecompressEnd(&stream);
                    // All input and output space was supplied up front, so anything
                    // other than BZ_STREAM_END means the entry did not decode fully.
                    err = (err == BZ_STREAM_END) ? BZ_OK : BZ_DATA_ERROR;
                }



                delete[] pcData;

                if (err != BZ_OK)
                {
                    os::Printer::log("Error decompressing", entry.zipFileName.c_str(), ELL_ERROR);
                    delete [] (c8*)pBuf;
                    return 0;
                }
                else
                    return Device->getFileSystem()->createMemoryReadFile(pBuf, uncompressedSize, entry.zipFileName.c_str(), true);
            }
            break;

that will add bzip2 decompression. bzip2 is arguably better than gzip, however I feel it's up to preference and content...
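
For anyone wiring this in, a small standalone sketch of the dispatch side may help. `zipMethodName` and `looksLikeBzip2` are hypothetical helpers, not Irrlicht or libbzip2 functions. The method IDs (0, 8, 12) come from the ZIP format, and the check relies on a bzip2 stream starting with the bytes 'B', 'Z', 'h' followed by a block-size digit from '1' to '9'. Running a check like this before handing the buffer to the decompressor is just an optional sanity step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: names the ZIP compression methods discussed in this thread.
std::string zipMethodName(std::uint16_t method)
{
    switch (method)
    {
    case 0:  return "stored";
    case 8:  return "deflate";
    case 12: return "bzip2";
    default: return "unsupported";
    }
}

// Hypothetical helper: true if the buffer begins with a plausible bzip2
// stream header ("BZh" plus a block-size digit '1'..'9').
bool looksLikeBzip2(const std::uint8_t* data, std::size_t size)
{
    assert(data != 0 || size == 0);
    return size >= 4
        && data[0] == 'B'
        && data[1] == 'Z'
        && data[2] == 'h'
        && data[3] >= '1'
        && data[3] <= '9';
}
```

A reader could log `zipMethodName(...)` for entries it cannot open, which makes "unsupported compression" reports a lot easier to act on.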

loki1985 · Post by **loki1985** » Thu Nov 27, 2008 10:11 am

well, the main problem with bzip2 is that you can only compress one file into one compressed file, so containers (which is what the ZIP format is) are not possible with bzip2.

you _could_ however use tar.bz2, but the problem here is that it employs solid compression, which means if you want to load only the last file in an archive with 1000 files, the program will have to decompress the 999 files before it before being able to load said last file.
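
A back-of-the-envelope model of that difference, as a standalone sketch: the function names are made up, and it deliberately ignores tar headers, padding and bzip2 block boundaries. It only counts how many uncompressed bytes have to be produced before a given entry is fully available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-entry compression (as in ZIP): only the requested entry is decoded.
std::uint64_t bytesToDecodePerEntry(const std::vector<std::uint64_t>& sizes,
                                    std::size_t index)
{
    assert(index < sizes.size());
    return sizes[index];
}

// Solid compression (as in tar.bz2): everything up to and including the
// requested entry has to be decoded first.
std::uint64_t bytesToDecodeSolid(const std::vector<std::uint64_t>& sizes,
                                 std::size_t index)
{
    assert(index < sizes.size());
    std::uint64_t total = 0;
    for (std::size_t i = 0; i <= index; ++i)
        total += sizes[i];
    return total;
}
```

With 1000 entries of equal size, the solid case does about a thousand times the work to reach the last entry, which is the cost being described here.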

//EDIT: sorry, just saw that you added the bzip2 support for inside the ZIP container, so that makes much more sense.

MasterGod · Post by **MasterGod** » Fri Nov 28, 2008 2:29 pm

@Anton1: It looks pretty nice, but unless you make a test case that tests your code for every flaw it could have, I don't believe the Irrlicht devs will even look at it.
You should write a test case that shows, first, that your code really works, and then work on hunting down all the bugs it could have.
For user code to get into Irrlicht, it must be polished as much as possible by its author before the Irrlicht developers will check it out seriously. Additionally, the following weeks are the last weeks before v1.5, so the developers will be even busier and won't have time to look at your code unless you follow my recommendations (I guess). So what they need from you is 1 - your well-checked code and 2 - a test case that shows that it doesn't have bugs and that it actually works.

I hope I helped, good luck!

P.S
Also, it won't get into v1.5, because they said they won't add any new features, only bug fixes, until the release.

rogerborg · Post by **rogerborg** » Fri Nov 28, 2008 2:47 pm

Sure, thanks for doing this, but we're into the bugfixing phase of 1.5, so we're averse to adding any new functionality.

That said, there's now a test suite so if you feel like adding a test case for this, and wrapping it all up in a nice patch, you never know...