Programmaticaly parse archive loaded with addFileArchive

Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

So what I want to do is pack resources into archive files (for a few reasons: better density on disk, easier distribution, fewer open file handles, etc.), and the archive-loading part seems pretty simple.
The problem is building a list of the files found within. I can't seem to get the file list from the archive, with either FILESYSTEM_NATIVE or FILESYSTEM_VIRTUAL.

Code: Select all

 
    io::IFileSystem* fs = game->device->getFileSystem();
    fs->changeWorkingDirectoryTo("Data/");
    IFileList *fileList = fs->createFileList();
    fs->setFileListSystem(FILESYSTEM_NATIVE);
    int tmpj = 0;
    for (unsigned int i = 0; i < fileList->getFileCount(); ++i)
    {
        s = fileList->getFullFileName(i);
        //.gaf is actually just renamed zip <3
        if (s.find(".tgz") >= 0 || s.find(".tar.gz") >= 0 || s.find(".gaf") >= 0)
        {
            strArray[tmpj] = s.c_str();
            tmpj++;
        }
    }
 
    for (int i = 0; i < tmpj; i++)
    {
        wprintf(L"loading file: %lS\n", strArray[i].c_str());
        fs->addFileArchive(strArray[i].c_str(), false, false, EFAT_GZIP);
    }
 
    //fs->setFileListSystem(FILESYSTEM_VIRTUAL);
    fileList = fs->createFileList();
    for (int j = 0; j < fileList->getFileCount(); j++)
    {
        //try to find files in archives.
        s = fileList->getFullFileName(j);
        wprintf(L"found file: %ls\n", s.c_str());
        //do things with files found
    }
I'm surely missing something, but I can't pinpoint it. What I want to do is load whatever archives are found and then get a string list of all the files within them, but I can't figure out how. The file list doesn't update, with or without a call to fileList = fs->createFileList(); after the archives have loaded (which tells me I'm doing something wrong).
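A side note on the filter in the loop above: a substring search also matches names such as notes.tgz.bak. Below is a minimal sketch of a suffix check instead, written in plain C++ with std::wstring rather than Irrlicht's string type. The helper names endsWith and looksLikeArchive are made up for illustration, and the comparison is case-sensitive.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if `name` ends with `suffix`.
bool endsWith(const std::wstring& name, const std::wstring& suffix)
{
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: true for the three extensions the loop above looks for.
bool looksLikeArchive(const std::wstring& name)
{
    static const std::vector<std::wstring> suffixes = { L".tgz", L".tar.gz", L".gaf" };
    for (const std::wstring& s : suffixes)
        if (endsWith(name, s))
            return true;
    return false;
}
```

With this, looksLikeArchive(L"Data/test.tgz") is accepted while looksLikeArchive(L"notes.tgz.bak") is rejected.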
"this is not the bottleneck you are looking for"
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: Programmaticaly parse archive loaded with addFileArchive

Post by hendu »

FWIW, I used physfs for my games' archive needs, since I needed it anyway for the platform save dir abstraction, and preferred its API to irr's.
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

Doesn't look wrong to me at first glance. Are you getting any errors on the console? I didn't know we support tar.gz, by the way :-) Will have to check that myself some time.
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

hendu wrote:FWIW, I used physfs for my games' archive needs, since I needed it anyway for the platform save dir abstraction, and preferred its API to irr's.
Irrlicht's capabilities should be sufficient; I just need basic read-only access to fetch files from an archive.
CuteAlien wrote:Doesn't look wrong to me at first glance. Are you getting any errors on the console? I didn't know we support tar.gz, by the way :-) Will have to check that myself some time.
I didn't either, but Google and the documentation tell me we do: there is EFAT_GZIP, and Irrlicht throws "unsupported type" on unsupported archives (and then crashes, which is not very graceful in my opinion; printing a warning and ignoring the load request would be much less intrusive), yet it happily parsed my tar.gz.

No errors, but also not the printf output I expect; i.e. I mount the archives but I can't loop over the rebuilt (or old) file list and get these files, regardless of what I set the working directory to (I tried looking relative to Data/ and relative to the program working directory, to no avail).

Take this hypothetical archive; it's not a /real/ archive (why would I bother wrapping three files in an archive?)

Code: Select all

 
Example.tar.gz
+ META.INFO
+ LEVEL1.OBJ
+ LEVEL1.MTL
 
If I then printed the file paths I'd expect to see these files in the output.
Expected locations:
Data/Example/META.INFO
Example/META.INFO
META.INFO


But I can't find these files or paths in
$exeDir
$exeDir/Data/

Ergo I must be missing something.

The output with native FS is what you'd expect: it finds the data dir, it finds the exe file, it finds a few other misc files. If we operate in the data dir it finds

Code: Select all

 
/. 
/..
/test.tgz
If we're operating in virtual mode it finds (regardless of where I set the working dir to)

Code: Select all

 
/.
/..
 
"this is not the bottleneck you are looking for"
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

Guess I have to try. Our website hasn't been hacked for nearly a week now, so maybe I can finally get back to some coding on the weekend. Should be simple to check, I hope.
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

mm fair enough, I'll run some more testing myself - maybe GZIP was only stubbed so as to not crash but otherwise unimplemented?
"this is not the bottleneck you are looking for"
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

Why do I continue to say stupid stuff like "this should be simple to check"? I really should know better :-)

There's more than one problem here. First, Irrlicht had a bug: since Irrlicht 1.6, createFileList can't have worked for virtual archives at all. I suppose no one used it or everyone gave up. Basically it tries to find all files in a folder, but it looked for folders in "/" instead of in "", for reasons I obviously don't know as I wasn't involved yet back then.

I changed that now in svn trunk (r5359). Not many tests yet (as this wasn't tested...), so I hope I didn't break something else.
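To make the "/" versus "" mix-up concrete, here is a toy model in plain C++. It is not Irrlicht's implementation. It only assumes, as described above, that archive entries are stored with relative paths and that listing a folder means keeping the entries whose parent path equals the requested one. The names parentOf and listFolder are invented for this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: everything before the last '/', or "" for a top-level entry.
std::string parentOf(const std::string& path)
{
    const std::string::size_type pos = path.find_last_of('/');
    return pos == std::string::npos ? std::string() : path.substr(0, pos);
}

// Hypothetical helper: the entries whose parent folder is exactly `folder`.
std::vector<std::string> listFolder(const std::vector<std::string>& entries,
                                    const std::string& folder)
{
    std::vector<std::string> result;
    for (const std::string& e : entries)
        if (parentOf(e) == folder)
            result.push_back(e);
    return result;
}
```

Under these assumptions, an archive holding META.INFO, LEVEL1.OBJ and LEVEL1.MTL yields nothing when asked for "/" and all three entries when asked for "", which would match the empty listing seen in virtual mode.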

With that fixed, the next problem is tar.gz archives. They should be supported, but so far I can't get them to work. Internally it checks certain bytes in the file header to figure out whether it's a tar.gz it supports, and that function returns false. I don't know the reason. If you use plain .tar files instead, those work now (they didn't before fixing the thing above). But even once it works, it will probably need 2 archives: first one to open the .gz part, and then another one to open the .tar part (just guessing here).

Last, there are 2 potential problems in your code as well. First, using wprintf might be one, at least on Unix: Irrlicht logging uses printf, and Unix programs can't seem to handle a mix of those 2 functions, so once a printf call has been made, wprintf simply does nothing.
The other problem is your strArray. I don't see the type you use, but you pass a c_str() from a variable which changes on each loop iteration. So if you have more than one result there ... you can get all kinds of strange results, as the pointers are constantly overwritten and now point... most likely to the last string, but in some situations even to completely wrong places in memory. That is, unless strArray contains a string class (and not just char pointers), but then you probably don't need the c_str() call.
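The pointer concern can be sketched in plain C++. The snippet assumes nothing about the original declarations: collectNames is a made-up helper, and a std::wstring stands in for the Irrlicht string that is reassigned on every iteration. Storing std::wstring values copies the characters, whereas storing the raw c_str() pointers would leave them dangling once the reused variable changes.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies each name into an owning std::wstring.
std::vector<std::wstring> collectNames(const std::vector<std::wstring>& source)
{
    std::vector<std::wstring> names;
    std::wstring s; // reused on every iteration, like `s` in the loop
    for (const std::wstring& entry : source)
    {
        s = entry;
        // Converting c_str() to a std::wstring copies the characters,
        // so later changes to `s` do not affect what was stored.
        names.push_back(s.c_str());
        // By contrast, a std::vector<const wchar_t*> filled with s.c_str()
        // would hold pointers that are invalidated when `s` is reassigned.
    }
    return names;
}
```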
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

CuteAlien wrote:Why do I continue to say stupid stuff like "this should be simple to check"? I really should know better :-)

There's more than one problem here. First, Irrlicht had a bug: since Irrlicht 1.6, createFileList can't have worked for virtual archives at all. I suppose no one used it or everyone gave up. Basically it tries to find all files in a folder, but it looked for folders in "/" instead of in "", for reasons I obviously don't know as I wasn't involved yet back then.

I changed that now in svn trunk (r5359). Not many tests yet (as this wasn't tested...), so I hope I didn't break something else.

With that fixed, the next problem is tar.gz archives. They should be supported, but so far I can't get them to work. Internally it checks certain bytes in the file header to figure out whether it's a tar.gz it supports, and that function returns false. I don't know the reason. If you use plain .tar files instead, those work now (they didn't before fixing the thing above). But even once it works, it will probably need 2 archives: first one to open the .gz part, and then another one to open the .tar part (just guessing here).
so.. EFAT_GZIP has a misleading name and should actually be EFAT_TAR?
CuteAlien wrote:Last, there are 2 potential problems in your code as well. First, using wprintf might be one, at least on Unix: Irrlicht logging uses printf, and Unix programs can't seem to handle a mix of those 2 functions, so once a printf call has been made, wprintf simply does nothing.
The other problem is your strArray. I don't see the type you use, but you pass a c_str() from a variable which changes on each loop iteration. So if you have more than one result there ... you can get all kinds of strange results, as the pointers are constantly overwritten and now point... most likely to the last string, but in some situations even to completely wrong places in memory. That is, unless strArray contains a string class (and not just char pointers), but then you probably don't need the c_str() call.
printf doesn't take a wide format string (i.e. printf(L"") is a compile error); that's the whole point of wprintf. If your implementation lets you, that's a bug (strictly speaking, passing a wide string where printf expects a narrow one is undefined behavior, and a type mismatch).
As for printf/wprintf being mixed, that sounds more like a bug in the terminal emulator itself than mixing them being bad per se (besides, it's only used for my testing and works in the environments tested, so that's really all that matters here). The tl;dr is: I don't rely on printf for anything other than debugging, and I won't fix potential platform bugs that don't affect me since I don't debug on those platforms (and if I ever do, I can always write correct debug messages for them).

as for strArray, you're probably referring to this (since it's the only part that ever /writes/ to the array)

Code: Select all

strArray[tmpj] = s.c_str();
s is an Irrlicht stringw, and the .c_str() is only there to silence the incessant warnings from the compiler.

strArray is just a temporary consisting of std::wstring strings (i.e. C++ strings with wide char support); the reason I use .c_str() varies, ranging from silencing warnings to actually getting the correct data.
Presumably you may also be referring to

Code: Select all

fs->addFileArchive(strArray[i].c_str(), false, false, EFAT_GZIP);

but in this case the concern is questionable since we're only reading the array, not writing to it
"this is not the bottleneck you are looking for"
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

Cube_ wrote: so.. EFAT_GZIP has a misleading name and should actually be EFAT_TAR?
No, they both exist. It's just that the TAR one works (now in trunk), while I haven't figured out the gzip one yet.

The printf stuff... I just read up a little more. The problem seems to be that implementations tend to mix up string buffers (C library implementation, not a terminal thing). So it might work if a flush is enforced in between. Though... I'm not sure if that's true or not. Just remember: mixing char and wide-char output on the same console can be trouble. So if your strings don't show up, that is often the cause.
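The buffer mix-up has a name in the C standard: stream orientation. A stream starts without orientation, and the first output call fixes it as byte-oriented (printf family) or wide-oriented (wprintf family); applying the other family to it afterwards is not supported. Here is a small sketch that uses std::fwide with mode 0 to query the orientation. The helper names are made up, and a temporary file stands in for the console.

```cpp
#include <cstdio>
#include <cwchar>

// Returns a negative value if the stream became byte-oriented, a positive
// value if wide-oriented, and 0 if the temporary file could not be opened.
int orientationAfterNarrowWrite()
{
    std::FILE* f = std::tmpfile();
    if (!f)
        return 0;
    std::fprintf(f, "narrow text\n");         // first output call: byte orientation
    const int orientation = std::fwide(f, 0); // mode 0 only queries, it changes nothing
    std::fclose(f);
    return orientation;
}

// Same check, but the first output call is a wide one.
int orientationAfterWideWrite()
{
    std::FILE* f = std::tmpfile();
    if (!f)
        return 0;
    std::fwprintf(f, L"wide text\n");         // first output call: wide orientation
    const int orientation = std::fwide(f, 0);
    std::fclose(f);
    return orientation;
}
```

One way to sidestep the issue is to keep all console output in a single family, for example by printing wide strings through printf with the %ls conversion.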
Cube_ wrote: as for strArray, you're probably referring to this (since it's the only part that ever /writes/ to the array)

Code: Select all

strArray[tmpj] = s.c_str();
Yeah, I meant that one. Was just guessing as you didn't post declarations and it looked suspicious. If it's std::wstring in the array then it's fine.
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

CuteAlien wrote:
Cube_ wrote: so.. EFAT_GZIP has a misleading name and should actually be EFAT_TAR?
No, they both exist. It's just that the TAR one works (now in trunk), while I haven't figured out the gzip one yet.
ah gotcha.
CuteAlien wrote:The printf stuff... I just read up a little more. The problem seems to be that implementations tend to mix up string buffers (C library implementation, not a terminal thing). So it might work if a flush is enforced in between. Though... I'm not sure if that's true or not. Just remember: mixing char and wide-char output on the same console can be trouble. So if your strings don't show up, that is often the cause.
So a glibc bug then; luckily that isn't affecting my testing environment, but I'll keep it in the back of my mind in case such a bug ever hits me.

Well, I suppose I'll just use zip or something for now then, since gzip isn't working; tar doesn't have compression, so that's not very useful in itself (and there's no lz4 support, but I might just hack that in, or use a third-party library for this instead).
"this is not the bottleneck you are looking for"
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

Yeah, but you also need trunk for zip. That bug should have broken createFileList for every archive type.
I plan to take another look at this (also at gzip), so maybe I can find out some more. This also needs more tests (the current tests check whether opening files in archives works, but createFileList was missed).
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

gotcha, I'll probably just pull the specific patch (it /should/ apply cleanly to 1.8.3 and if it doesn't I'll just apply it manually)
"this is not the bottleneck you are looking for"
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

Yeah, it will apply to 1.8 - it is just a single line. I'm just worried about backporting because I still wonder if I missed something... (I mean - did really no one try to list files in archives for more than half a decade?)
CuteAlien
Admin
Posts: 9734
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany
Contact:

Re: Programmaticaly parse archive loaded with addFileArchive

Post by CuteAlien »

OK, it does actually work and you can get the files even without the fix.

Example code (using different folder-names than your test, so you have to adapt):

Code: Select all

 
#include <irrlicht.h>
#include <iostream>

using namespace irr;

int main()
{
    video::E_DRIVER_TYPE driverType = video::EDT_OPENGL;
    IrrlichtDevice * device = createDevice(driverType, core::dimension2d<u32>(640, 480));
    if (device == 0)
        return 1; // could not create selected driver.

    io::IFileSystem* fs = device->getFileSystem();
    fs->changeWorkingDirectoryTo("my_media");
    fs->setFileListSystem(io::FILESYSTEM_NATIVE);
    io::IFileList *fileList = fs->createFileList(); // files on disk in my_media
    for (u32 i = 0; i < fileList->getFileCount(); ++i)
    {
        io::path s = fileList->getFullFileName(i);
        std::cout << s.c_str() << "\n";
        // find() returns -1 when the substring is missing
        if (s.find(".tgz") >= 0 || s.find(".tar.gz") >= 0)
        {
            std::cout << "loading file: " << s.c_str() << "\n";
            // Step 1: open the gzip layer, which exposes the .tar inside it.
            if ( fs->addFileArchive(s.c_str(), false, false, io::EFAT_GZIP) )
            {
                io::IFileArchive* gzArchive = fs->getFileArchive(fs->getFileArchiveCount()-1);
                const io::IFileList* flGz = gzArchive->getFileList();
                if ( flGz->getFileCount() > 0 && flGz->getFullFileName(0).find(".tar") >= 0 ) // should have exactly one tar file
                {
                    // Step 2: open that .tar as a second archive.
                    if ( fs->addFileArchive(flGz->getFullFileName(0).c_str(), false, false, io::EFAT_TAR) )
                    {
                        io::IFileArchive* tarArchive = fs->getFileArchive(fs->getFileArchiveCount()-1);
                        const io::IFileList* flTar = tarArchive->getFileList();
                        for ( u32 j=0; j<flTar->getFileCount(); ++j )
                        {
                            io::path s2 = flTar->getFullFileName(j);
                            std::cout << s2.c_str() << "\n";
                        }
                    }
                }
            }
        }
    }
    fileList->drop(); // lists returned by createFileList() must be dropped

// You don't really need that part anymore now - just showing it can also work (with fix in trunk)
    fs->setFileListSystem(io::FILESYSTEM_VIRTUAL);
    fileList = fs->createFileList();
    for (u32 j = 0; j < fileList->getFileCount(); j++)
    {
        io::path s = fileList->getFullFileName(j);
        std::cout << s.c_str() << "\n";
    }
    fileList->drop();

    device->drop();
    return 0;
}
 
As you can see in the code, I had to do the trick of opening the archive twice: first the .gz archive and then the tar archive inside it. Not sure why it didn't work when I tried yesterday; I guess I just messed something up.
And while FileSystem::createFileList needs the patch to work, you can use IFileArchive::getFileList() instead. That's even better if you have several archives, as FileSystem::createFileList can't tell which archive a file comes from if you have more than one.
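The last point, that a merged list loses track of which archive an entry came from, can be illustrated with a plain C++ model. This is not Irrlicht code: ArchiveListing, mergedList and ownerOf are hypothetical names, and the vectors of strings merely stand in for what IFileArchive::getFileList() gives you per archive.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one mounted archive and its entries.
struct ArchiveListing
{
    std::string archiveName;
    std::vector<std::string> files;
};

// Flattens all archives into one list; the origin of each entry is lost.
std::vector<std::string> mergedList(const std::vector<ArchiveListing>& archives)
{
    std::vector<std::string> all;
    for (const ArchiveListing& a : archives)
        all.insert(all.end(), a.files.begin(), a.files.end());
    return all;
}

// Searches archive by archive, so the owning archive stays known.
// Returns the name of the first archive containing `file`, or "" if none does.
std::string ownerOf(const std::vector<ArchiveListing>& archives, const std::string& file)
{
    for (const ArchiveListing& a : archives)
        for (const std::string& f : a.files)
            if (f == file)
                return a.archiveName;
    return std::string();
}
```

Keeping one list per archive costs a little bookkeeping, but it answers "where did this file come from?" without guessing.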
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: Programmaticaly parse archive loaded with addFileArchive

Post by Cube_ »

oh? I'll try your snippet, thanks.
"this is not the bottleneck you are looking for"