LZ77 Very Slow
-
- Posts: 288
- Joined: Wed Oct 29, 2008 12:07 pm
LZ77 Very Slow
I made an implementation of the LZ77 algorithm to learn the ins and outs of file compression. I compared the compression ratios to that of ZIP and for files with heavy repetition they were not as good but not terrible. However the code is soooo slow. It works quick for small files (of course) but large files >=5 MB can take minutes!!! I am using a 4096 byte moving window varient for the sliding window. I know the searching is costly for possible searching 4096 times per byte in the file, my question is how do ZIP and other compression programs execute so damn fast!? How do you profile your code for speed? I am trying to find the bottle neck in the code but am having trouble .
Re: LZ77 Very Slow
Use a profiler. On Linux there is gprof and on Windows a free profiler is "very sleepy" (really, that's it's name ^_^). Or add timer-code into your functions (for example like this: http://irrlicht.sourceforge.net/forum/v ... highlight=).
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
-
- Posts: 288
- Joined: Wed Oct 29, 2008 12:07 pm
Re: LZ77 Very Slow
Thanks, that program is very good. I replaced the brute force search method I was using with a 256 x 256 index table. I am now compressing 50 MB @ 26.66 seconds and decompressing @ 10.00 seconds. Right now I am getting > 50% compression on most text files. I am going to implement a huffman code algorithm before the LZ77 (I think 7zip does this?) to hopefully achieve a good final compression.
I know Irrlicht uses zlib for compression right? For games and file compression, I know the decompression speed is most desired so you can decompress on the fly and put the data right into the game, right? I am hoping to get the final version to about 10 seconds compression and 5 seconds decompression by the time I am finished. I just don't know if the compression % will be as good, we will see.
I know Irrlicht uses zlib for compression right? For games and file compression, I know the decompression speed is most desired so you can decompress on the fly and put the data right into the game, right? I am hoping to get the final version to about 10 seconds compression and 5 seconds decompression by the time I am finished. I just don't know if the compression % will be as good, we will see.
Re: LZ77 Very Slow
Huffman is not good. Its only distinction is that, given a particular example, it will compress it better than any other method, but it does not then compress other data well. That is, you can set it up to compress just one piece of data.
-
- Posts: 288
- Joined: Wed Oct 29, 2008 12:07 pm
Re: LZ77 Very Slow
Huffman will only not work if the file contains all 256 different byte combinations that are of equal probability. So prolly not good with image files but text files/log files can achieved great compression. Also using huffman algorithm will not eliminate the redundancy in the bit stream, as I see it replacing an 8 bit combination with a 3 bit combination will still result in the same encoded pattern giving further compression and faster LZ77 execution.
But the best compression is with arithmetic coding models, however the speed is terrible. This algorithm will be for very quick execution to possibly encode information through transmissions.
I have a problem with Huffman algorithm when it comes to transmitting the table used during compression. So maybe I will choose 2-3 predetermined look up tables to do the encoding, prolly more inefficient but will keep the speed cost low.
But the best compression is with arithmetic coding models, however the speed is terrible. This algorithm will be for very quick execution to possibly encode information through transmissions.
I have a problem with Huffman algorithm when it comes to transmitting the table used during compression. So maybe I will choose 2-3 predetermined look up tables to do the encoding, prolly more inefficient but will keep the speed cost low.
Re: LZ77 Very Slow
Well, I was going to walk on by, and maybe I should, but:Alpha Omega wrote:Huffman will only not work if the file contains all 256 different byte combinations that are of equal probability. So prolly not good with image files but text files/log files can achieved great compression. Also using huffman algorithm will not eliminate the redundancy in the bit stream, as I see it replacing an 8 bit combination with a 3 bit combination will still result in the same encoded pattern giving further compression and faster LZ77 execution.
2^3 is 8. So you are compressing files with only 8 distinct characters in them?
-
- Posts: 288
- Joined: Wed Oct 29, 2008 12:07 pm
Re: LZ77 Very Slow
No I was just using it as an example. Obviously you have use to use more bits for less probable characters. Anyway your right, huffman code is not very efficient compared to arithmetic coding, and I think I can make an implementation that is pretty quick, combining a LZW style dictionary into the arithmetic coding algorithm.
-
- Admin
- Posts: 14143
- Joined: Wed Apr 19, 2006 9:20 pm
- Location: Oldenburg(Oldb), Germany
- Contact:
Re: LZ77 Very Slow
Maybe try bzip or lzma
Re: LZ77 Very Slow
The fastest should be gzip. lzma has very good compression but is slower (not to much). Every compression algorithm has pros and contros (for games i think that zlib and gzip are very good.. other formats are good for server/clients wich needs to minimize bandwith when downloading). I never tried to use directly libraries like zlib or libpng. I spent some time on data crypt/decrypt streams. but not on data compression.
Junior Irrlicht Developer.
Real value in social networks is not about "increasing" number of followers, but about getting in touch with Amazing people.
- by Me
Real value in social networks is not about "increasing" number of followers, but about getting in touch with Amazing people.
- by Me
Re: LZ77 Very Slow
idsoft used zlib (I think) for doom3 and the compression is pretty good.
http://www.iddevnet.com/doom3/packfiles.phpThe assets for Doom 3 are 3.6GB uncompressed, and only 1.44GB after being packed up.
My company: http://www.kloena.com
My blog: http://www.zhieng.com
My co-working space: http://www.deskspace.info
My blog: http://www.zhieng.com
My co-working space: http://www.deskspace.info
Re: LZ77 Very Slow
Jake03 a bot or not - that is the question. Typical bot pattern (the kind that copy-pastes texts which sound similar from elsewhere and adds spam a few days later), but could also be uninformed coder who asks for code-help without showing code...
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
-
- Admin
- Posts: 14143
- Joined: Wed Apr 19, 2006 9:20 pm
- Location: Oldenburg(Oldb), Germany
- Contact:
Re: LZ77 Very Slow
Yep, actually the mentioned code is missing. But assuming this was just a copy-paste error, this post might make sense. But well, looking at the two other posts, it's either bot or troll, so maybe ready for deletion anyway...
Re: LZ77 Very Slow
Ok, Jake03 turned out to be a bot (he added spam now). Grr.. really, really hate that combination of delayed spam + bots copying text that sounds similar to thread-text from somewhere.
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Re: LZ77 Very Slow
What about adding extra antibot questions for users wich have less than X messages? So before posting any message on the forum if you have less than 30 messages a question will be asked.
Junior Irrlicht Developer.
Real value in social networks is not about "increasing" number of followers, but about getting in touch with Amazing people.
- by Me
Real value in social networks is not about "increasing" number of followers, but about getting in touch with Amazing people.
- by Me
Re: LZ77 Very Slow
We're not even sure if that's bots anymore - or just people in some poor country which are payed for spamming. But the idea is sound - I'll ask Yoran if something like that could be done.
IRC: #irrlicht on irc.libera.chat
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm
Code snippet repository: https://github.com/mzeilfelder/irr-playground-micha
Free racer made with Irrlicht: http://www.irrgheist.com/hcraftsource.htm