Hashes ...

TomTim
Posts: 72
Joined: Mon Aug 16, 2010 5:32 pm
Location: NRW, Germany

Post by TomTim »

Is it possible to create hash values with Irrlicht or the WinAPI? I'd like to use hashes instead of strings ...
loki1985
Posts: 214
Joined: Thu Mar 31, 2005 2:36 pm

Post by loki1985 »

i would recommend http://www.cryptopp.com/
it is easy to use, has a nice set of algorithms, and a good free license which even allows taking parts of it as public domain.

alternatively, the WinAPI apparently has a crypto API. but i would not recommend it, it seems a bit complicated and obviously is not portable. more details here (a bit down the thread):
http://www.c-plusplus.de/forum/viewtopi ... 70790.html

if you just need exactly 1 algorithm (like MD5 or SHA1), you can find a lot of portable C implementations working on char buffers on the net. example:

the original MD5 implementation: http://people.csail.mit.edu/rivest/Md5.c

one SHA1 implementation: http://www.packetizer.com/security/sha1/
TomTim
Posts: 72
Joined: Mon Aug 16, 2010 5:32 pm
Location: NRW, Germany

Post by TomTim »

loki1985 wrote:i would recommend http://www.cryptopp.com/
it is easy to use, has a nice set of algorithms, and a good free license which even allows taking parts of it as public domain.

alternatively, the WinAPI apparently has a crypto API. but i would not recommend it, it seems a bit complicated and obviously is not portable. more details here (a bit down the thread):
http://www.c-plusplus.de/forum/viewtopi ... 70790.html

if you just need exactly 1 algorithm (like MD5 or SHA1), you can find a lot of portable C implementations working on char buffers on the net. example:

the original MD5 implementation: http://people.csail.mit.edu/rivest/Md5.c

one SHA1 implementation: http://www.packetizer.com/security/sha1/
Actually, I just need one algorithm (MD5). I don't wanna save any passwords, I just want to cut off strings.
haffax
Posts: 13
Joined: Wed Jul 29, 2009 1:40 pm

Post by haffax »

What do you mean by "cut off strings"? Is this a hash for using strings as keys to a hash table, or in some other way having a shorter handle for a string? In that case MD5 is a bad choice, as are all cryptographic hashes. A 128-bit hash size is overkill and hash generation is slow.

Better use some fast hash algorithm like MurmurHash.

Btw: If your strings don't overlap in memory, then the pointer is a good handle too.
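
One way to make the pointer tip safe is string interning: keep exactly one stored copy of every distinct string and use the address of that copy as the handle. The sketch below is only an illustration under that assumption; `StringPool` and `intern` are made-up names, not part of Irrlicht or of any post in this thread.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical helper: stores each distinct string once and hands out a
// stable pointer to the stored copy.
class StringPool {
public:
    // Returns the same pointer for equal strings, so two handles can be
    // compared with == instead of strcmp.
    const std::string* intern(const std::string& text) {
        // insert() leaves the set unchanged if an equal string is already
        // stored; pointers to elements stay valid when the set rehashes.
        return &*pool_.insert(text).first;
    }

private:
    std::unordered_set<std::string> pool_;
};
```

Interning "player" twice through the same pool yields the same pointer, while "enemy" yields a different one. The handles are only meaningful inside one running process, so they cannot be written to a file the way a hash value can.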
fmx

Post by fmx »

haffax wrote:Btw: If your strings don't overlap in memory, then the pointer is a good handle too.
Thanks for the great tip, I never thought of using pointers as hashes before :D
loki1985
Posts: 214
Joined: Thu Mar 31, 2005 2:36 pm

Post by loki1985 »

fmx wrote:
haffax wrote:Btw: If your strings don't overlap in memory, then the pointer is a good handle too.
Thanks for the great tip, I never thought of using pointers as hashes before :D
doesn't sound very portable though.
TomTim
Posts: 72
Joined: Mon Aug 16, 2010 5:32 pm
Location: NRW, Germany

Post by TomTim »

haffax wrote:What do you mean by "cut off strings"? Is this a hash for using strings as keys to a hash table, or in some other way having a shorter handle for a string? In that case MD5 is a bad choice, as are all cryptographic hashes. A 128-bit hash size is overkill and hash generation is slow.

Better use some fast hash algorithm like MurmurHash.

Btw: If your strings don't overlap in memory, then the pointer is a good handle too.
This is what I wanna do:

1. Read a string from a file.
2. Create a hash value from the string.
3. Save that value in a variable (ID of my object).
4. Compare the string/hash with other strings/hashes.

It would be nice if I could store the ID in a DWORD or QWORD, but I am very confused by your post, so I'll just ask: with this information, what would you use?
haffax
Posts: 13
Joined: Wed Jul 29, 2009 1:40 pm

Post by haffax »

Well, it still depends on why you want to hash the strings in the first place. If it is for performance gains when you compare or find these strings, MurmurHash clearly wins. Using MD5 actually makes comparison slower for most use cases compared to just using strcmp.
MD5 uses 128-bit hashes, which fit in neither a DWORD nor a QWORD. MurmurHash uses a DWORD as hash size by default.

The problem is that collisions are possible: the hash algorithm may create the same hash value for two distinct strings. The chance of this happening depends on the number of strings you have. MurmurHash is pretty good at collision prevention, aside from a pretty narrow pathological case.
And since the algorithm is deterministic on any single platform, you can check for hash collisions in your code.
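
To get a feeling for how the collision chance grows with the number of strings, the usual birthday approximation can be used. This is a rough sketch that assumes the hash spreads its values uniformly; `collisionProbability` is a made-up helper name.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of `count` distinct strings share
// a hash value, assuming a uniformly distributed hash of `bits` bits:
//   p ~= 1 - exp(-count * (count - 1) / (2 * 2^bits))
double collisionProbability(double count, int bits) {
    const double buckets = std::ldexp(1.0, bits);  // 2^bits possible values
    const double exponent = count * (count - 1.0) / (2.0 * buckets);
    return -std::expm1(-exponent);                 // 1 - exp(-exponent)
}
```

With a 32-bit hash the estimate is on the order of 1% for 10,000 strings and reaches about 50% at roughly 77,000 strings. With a 64-bit value (a QWORD) it stays negligible for any realistic number of object names.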

So for generic gaming purposes MurmurHash wins by a great margin compared to cryptographic hashes like MD5.
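
A minimal sketch of the "hash once, compare IDs, check for collisions" idea. To keep it short it uses 32-bit FNV-1a as a stand-in rather than MurmurHash itself; for real use you would swap in the MurmurHash reference code. `hashString` and `IdRegistry` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// 32-bit FNV-1a, used here only as a compact stand-in for MurmurHash.
std::uint32_t hashString(const std::string& text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;             // FNV prime
    }
    return hash;
}

// Hypothetical registry: hands out hash IDs and remembers which string
// produced each ID, so a collision is noticed when the data is loaded.
class IdRegistry {
public:
    std::uint32_t idFor(const std::string& name) {
        const std::uint32_t id = hashString(name);
        // emplace() does nothing if the ID is already registered.
        const auto result = names_.emplace(id, name);
        if (!result.second && result.first->second != name) {
            throw std::runtime_error("hash collision: '" + name +
                                     "' vs '" + result.first->second + "'");
        }
        return id;
    }

private:
    std::map<std::uint32_t, std::string> names_;
};
```

The returned value fits in a DWORD and can be stored as the object's ID; after loading, objects are compared by ID alone. If a collision ever shows up, renaming one of the two strings is usually the simplest fix.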
TomTim
Posts: 72
Joined: Mon Aug 16, 2010 5:32 pm
Location: NRW, Germany

Post by TomTim »

haffax wrote:Well, it still depends on why you want to hash the strings in the first place. If it is for performance gains when you compare or find these strings, MurmurHash clearly wins. Using MD5 actually makes comparison slower for most use cases compared to just using strcmp.
MD5 uses 128-bit hashes, which fit in neither a DWORD nor a QWORD. MurmurHash uses a DWORD as hash size by default.

The problem is that collisions are possible: the hash algorithm may create the same hash value for two distinct strings. The chance of this happening depends on the number of strings you have. MurmurHash is pretty good at collision prevention, aside from a pretty narrow pathological case.
And since the algorithm is deterministic on any single platform, you can check for hash collisions in your code.

So for generic gaming purposes MurmurHash wins by a great margin compared to cryptographic hashes like MD5.
OK, after all I've read, I'll go with ... MurmurHash. After all, I just need to compare strings, not to save passwords or something like that. Thanks for the help and the great documentation. ;)
TomTim
Posts: 72
Joined: Mon Aug 16, 2010 5:32 pm
Location: NRW, Germany

Post by TomTim »

Sorry for double-posting, but I have one more question: can I get the original string back from the hash? I assume it's not possible, but I don't wanna take it for granted.
loki1985
Posts: 214
Joined: Thu Mar 31, 2005 2:36 pm

Post by loki1985 »

TomTim wrote:Sorry for double-posting, but I have one more question: can I get the original string back from the hash? I assume it's not possible, but I don't wanna take it for granted.
OK, looks like you need to learn something about hashes:
you NEVER can get your original value back.
TomTim
Posts: 72
Joined: Mon Aug 16, 2010 5:32 pm
Location: NRW, Germany

Post by TomTim »

loki1985 wrote:OK, looks like you need to learn something about hashes:
you NEVER can get your original value back.
Then my assumption was right, I just wanted it confirmed.
Thanks.
vitek
Bug Slayer
Posts: 3919
Joined: Mon Jan 16, 2006 10:52 am
Location: Corvallis, OR

Post by vitek »

loki1985 wrote:OK, looks like you need to learn something about hashes:
you NEVER can get your original value back.
If you have a perfect hash and a map from hash to value, it seems that you could do it quite easily. If the hash is actually a unique identifier for an object, you could also do it trivially.
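
A rough sketch of the "map from hash to value" idea: the string is not computed back from the hash, it is simply looked up in a table filled when the strings were hashed. FNV-1a is used as an arbitrary 32-bit hash for illustration, and `NameTable` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// 32-bit FNV-1a, picked only because it is short; any hash would do here.
std::uint32_t hashString(const std::string& text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;             // FNV prime
    }
    return hash;
}

// Hypothetical table that remembers the string behind every ID it created.
class NameTable {
public:
    std::uint32_t add(const std::string& name) {
        const std::uint32_t id = hashString(name);
        names_.emplace(id, name);  // keeps the first string seen for an ID
        return id;
    }

    // Returns nullptr for IDs that were never added to this table.
    const std::string* find(std::uint32_t id) const {
        const auto it = names_.find(id);
        return it == names_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::uint32_t, std::string> names_;
};
```

This only works for strings that were registered beforehand and only if no two of them collide; an ID the table has never seen cannot be turned back into a string.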

Travis
loki1985
Posts: 214
Joined: Thu Mar 31, 2005 2:36 pm

Post by loki1985 »

vitek wrote:If you have a perfect hash and a map from hash to value, it seems that you could do it quite easily. If the hash is actually a unique identifier for an object, you could also do it trivially.
i thought that was obvious. but when assuming real-world use cases of hashes, you do not have a map of the original values, so in that case it is not possible.

so, to state it more correctly: you cannot get arbitrary original strings back from the hashes alone. if that was possible, i would already have developed a tool decompressing gigabytes of original data from 128 bit hashes, and become filthy rich selling it :lol:
vitek
Bug Slayer
Posts: 3919
Joined: Mon Jan 16, 2006 10:52 am
Location: Corvallis, OR

Post by vitek »

Yes, hashing is not a lossless compression algorithm. I can agree to that.

Travis