fact that we are not using much text, so i thought it might be of some general interest how to do that stuff with Irrlicht.
All sources posted by me in this thread are public domain and maybe parts of it will find their way back in Irrlicht.
Oh - and i18n stands for "Internationalization".
What i needed:
- Font output using the truetype font class from here: http://irrlicht.sourceforge.net/phpBB2/ ... highlight=
- Support for cyrillic fonts in the editbox (so i needed the correct event.KeyInput.Char)
- Support for keyboard keys for playing (so i needed a usable event.KeyInput.Key) with the correct names
- Has to work on linux and windows 98 to windows vista
Some preparations i had already done in advance, which are always a good idea in any application:
- All texts which are used for display were put in a stringtable (there's a simple stringtableclass here, but it does not work yet with i18n for reasons described below: http://www.michaelzeilfelder.de/irrlicht.htm)
- All those texts are internally using widechar strings (wchar_t, irr::stringw, std::wstring)
Also some general information you will need to understand when programming with Unicode and widechars:
The c++ type wchar_t has 2 bytes on Windows and 4 bytes in Linux. It is therefore typically used for encoding the Unicode formats UTF-16/UCS-2 on Windows and UTF-32/UCS-4 on Linux.
UTF-32/UCS-4 are identical, but there's a difference between UTF-16 and UCS-2, as UTF-16 can use several 16bit numbers in a row to encode characters which won't fit in 16bit.
This does not matter much for cyrillic, which can be represented by UCS-2 and has identical codes in UTF-16 and UTF-32/UCS-4. You might have to care about that when using characters outside the 16bit range, for example some rarely used Chinese and Japanese characters. (Fantasy scripts like Klingon are not part of official Unicode at all; they only live in the private use area by convention.)
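To make the UCS-2 vs. UTF-16 difference concrete, here is a minimal sketch of how a single code point is turned into 16bit units. It is not taken from any of the sources mentioned in this thread, and the function name encodeUtf16 is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one Unicode code point as UTF-16 code units.
// Code points up to U+FFFF need one unit (the same value as in UCS-2),
// everything above is split into a surrogate pair.
std::vector<std::uint16_t> encodeUtf16(std::uint32_t codePoint)
{
    assert(codePoint <= 0x10FFFF);                     // end of the Unicode range
    assert(codePoint < 0xD800 || codePoint > 0xDFFF);  // surrogate values are not characters

    std::vector<std::uint16_t> units;
    if (codePoint < 0x10000)
    {
        units.push_back(static_cast<std::uint16_t>(codePoint));
    }
    else
    {
        const std::uint32_t v = codePoint - 0x10000;   // 20 bits remain
        units.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));   // high surrogate
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return units;
}
```

A cyrillic letter like U+0416 comes out as one unit with the same value, so UCS-2 and UTF-16 agree for it. Only code points above U+FFFF produce two units.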
Ok, so i'm happy with wchar_t's using UCS-2 for text output. But now i stumbled upon my first problem - my stringtable class is using tinyXML and that is using yet another Unicode format called UTF-8. UTF-8 looks like ASCII for plain English text, as it's only using a single byte for each ASCII character. But UTF-8 can be used to represent any unicode char and it does that by having a way to use several bytes in a row to represent a single character. Cyrillic chars for example (and also the German umlauts) won't fit in a single byte, so UTF-8 will use two bytes for each of those chars.
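The byte counts described above can be seen in a small encoder. This is only an illustrative sketch (encodeUtf8 is a hypothetical helper, not part of tinyXML or Irrlicht):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode one Unicode code point as a UTF-8 byte sequence.
std::string encodeUtf8(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);  // end of the Unicode range

    std::string out;
    if (cp < 0x80)
    {
        // ASCII: one byte, identical to the ASCII code
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        // two bytes: covers cyrillic, greek, umlauts, ...
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        // three bytes: the rest of the 16bit range
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        // four bytes: everything above U+FFFF
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example the cyrillic letter U+0416 becomes the two bytes 0xD0 0x96, while 'A' stays the single byte 0x41.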
For this conversion from UTF-8 to UTF-16 i used the function FromUtf8 from this site: http://www.codeproject.com/useritems/UtfConverter.asp
Edit (2007-04-20): I'm not 100% sure if FromUtf8 and ToUtf8 will work in all cases, and i have now found one problem with those functions. The resulting strings seemed ok, but the == operator of the string class failed when i assigned the resulting strings to other strings and compared those. To fix that i no longer return the resultstring directly in those functions, but do it like this now:
Code: Select all
return std::wstring( resultstring.c_str() );
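For readers who don't want to pull in the CodeProject code, the direction from UTF-8 to widechars can be sketched roughly as follows. This is not the FromUtf8 function mentioned above, just a simplified stand-in with a made-up name (utf8ToWide). It assumes only characters up to U+FFFF are needed (which is enough for cyrillic), replaces everything else, including malformed input, with U+FFFD, and does not reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Decode a UTF-8 string into a wide string (16bit range only).
std::wstring utf8ToWide(const std::string& utf8)
{
    const wchar_t replacement = 0xFFFD;
    std::wstring result;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else
        {
            // stray continuation byte or a 4-byte sequence: not handled here
            result += replacement;
            ++i;
            continue;
        }

        // all continuation bytes must exist and look like 10xxxxxx
        bool ok = (i + extra < utf8.size());
        for (std::size_t k = 1; ok && k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                ok = false;
            else
                cp = (cp << 6) | (c & 0x3F);
        }

        if (ok)
        {
            result += static_cast<wchar_t>(cp);
            i += extra + 1;
        }
        else
        {
            result += replacement;
            ++i;
        }
    }
    return result;
}
```

The result is built character by character, so its length never includes stray terminators and comparing it with == behaves as expected.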
And wow - that was already all i needed to print some nice cyrillic characters. Btw, while those transformations caused some work for me, it is actually a way often recommended when doing i18n stuff to use UTF-8 for files and UCS-2 or UCS-4 within your application. This way your application will be fast, but you can still use all tools which are working with ASCII-text files.
Now to the keyboard input. Let's do Linux first.
The keyevents needed to be fixed for two cases:
a) KeyInput.Key should have an EKEY_CODE for russian keyboards
b) KeyInput.Char should have the correct Unicode id for cyrillic chars
For a) i found only a rather ugly solution for Linux (there's maybe a better one, but i gave up on that).
It will work for most keys, by returning the corresponding english key, which is usually also printed onto russian keyboards.
I just enhanced the keymap in CIrrDeviceLinux.cpp in createKeyMap by that:
Code: Select all
#ifdef XK_CYRILLIC
KeyMap.push_back(SKeyMap(XK_Cyrillic_shorti, KEY_KEY_Q));
KeyMap.push_back(SKeyMap(XK_Cyrillic_SHORTI, KEY_KEY_Q));
KeyMap.push_back(SKeyMap(XK_Cyrillic_tse, KEY_KEY_W));
KeyMap.push_back(SKeyMap(XK_Cyrillic_TSE, KEY_KEY_W));
KeyMap.push_back(SKeyMap(XK_Cyrillic_u, KEY_KEY_E));
KeyMap.push_back(SKeyMap(XK_Cyrillic_U, KEY_KEY_E));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ka, KEY_KEY_R));
KeyMap.push_back(SKeyMap(XK_Cyrillic_KA, KEY_KEY_R));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ie, KEY_KEY_T));
KeyMap.push_back(SKeyMap(XK_Cyrillic_IE, KEY_KEY_T));
KeyMap.push_back(SKeyMap(XK_Cyrillic_en, KEY_KEY_Y));
KeyMap.push_back(SKeyMap(XK_Cyrillic_EN, KEY_KEY_Y));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ghe, KEY_KEY_U));
KeyMap.push_back(SKeyMap(XK_Cyrillic_GHE, KEY_KEY_U));
KeyMap.push_back(SKeyMap(XK_Cyrillic_sha, KEY_KEY_I));
KeyMap.push_back(SKeyMap(XK_Cyrillic_SHA, KEY_KEY_I));
KeyMap.push_back(SKeyMap(XK_Cyrillic_shcha, KEY_KEY_O));
KeyMap.push_back(SKeyMap(XK_Cyrillic_SHCHA, KEY_KEY_O));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ze, KEY_KEY_P));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ZE, KEY_KEY_P));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ha, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_HA, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_hardsign, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_HARDSIGN, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ef, KEY_KEY_A));
KeyMap.push_back(SKeyMap(XK_Cyrillic_EF, KEY_KEY_A));
KeyMap.push_back(SKeyMap(XK_Cyrillic_yeru, KEY_KEY_S));
KeyMap.push_back(SKeyMap(XK_Cyrillic_YERU, KEY_KEY_S));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ve, KEY_KEY_D));
KeyMap.push_back(SKeyMap(XK_Cyrillic_VE, KEY_KEY_D));
KeyMap.push_back(SKeyMap(XK_Cyrillic_a, KEY_KEY_F));
KeyMap.push_back(SKeyMap(XK_Cyrillic_A, KEY_KEY_F));
KeyMap.push_back(SKeyMap(XK_Cyrillic_pe, KEY_KEY_G));
KeyMap.push_back(SKeyMap(XK_Cyrillic_PE, KEY_KEY_G));
KeyMap.push_back(SKeyMap(XK_Cyrillic_er, KEY_KEY_H));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ER, KEY_KEY_H));
KeyMap.push_back(SKeyMap(XK_Cyrillic_o, KEY_KEY_J));
KeyMap.push_back(SKeyMap(XK_Cyrillic_O, KEY_KEY_J));
KeyMap.push_back(SKeyMap(XK_Cyrillic_el, KEY_KEY_K));
KeyMap.push_back(SKeyMap(XK_Cyrillic_EL, KEY_KEY_K));
KeyMap.push_back(SKeyMap(XK_Cyrillic_de, KEY_KEY_L));
KeyMap.push_back(SKeyMap(XK_Cyrillic_DE, KEY_KEY_L));
KeyMap.push_back(SKeyMap(XK_Cyrillic_zhe, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ZHE, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_e, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_E, 0));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ya, KEY_KEY_Z));
KeyMap.push_back(SKeyMap(XK_Cyrillic_YA, KEY_KEY_Z));
KeyMap.push_back(SKeyMap(XK_Cyrillic_che, KEY_KEY_X));
KeyMap.push_back(SKeyMap(XK_Cyrillic_CHE, KEY_KEY_X));
KeyMap.push_back(SKeyMap(XK_Cyrillic_es, KEY_KEY_C));
KeyMap.push_back(SKeyMap(XK_Cyrillic_ES, KEY_KEY_C));
KeyMap.push_back(SKeyMap(XK_Cyrillic_em, KEY_KEY_V));
KeyMap.push_back(SKeyMap(XK_Cyrillic_EM, KEY_KEY_V));
KeyMap.push_back(SKeyMap(XK_Cyrillic_i, KEY_KEY_B));
KeyMap.push_back(SKeyMap(XK_Cyrillic_I, KEY_KEY_B));
KeyMap.push_back(SKeyMap(XK_Cyrillic_te, KEY_KEY_N));
KeyMap.push_back(SKeyMap(XK_Cyrillic_TE, KEY_KEY_N));
KeyMap.push_back(SKeyMap(XK_Cyrillic_softsign, KEY_KEY_M));
KeyMap.push_back(SKeyMap(XK_Cyrillic_SOFTSIGN, KEY_KEY_M));
#endif // #ifdef XK_CYRILLIC
For b) i used the conversion from X11 keysyms to Unicode from these files:
http://www.cl.cam.ac.uk/~mgk25/ucs/keysym2ucs.h
http://www.cl.cam.ac.uk/~mgk25/ucs/keysym2ucs.c
I added those files to the engine and had to change some lines in bool CIrrDeviceLinux::run() in the cases of KeyPress and KeyRelease:
Code: Select all
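// old code solution
//mbtowc(&irrevent.KeyInput.Char, buf, 4);
// new solution: convert the X11 keysym directly to a Unicode code point
long int ucsCode = keysym2ucs(mp.X11Key);
if (ucsCode == -1) // keysym2ucs returns -1 if there is no matching character
    ucsCode = 0;
// plain assignment instead of memcpy, which would copy the wrong bytes of the long on big-endian systems
irrevent.KeyInput.Char = (wchar_t)ucsCode;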
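To give an idea what such a keysym conversion does, here is a heavily simplified sketch. It is not the code from keysym2ucs.c: the function names are made up, and the table holds only three example entries whose keysym values are written down from memory of X11's keysymdef.h, so check them before relying on them. The real table is much larger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the Unicode code point for an X11 keysym, or -1 if there is none.
long keysymToUcs(std::uint32_t keysym)
{
    // Latin-1 keysyms are identical to their Unicode code points
    if ((keysym >= 0x0020 && keysym <= 0x007e) ||
        (keysym >= 0x00a0 && keysym <= 0x00ff))
        return static_cast<long>(keysym);

    // keysyms of the form 0x01xxxxxx carry the code point directly
    if ((keysym & 0xff000000u) == 0x01000000u)
    {
        const std::uint32_t cp = keysym & 0x00ffffffu;
        return cp <= 0x10FFFF ? static_cast<long>(cp) : -1;
    }

    // everything else needs a lookup table (tiny example excerpt, assumed values)
    struct Entry { std::uint32_t keysym; long ucs; };
    static const Entry table[] = {
        { 0x06c1, 0x0430 }, // XK_Cyrillic_a
        { 0x06c2, 0x0431 }, // XK_Cyrillic_be
        { 0x06e1, 0x0410 }, // XK_Cyrillic_A
    };
    for (std::size_t i = 0; i < sizeof(table) / sizeof(table[0]); ++i)
    {
        if (table[i].keysym == keysym)
            return table[i].ucs;
    }
    return -1; // function keys, cursor keys, etc. have no character
}

// Same handling as in the event code: no character means Char == 0.
wchar_t keysymToEventChar(std::uint32_t keysym)
{
    const long ucs = keysymToUcs(keysym);
    assert(ucs >= -1 && ucs <= 0x10FFFF);
    return static_cast<wchar_t>(ucs < 0 ? 0 : ucs);
}
```

Keys without a character (Return, cursor keys and so on) end up with Char set to 0, which matches what the changed event code does.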
Ok, same stuff for Windows. I should mention in advance that i had some more restrictions for windows which complicated things somewhat which you might not have. First of all i need to support Windows 98. While this is often no problem it is a big problem when it comes to Unicode because Windows 95/98 and ME had basically no support for that. Then it has to compile with MinGW. And last it must work legally with a commercially distributed game.
Now Microsoft has been offering the "Microsoft Layer for Unicode on Windows 95/98/ME Systems" for a few years. Which is nice. They even allow you to get and distribute it for free, which is even nicer. Well, really great would have been if they would also release such code, which basically fixes stuff that sucks in 95/98/ME, as open source with a license which can just be used. They are not that nice. Actually they have the usual EULA-stuff which for example only allows you to use that layer when you put such a restrictive EULA in front of your own game. And it's certainly the usual closed source lib which is harder to get working with other environments like MinGW. I'm not even sure if it would be legal to use there.
There are two open libraries which try to help you there.
libunicows allows you using the MS Unicode layer with other compilers, like MinGW: http://libunicows.sourceforge.net/
opencow is a free replacement for the MS Unicode layer, but seems not yet to be complete: http://opencow.sourceforge.net/
Adding new libs is always some trouble, and i don't know if those libs will work and i didn't know if the Mozilla license of opencow would have been compatible with my project. Still those libs might be fine and useful for you in case you're working on similar stuff.
But in the end i found another way to do it, which will need another short introduction to a few things:
ANSI codepages: Windows is using separate codepages for different groups of languages. A codepage is basically a table of character codes in the multibyte format (yet another format - it's not yet Unicode).
HKL (i think it stands for: handle keyboard layout): That is telling you the language id for the language to which your keyboard is currently set. You can change that with the language symbols in the tray if you have installed several languages in windows.
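The language id sits in the low 16 bits of the HKL value, and it is itself composed of a primary language (low 10 bits) and a sublanguage (upper 6 bits). The following sketch shows that layout with portable stand-ins for the Windows macros LOWORD, PRIMARYLANGID, SUBLANGID and MAKELANGID. The function names are made up so that the code compiles without windows.h.

```cpp
#include <cassert>
#include <cstdint>

// low 16 bits of a handle value, like LOWORD()
std::uint16_t lowWord(std::uintptr_t value)
{
    return static_cast<std::uint16_t>(value & 0xFFFF);
}

// low 10 bits of the language id, like PRIMARYLANGID()
std::uint16_t primaryLangId(std::uint16_t langId)
{
    return static_cast<std::uint16_t>(langId & 0x3FF);
}

// upper 6 bits of the language id, like SUBLANGID()
std::uint16_t subLangId(std::uint16_t langId)
{
    return static_cast<std::uint16_t>(langId >> 10);
}

// combine both parts again, like MAKELANGID()
std::uint16_t makeLangId(std::uint16_t primary, std::uint16_t sub)
{
    assert(primary <= 0x3FF && sub <= 0x3F);
    return static_cast<std::uint16_t>((sub << 10) | primary);
}
```

For a Russian layout the HKL value is typically 0x04190419, so the language id is 0x0419 = 1049 (primary language 0x19, sublanguage 1). The primary language alone is not enough to pick a codepage: Serbian (Latin) 2074 and Serbian (Cyrillic) 3098 share the same primary language but need different codepages, so a lookup has to use the full id.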
I haven't found any function to get the ANSI codepage from the HKL or the language id's, but i found a table for it on
http://www.science.co.il/Language/Local ... ?s=decimal
So i wrote that function myself:
Code: Select all
unsigned int LangIdToCodepage(unsigned int langId_)
{
switch ( langId_ )
{
case 1098: // Telugu
case 1095: // Gujarati
case 1094: // Punjabi
case 1103: // Sanskrit
case 1111: // Konkani
case 1114: // Syriac
case 1099: // Kannada
case 1102: // Marathi
case 1125: // Divehi
case 1067: // Armenian
case 1081: // Hindi
case 1079: // Georgian
case 1097: // Tamil
return 0;
case 1054: // Thai
return 874;
case 1041: // Japanese
return 932;
case 2052: // Chinese (PRC)
case 4100: // Chinese (Singapore)
return 936;
case 1042: // Korean
return 949;
case 5124: // Chinese (Macau S.A.R.)
case 3076: // Chinese (Hong Kong S.A.R.)
case 1028: // Chinese (Taiwan)
return 950;
case 1048: // Romanian
case 1060: // Slovenian
case 1038: // Hungarian
case 1051: // Slovak
case 1045: // Polish
case 1052: // Albanian
case 2074: // Serbian (Latin)
case 1050: // Croatian
case 1029: // Czech
return 1250;
case 1104: // Mongolian (Cyrillic)
case 1071: // FYRO Macedonian
case 2115: // Uzbek (Cyrillic)
case 1058: // Ukrainian
case 2092: // Azeri (Cyrillic)
case 1092: // Tatar
case 1087: // Kazakh
case 1059: // Belarusian
case 1088: // Kyrgyz (Cyrillic)
case 1026: // Bulgarian
case 3098: // Serbian (Cyrillic)
case 1049: // Russian
return 1251;
case 8201: // English (Jamaica)
case 3084: // French (Canada)
case 1036: // French (France)
case 5132: // French (Luxembourg)
case 5129: // English (New Zealand)
case 6153: // English (Ireland)
case 1043: // Dutch (Netherlands)
case 9225: // English (Caribbean)
case 4108: // French (Switzerland)
case 4105: // English (Canada)
case 1110: // Galician
case 10249: // English (Belize)
case 3079: // German (Austria)
case 6156: // French (Monaco)
case 12297: // English (Zimbabwe)
case 1069: // Basque
case 2067: // Dutch (Belgium)
case 2060: // French (Belgium)
case 1035: // Finnish
case 1080: // Faroese
case 1031: // German (Germany)
case 3081: // English (Australia)
case 1033: // English (United States)
case 2057: // English (United Kingdom)
case 1027: // Catalan
case 11273: // English (Trinidad)
case 7177: // English (South Africa)
case 1030: // Danish
case 13321: // English (Philippines)
case 15370: // Spanish (Paraguay)
case 9226: // Spanish (Colombia)
case 5130: // Spanish (Costa Rica)
case 7178: // Spanish (Dominican Republic)
case 12298: // Spanish (Ecuador)
case 17418: // Spanish (El Salvador)
case 4106: // Spanish (Guatemala)
case 18442: // Spanish (Honduras)
case 3082: // Spanish (International Sort)
case 13322: // Spanish (Chile)
case 19466: // Spanish (Nicaragua)
case 2058: // Spanish (Mexico)
case 10250: // Spanish (Peru)
case 20490: // Spanish (Puerto Rico)
case 1034: // Spanish (Traditional Sort)
case 14346: // Spanish (Uruguay)
case 8202: // Spanish (Venezuela)
case 1089: // Swahili
case 1053: // Swedish
case 2077: // Swedish (Finland)
case 5127: // German (Liechtenstein)
case 1078: // Afrikaans
case 6154: // Spanish (Panama)
case 4103: // German (Luxembourg)
case 16394: // Spanish (Bolivia)
case 2055: // German (Switzerland)
case 1039: // Icelandic
case 1057: // Indonesian
case 1040: // Italian (Italy)
case 2064: // Italian (Switzerland)
case 2068: // Norwegian (Nynorsk)
case 11274: // Spanish (Argentina)
case 1046: // Portuguese (Brazil)
case 1044: // Norwegian (Bokmal)
case 1086: // Malay (Malaysia)
case 2110: // Malay (Brunei Darussalam)
case 2070: // Portuguese (Portugal)
return 1252;
case 1032: // Greek
return 1253;
case 1091: // Uzbek (Latin)
case 1068: // Azeri (Latin)
case 1055: // Turkish
return 1254;
case 1037: // Hebrew
return 1255;
case 5121: // Arabic (Algeria)
case 15361: // Arabic (Bahrain)
case 9217: // Arabic (Yemen)
case 3073: // Arabic (Egypt)
case 2049: // Arabic (Iraq)
case 11265: // Arabic (Jordan)
case 13313: // Arabic (Kuwait)
case 12289: // Arabic (Lebanon)
case 4097: // Arabic (Libya)
case 6145: // Arabic (Morocco)
case 8193: // Arabic (Oman)
case 16385: // Arabic (Qatar)
case 1025: // Arabic (Saudi Arabia)
case 10241: // Arabic (Syria)
case 14337: // Arabic (U.A.E.)
case 1065: // Farsi
case 1056: // Urdu
case 7169: // Arabic (Tunisia)
return 1256;
case 1061: // Estonian
case 1062: // Latvian
case 1063: // Lithuanian
return 1257;
case 1066: // Vietnamese
return 1258;
}
return 65001; // utf-8
}
Code: Select all
static HKL KEYBOARD_INPUT_HKL=0;
static unsigned int KEYBOARD_INPUT_CODEPAGE = 1252;
Code: Select all
// get the codepage used for keyboard input
KEYBOARD_INPUT_HKL = GetKeyboardLayout(0);
KEYBOARD_INPUT_CODEPAGE = LangIdToCodepage( LOWORD(KEYBOARD_INPUT_HKL) );
Code: Select all
case WM_INPUTLANGCHANGE:
    // get the new codepage used for keyboard input
    KEYBOARD_INPUT_HKL = GetKeyboardLayout(0);
    KEYBOARD_INPUT_CODEPAGE = LangIdToCodepage( LOWORD(KEYBOARD_INPUT_HKL) );
    return 0;
Code: Select all
case WM_KEYDOWN:
{
    event.EventType = irr::EET_KEY_INPUT_EVENT;
    event.KeyInput.Key = (irr::EKEY_CODE)wParam;
    event.KeyInput.PressedDown = true;
    dev = getDeviceFromHWnd(hWnd);

    BYTE allKeys[256];
    WORD KeyAsc=0;
    GetKeyboardState(allKeys);
    // translate the key into a character of the current keyboard layout's codepage
    ToAsciiEx(wParam,lParam,allKeys,&KeyAsc,0,KEYBOARD_INPUT_HKL); // ToAscii wouldn't work for unicode on newer window systems

    event.KeyInput.Shift = ((allKeys[VK_SHIFT] & 0x80)!=0);
    event.KeyInput.Control = ((allKeys[VK_CONTROL] & 0x80)!=0);

    // convert the codepage character to unicode
    WORD unicodeChar = 0; // stays 0 if the conversion writes nothing
    MultiByteToWideChar(
        KEYBOARD_INPUT_CODEPAGE,
        MB_PRECOMPOSED, // default
        (LPCSTR)&KeyAsc,
        sizeof(KeyAsc),
        (WCHAR*)&unicodeChar,
        1 );
    event.KeyInput.Char = unicodeChar;

    if (dev)
        dev->postEventFromUser(event);

    return 0;
}
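What MultiByteToWideChar does for a single-byte codepage is basically a table lookup. For the Russian letters in codepage 1251 the table is regular enough to write it down as a formula, which the following sketch does. It is only an illustration with a made-up function name: it covers ASCII and the Russian alphabet and returns 0 for all other entries of the codepage, which would need the full table.

```cpp
#include <cassert>

// Convert one Windows-1251 byte to its Unicode code point.
// Returns 0 for codepage entries this sketch does not cover.
wchar_t cp1251ToUnicode(unsigned int value)
{
    assert(value <= 0xFF);  // cp1251 is a single-byte codepage

    if (value < 0x80)
        return static_cast<wchar_t>(value);                    // ASCII range is shared
    if (value >= 0xC0)
        return static_cast<wchar_t>(0x0410 + (value - 0xC0));  // 0xC0..0xFF are U+0410..U+044F
    if (value == 0xA8)
        return 0x0401;                                         // capital io
    if (value == 0xB8)
        return 0x0451;                                         // small io
    return 0;
}
```

So a key that ToAsciiEx reports as 0xE6 on a Russian layout ends up as U+0436 in KeyInput.Char.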
Edit (2007-04-23): I found another i18n problem which can happen when saving files. Cyrillic filenames are not supported by all systems (i don't even know yet which work and which don't). Just converting the strings to UTF-8 and using them as filenames can even result in files which can no longer be renamed or deleted, for example in Windows 98 (at least outside the dos-box). So far i have no solution for that, except to use English filenames if you can.
Edit (2007-05-30): Added lester's fix for spaces appearing when pressing shift, ctrl, esc.