17 Aug 2004

India needs to get local!

Mayank Sharma rightly wonders why a fast-growing IT power like India still sees such low acceptance of technology among non-IT people (see "Why India is struggling with localized language computing"). Although many languages are spoken and written in India, very few of them have made it onto the list of localized languages.


Despite several initiatives by organizations like C-DAC (Centre for Development of Advanced Computing) and FSF-India, there has been a gross lack of interest in localization projects. If you feel that there's something you can do about it, Mayank goes on to suggest a good place to start. Read more here.

As an exercise (see Step 1), I found out that the Unicode standard encodes several scripts used for Indian languages (namely Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada and Malayalam) and hey, most of them are Indian!
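If you want to repeat Step 1 without reading code charts, here is a minimal Python sketch that asks the standard `unicodedata` module for the official name of one sample letter per script. The `SAMPLES` table and the `unicode_names` helper are my own illustration, not something from Mayank's article; each sample is the letter KA of its script.

```python
import unicodedata

# One sample letter (KA) per script; the table itself is illustrative.
SAMPLES = {
    "Devanagari": "\u0915",
    "Bengali": "\u0995",
    "Gurmukhi": "\u0A15",
    "Gujarati": "\u0A95",
    "Oriya": "\u0B15",
    "Tamil": "\u0B95",
    "Telugu": "\u0C15",
    "Kannada": "\u0C95",
    "Malayalam": "\u0D15",
}

def unicode_names(samples):
    """Map each script to the Unicode name of its sample character."""
    return {script: unicodedata.name(ch) for script, ch in samples.items()}

for script, name in unicode_names(SAMPLES).items():
    print(f"{script:<11} {name}")
```

`unicodedata.name` raises `ValueError` for a code point that has no name, so a script missing from Unicode would show up as an error rather than a silent blank.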

For Step 2, their two-letter ISO 639 codes (conventionally written in lowercase) are Assamese (as), Bengali (bn), Bihari (bh), English (en), Gujarati (gu), Hindi (hi), Kannada (kn), Kashmiri (ks), Malayalam (ml), Marathi (mr), Nepali (ne), Oriya (or), Punjabi (pa), Sanskrit (sa), Sindhi (sd), Sinhalese (si), Tamil (ta), Telugu (te) and Urdu (ur).

Now, as the next step, we need to check whether locale data (language-specific info) exists for a particular language. The name format is languagecode_COUNTRYCODE, with the language code in lowercase and the country code in uppercase (the country code for India is IN), so, for example, the locale data file for Sanskrit would be sa_IN. According to the article, we need to search the GNU libc sources (glibc is the C library that programs on a Linux system use to talk to the Linux kernel, the heart of the operating system). I did just that and found locale data for Arabic, Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Punjabi, Tamil and Telugu (hey! we're missing Sanskrit and Urdu, two of the oldest ones!). As per the advice on the site, I did find work on others like Sanskrit and Urdu, though not in the form of glibc locale data files. In simpler terms, work is being done on these languages, but it is not centralized.
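The check above is easy to script. Here is a rough Python sketch, assuming you have unpacked the glibc sources and that the locale definition files sit in a `localedata/locales` directory inside them (the path used below is an assumption about where you unpacked them). glibc names these files with a lowercase language code and an uppercase country code, for example `hi_IN`. The helper names and the short `LANG_CODES` table are mine, for illustration only.

```python
from pathlib import Path

# A few languages from the list above; extend as needed.
LANG_CODES = {"Hindi": "hi", "Sanskrit": "sa", "Tamil": "ta", "Urdu": "ur"}

def locale_name(lang_code, country="IN"):
    """Build a locale name such as 'sa_IN' from a language and country code."""
    return f"{lang_code.lower()}_{country.upper()}"

def available_locales(locales_dir):
    """Return the file names in a locale directory (empty set if it is absent)."""
    path = Path(locales_dir)
    return {p.name for p in path.iterdir()} if path.is_dir() else set()

def missing_locales(available, lang_codes, country="IN"):
    """List the languages whose locale file is not among the available names."""
    return sorted(
        lang for lang, code in lang_codes.items()
        if locale_name(code, country) not in available
    )

# Assumed location of an unpacked glibc tree; adjust to your own checkout.
# If the directory does not exist, every language is reported as missing.
found = available_locales("glibc/localedata/locales")
print(missing_locales(found, LANG_CODES))
```

Keeping `missing_locales` separate from the directory listing means the same comparison works against any list of locale names, not just a source tree.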

Like Mayank says, I got through most of the stuff in a few minutes, but the major work seems to be translation. I never got around to finding out what a PO file is anyway. But the point was made: all it requires is a little effort from all of us to help translate what stuff means in our mother tongue (that's not asking too much, is it?).
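For the curious: a PO file is the plain-text translation catalogue used by GNU gettext. Each entry pairs an original string (`msgid`) with its translation (`msgstr`), and an empty `msgstr` means the string still needs translating. The sketch below is only a rough illustration: the sample catalogue and its Hindi entry are made up, and the regular expression handles only simple single-line entries, not the full PO format.

```python
import re

# A made-up, minimal catalogue; real PO files also carry a header and comments.
SAMPLE_PO = '''\
msgid "File"
msgstr "फ़ाइल"

msgid "Open"
msgstr ""
'''

# Matches a single-line msgid immediately followed by a single-line msgstr.
ENTRY = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)

def untranslated(po_text):
    """Return the msgid strings whose msgstr is still empty."""
    return [msgid for msgid, msgstr in ENTRY.findall(po_text)
            if msgid and not msgstr]

print(untranslated(SAMPLE_PO))
```

Translating, then, mostly means filling in those empty `msgstr` lines one string at a time.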