[ubuntu-in] SCIM keymap: Unicode and TTF

Gora Mohanty gora at sarai.net
Thu Aug 23 06:14:24 BST 2007


On Wed, 2007-08-15 at 20:19 -0400, Dinbandhu wrote:
> I have a question regards Unicode and TTF. The superiority of Unicode is
> clear from both scientific and systems approach perspectives. 
> 
> My question stems from a perceived need to be able to communicate with
> persons around the world whose computers are not yet upgraded to be able
> to accommodate Unicode. I do not have any numbers to say how common it
> is, but the question I would ask is: in your experience how common is it
> for Hindi readers to be accessing computers running on for example Win
> 98 which can only recognize TTF fonts? 

Not so sure about Hindi, but in Orissa there were quite a few people
using not only Windows 98, but also Windows 95, to such an extent
that regional newspapers were unwilling to shift to Unicode web pages.
This will probably change within the next couple of years, in that
most people will be forced to shift to at least Windows XP. While XP
can be enabled to support Hindi and a few other Indian languages, it
does not work with many others.

> I do communicate regularly with many people in India via text documents
> in Hindi. Are there many people in India today who still only have
> access to TTF fonts? It is a matter which perhaps warrants addressing,
> because it is still a reality in today's world of communication.
[...]
 
> Is there any facility in the current Linux system for accommodating
> communication with non-Unicode users?

Given that old-style, 8-bit fonts for Indian languages are more
numerous and better-looking than their Unicode equivalents, I agree
that they need to be supported at least for the next few years. There
is also a wealth of existing content made using these fonts that needs
to be converted into Unicode. However, as most of these fonts are
proprietary, and many are not even available free of cost for
non-commercial use, I do have misgivings about encouraging people to
continue using them. Thus, convertibility to Unicode must be part and
parcel of any such scheme, to prevent lock-in to a particular
font/vendor.

As TTF fonts are well-supported under Linux, what is needed is a way
to allow text entry using these fonts, and a means of converting the
content to and from Unicode. Here is what I would propose be done:
(a) Build keymaps for various fonts: As the encoding of the fonts is
    non-standardised, this has to be done separately for each font.
    The saving grace is that many of these fonts fall into identical
    families, and general principles of making the conversion maps
    can be elucidated.
      Growing out of the work on the Unicode Baraha maps, I have
    almost completed a keymap for the Devanagari Shiva font, using the
    same Baraha layout. Shiva is an 8-bit, TTF font widely used by
    printing houses, and this was needed for Sarai publications. I
    will also make the maps for Inscript, ITRANS, Bolnagri, and
    Phonetic. Once these are done, I plan to write up a short report
    on how to prepare conversion tables for such fonts, so that other
    people can work on preparing tables.
(b) As mentioned earlier, the problem with the above is that it
    encourages content creation in old technology, using
    non-standardised encodings. Thus, I think that such maps should be
    released only along with converters to/from Unicode. I have long
    been planning to make a general-purpose library for Indian
    language character processing, which would also include such
    converters.
(c) Minor improvements to the rendering of such fonts might be made by
    adding hints for anti-aliasing, which I understand most Indian
    language fonts lack. However, I have no experience in this area.
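
To make the converter idea in (b) a little more concrete, here is a
rough Python sketch of a table-driven converter between an 8-bit font
encoding and Unicode. Please treat it as an illustration only: the byte
values in the table are invented and do NOT correspond to Shiva or any
real font, and the function names are hypothetical. It does show one
general principle that applies to many such fonts: the short-i matra is
stored before its consonant (visual order) in the font, while Unicode
stores it after the consonant (logical order), so the converter has to
reorder it. Conjuncts, reph, and the like are not handled here.

```python
# Hypothetical 8-bit font table. The byte values are made up for this
# example and do not reflect the encoding of any actual font.
LEGACY_TO_UNICODE = {
    0xC1: "\u0915",  # DEVANAGARI LETTER KA
    0xC2: "\u092E",  # DEVANAGARI LETTER MA
    0xC3: "\u0932",  # DEVANAGARI LETTER LA
    0xD1: "\u093E",  # DEVANAGARI VOWEL SIGN AA
    0xD2: "\u093F",  # DEVANAGARI VOWEL SIGN I (pre-base in the font)
}
UNICODE_TO_LEGACY = {u: b for b, u in LEGACY_TO_UNICODE.items()}

# Signs that the font stores before the consonant they belong to.
PREBASE = {"\u093F"}


def legacy_to_unicode(data: bytes) -> str:
    """Convert font-encoded bytes to Unicode, moving pre-base signs
    after the character that follows them (single consonants only)."""
    out = []
    pending = None  # pre-base sign waiting for its consonant
    for byte in data:
        ch = LEGACY_TO_UNICODE.get(byte)
        if ch is None:
            # Unmapped: keep ASCII as-is, flag anything else.
            ch = chr(byte) if byte < 0x80 else "\uFFFD"
        if ch in PREBASE:
            if pending is not None:
                out.append(pending)
            pending = ch
            continue
        out.append(ch)
        if pending is not None:
            out.append(pending)
            pending = None
    if pending is not None:
        out.append(pending)
    return "".join(out)


def unicode_to_legacy(text: str) -> bytes:
    """Convert Unicode text to font-encoded bytes, moving pre-base
    signs in front of the preceding character."""
    out = bytearray()
    for ch in text:
        if ch in PREBASE and out:
            out.insert(len(out) - 1, UNICODE_TO_LEGACY[ch])
        elif ch in UNICODE_TO_LEGACY:
            out.append(UNICODE_TO_LEGACY[ch])
        else:
            # Unmapped: keep ASCII as-is, substitute '?' otherwise.
            out.append(ord(ch) if ord(ch) < 0x80 else 0x3F)
    return bytes(out)


if __name__ == "__main__":
    font_bytes = bytes([0xD2, 0xC1])      # matra first, then KA
    text = legacy_to_unicode(font_bytes)  # KA followed by matra
    print([hex(ord(c)) for c in text])
    print(unicode_to_legacy(text) == font_bytes)
```

A real table would need one such mapping per font (or font family),
plus rules for half-forms and conjuncts, which is where most of the
work lies.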

Regards,
Gora




