[i18n] Input Method and Fonts improvements for Gutsy

Michael Vogt michael.vogt at ubuntu.com
Tue Aug 7 19:31:05 BST 2007


On Tue, Aug 07, 2007 at 01:15:23AM +0800, Arne Goetje wrote:
> Dear all,
Hey Arne!
 
> I have taken a first look at the current font and input method situation
> in Gutsy (Tribe 3 Live CD and up to date installation on HDD) and have a
> few suggestions to make.

Thanks for looking into this!

> 1. Input Method (SCIM):
> Both Live CD and default installation come with the SCIM package
> installed, however it is not properly set up, so that the user actually
> cannot use it.
>
> SCIM depends on some environment variables and the SCIM daemon started in
> the background. There is a nice tool, called im-switch, which takes care
> of this.
> The purpose of im-switch is to give the user a simple frontend to choose
> which input method program (s)he wants to use. For most users with
> non-Latin based alphabets, this should be SCIM, as it clearly supports
> most languages and scripts. However, some Asian users might prefer a
> different application, like IIIMF or gcin (especially in Taiwan).
> im-switch will take a parameter, whether or not it should do the setting
> system wide or in the user scope only, and the name of the input method
> framework.
[..]

Input methods are something that should work out of the box on a
freshly installed system with CJK languages. We have default
scim/im-switch configs for the CJK languages. I just tested this by
enabling Chinese with language-selector and I got a working scim setup
(it let me input something that looks Chinese, I cannot really read
it :). I have not tested this with a current Gutsy Tribe CD, maybe
there is a bug there?

For non-CJK languages we do not enable scim, but people who want
complex text input should use the language-selector
(System/Administration/Language Support). It has a checkbox "Enable
support to enter complex characters".

This is done via a dependency of language-support-$lang on im-switch
and the matching scim plugin/table packages. Those have an im-switch
configuration that will enable scim for the matching locale.

It has been a while since I looked into the default scim configurations
in the various packages (e.g. scim-pinyin). It would be really good if
someone went through them to see what we still need, what can be merged
with Debian and what is no longer needed.
 
> 2. SCIM modules:
> The default installed scim module packages are:
>  * scim-modules-table
>  * scim-tables-additional (Russian and Indic IMs)
> I highly recommend, that we put the following packages and their
> dependencies into the Live CD and the default installation to make it
> become more useful:
>  * scim-anthy or scim-prime: Japanese input methods, scim-prime is a
> dictionary based IM, which has a great advantage over anthy. Although
> both are widely used in Japan.
>  * scim-chewing: Traditional Chinese phonetic IM, widely used in Taiwan
>  * scim-pinyin: Simplified and Traditional Chinese Pinyin IM, widely
> used in China and by foreigners in Taiwan. ;)
>  * scim-hangul: As the name says it - Korean.
>  * scim-tables-zh: additional table based IMs for Simplified and
> Traditional Chinese, many of them are popular in China, Hong Kong and
> Taiwan.
>  * scim-thai: well, Thai. :)
>  * scim-m17n: bridge to the m17n library, which adds a lot of additional
>  IMs, including Latin based ones for the European languages with
> diacritics. (not everyone likes to fiddle with XKB settings. ;) )
[..]

I think one problem we face here is that CD space is pretty
tight. On an installed system those should be dependencies of
language-support-$lang. A review for the major languages would be good
to check if we still have the best options for our users as defaults.

> 3. Fonts:
>  a) language selector:
> The idea with the language selector handling the fontconfig
> configuration is nice, however, it needs some tweaking:
>  * more languages: I will add more config files for more locales; needs
> some testing and probably some community feedback.
>  * Question: how to handle those config files which come with the font
> packages? Font preference handling should be done by language selector,
> while font specific options can remain in the config files installed by
> the font packages? If that's the case, we need to check all the font
> packages and tweak those where it's not the case.

I think just putting the default order in language-selector and moving
the bits that are font specific into the font package sounds like a
good plan. I would also be interested to know if fontconfig upstream
has any plans to improve the situation here (I asked about this a while
ago with no real result).
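
As an illustration of the "default order" part only, here is a minimal
Python sketch that renders a fontconfig alias with a <prefer> list. It
is not the code language-selector uses; the function names and the font
family names are made up for the example, and only the
<alias>/<family>/<prefer> elements are standard fontconfig syntax.

```python
import xml.etree.ElementTree as ET


def prefer_alias(generic_family, preferred_fonts):
    """Build an <alias> element listing fonts in order of preference."""
    alias = ET.Element("alias")
    ET.SubElement(alias, "family").text = generic_family
    prefer = ET.SubElement(alias, "prefer")
    for name in preferred_fonts:
        # Earlier entries are preferred over later ones.
        ET.SubElement(prefer, "family").text = name
    return alias


def render_fontconfig(aliases):
    """Wrap alias elements in a <fontconfig> document and return it as text."""
    root = ET.Element("fontconfig")
    root.extend(aliases)
    header = '<?xml version="1.0"?>\n<!DOCTYPE fontconfig SYSTEM "fonts.dtd">\n'
    return header + ET.tostring(root, encoding="unicode") + "\n"


# Hypothetical font names, for illustration only.
alias = prefer_alias("sans-serif", ["Example Sans TC", "Example Sans SC"])
print(render_fontconfig([alias]))
```

Font-specific options (hinting and the like) would stay out of such a
file and live in the config shipped by each font package.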

[..]
> 4. Improvements for Gutsy+1
> I expect that we don't have enough time to implement these improvements
> into Gutsy, therefore we should probably postpone them for the next release:
>  a) Language selector:
> It would be useful, if the user could have an Advanced button in the
> language selector, where (s)he can adjust his/her preferred fonts and
> translation order. Just like you have a list of available fonts and you
> move them up or down according to your own preference. And the same
> should be possible for translations:
> 
> There are users who live in a foreign country and whose language ability
> is not good enough to use that country's locale settings, but use their
> native language instead. However, they need to use their host country's
> writing system.
[..]

That sounds like a good improvement, let's add a spec for this for
gutsy+1.

> There are also users who depend on translations, but sometimes meet the
> situation, that a translation is not available in their native language.
> The default fallback is English. But maybe that user is not very good in
> understanding English and prefers a different fallback language, or set
> of languages: For example, a Taiwanese user who uses Traditional
> Chinese, might prefer Simplified Chinese and then Japanese as fallback
> and not English.
[..]

There is some code in language-selector to write a LANGUAGE
environment variable like this. I guess that could be extended to make
it more useful and to work with a future "advanced" tab.
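
For reference, GNU gettext reads LANGUAGE as a colon-separated list of
languages in priority order. A minimal Python sketch of building such a
value from an ordered preference list could look like the following.
The function name is hypothetical and this is not the existing
language-selector code.

```python
def build_language_value(preferences):
    """Join locale codes into a colon-separated LANGUAGE value.

    Order is kept, surrounding whitespace is stripped, and empty or
    repeated entries are dropped.
    """
    ordered = []
    for code in preferences:
        code = code.strip()
        if code and code not in ordered:
            ordered.append(code)
    return ":".join(ordered)


# Example from the thread: Traditional Chinese first, then Simplified
# Chinese, then Japanese, instead of falling straight back to English.
value = build_language_value(["zh_TW", "zh_CN", "ja"])
print("LANGUAGE=" + value)
```

An "advanced" tab would then only need to let the user reorder the
list that is passed in.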

>  b) CJK fonts:
[..]
> The problem is, that many characters share the same codepoint in
> Unicode, but have a different shape (number of strokes and stroke order)
> in the different CJK regions (China, Hong Kong / Macao, Taiwan, Japan,
> Korea). This is one of the main reasons why users in these regions
> prefer different fonts.
> My approach would be to put all character shape variants into a single
> TTC (TrueType Collection) and use a different glyph ID to Unicode
> codepoint mapping for each "virtual font".
> Instead of having 5 separate TTF files, each about 25MB in size, we
> would end up with only one TTC file (about 30 MB in size), which
> produces 5 "virtual fonts". Saves a lot of space. ;)
[..]

That sounds very useful. How much work is it to create such a new TTC?
And how much work will it be to maintain it if the fonts that are
used change (does that happen?).
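
To put the quoted sizes in numbers, a small Python sketch using only
the approximate figures from Arne's mail (5 TTF files of about 25 MB
each versus one TTC of about 30 MB). The function name is made up for
the example.

```python
def ttc_savings(font_count, ttf_size_mb, ttc_size_mb):
    """Return (MB saved, percent saved) when one TTC replaces separate TTFs."""
    separate_total = font_count * ttf_size_mb
    saved = separate_total - ttc_size_mb
    percent = 100.0 * saved / separate_total
    return saved, percent


# Approximate figures quoted above: 5 files of ~25 MB vs. one ~30 MB TTC.
saved_mb, saved_percent = ttc_savings(5, 25, 30)
print(f"saves about {saved_mb} MB ({saved_percent:.0f}% of the separate files)")
```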
 
> Caveat: QT3 does not support TTC fonts. GTK2 however has no problem with
> it. QT4 >= 4.3 is also able to use them.
> So, I basically wait until KDE4 is released and adopted into Ubuntu.
> Otherwise KDE users can't use the TTC fonts.
[..]

Ok, so we have to wait a bit with this :)

Cheers,
 Michael


