[Localization] Indigenous language L10n processes

Chris Leonard cjlhomeaddress at gmail.com
Sun Jul 24 23:31:00 EDT 2011


On Sun, Jul 24, 2011 at 9:41 PM,  <mokurai at earthtreasury.org> wrote:
> On Sun, July 24, 2011 1:51 am, Chris Leonard wrote:
>> Dear Localization Community,
>>
>> There are a variety of methods for facilitating third language
>> localization via an intermediate (non-English) language.  This is an
>> important process for Latin American indigenous languages via Spanish
>> (Aymara, Quechua, Nahuatl, Tzotzil, Huastec / Tének, etc.) as well as
>> Francophone Africa and other circumstances where a non-English
>> majority lingua franca overlaps an indigenous language.
>
> I can see a future need for
>
> * Spanish-->Catalan
>
> * Portuguese-->languages of Brazil and some countries in Africa
>
> * Russian-->languages of former Soviet Republics
>
> * Chinese-->many minority languages in China
>
> * Traditional Chinese<-->Simplified Chinese
>
> This last cannot be done using simple character substitution tables, due
> to many-one and one-many mappings, and to differences in terminology for
> almost everything since the 1949 split between the usage areas,
> particularly in computing.
>
> I don't know whether Dutch is still an important intermediate language in
> former Dutch colonies. There are clearly other minority languages in Asian
> countries such as Thailand, Laos, Burma, and others that will benefit from
> this approach.

Indeed, that is why it will be useful to agree upon a generalizable
approach / process for addressing these various circumstances.

I would make a distinction between cases where the intermediate
(non-English) language is *essential* to recruit sufficient numbers of
localizers and cases where a third language (e.g. Dutch) may be a
useful (but not strictly essential) reference, as for Afrikaans or
Papiamento localizers, because English comprehension is sufficiently
common among those multilingual localizers to allow for a direct
translation from English-based POT files.

One key element of these approaches is a completely localized set of
PO files in the intermediate language.  As for the two different
methods of pre-processing the PO files that I proposed, I think that
the instrans (translator comment) method is more flexible, in that both
the English original and the intermediate language are readily visible
in the Pootle (or off-line editor) window, whereas the poswap method
only shows the intermediate language.
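
As a rough illustration of the difference between the two styles, here
is a minimal sketch in plain Python.  It is not the actual instrans or
poswap code; the function names (annotate_entry, swap_entry), the "es:"
comment prefix and the sample strings are assumptions made up for this
example, and the quoting only handles simple single-line strings.

```python
def po_quote(text):
    """Quote a string for a single-line PO field (minimal escaping only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def annotate_entry(msgid, intermediate, msgstr=""):
    """Comment style: keep the English msgid and add the intermediate
    language text as '# ' translator comment lines above it."""
    comment_lines = ["# es: " + line for line in intermediate.splitlines()]
    return "\n".join(comment_lines + [
        "msgid " + po_quote(msgid),
        "msgstr " + po_quote(msgstr),
    ])


def swap_entry(msgid, intermediate, msgstr=""):
    """Swap style: the intermediate text takes the place of the English
    msgid, so the English original is no longer in the entry."""
    return "\n".join([
        "msgid " + po_quote(intermediate),
        "msgstr " + po_quote(msgstr),
    ])


# Hypothetical sample: English source string and its Spanish translation.
annotated = annotate_entry("Save", "Guardar")
swapped = swap_entry("Save", "Guardar")

print(annotated)
print()
print(swapped)
```

In the annotated entry the localizer sees both "Save" and "Guardar";
in the swapped entry only "Guardar" remains.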

With only the intermediate language shown (poswap), there is a
potential problem with "lossy" information transfer, as in the
"whispering game":

http://en.wikipedia.org/wiki/Chinese_whispers

This can be mitigated by also having the English original visible.

As a less "invasive" modification of the PO file, the instrans method
could also be used for adding an intermediate reference language (e.g.
Dutch annotation within Afrikaans / Papiamento PO files) without
incurring the additional burden of reversing the poswap-style
processing afterwards.
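
To make the reversal burden a little more concrete, here is a hedged
sketch of what mapping swapped entries back onto their English msgids
might involve.  This is my own simplification, not how poswap itself
works: entries are modeled as plain dictionaries, restore_english is a
hypothetical helper, and the strings (including the TARGET-n
placeholders standing in for real translations) are invented.  The
point it illustrates is that two English strings can share one
intermediate translation, in which case the way back is ambiguous.

```python
from collections import defaultdict


def restore_english(swapped, english_to_intermediate):
    """Map {intermediate: target} back to {english: target}.

    Entries whose intermediate text corresponds to zero or several
    English strings cannot be restored automatically and are returned
    separately for manual review.
    """
    candidates = defaultdict(list)
    for english, intermediate in english_to_intermediate.items():
        candidates[intermediate].append(english)

    restored, needs_review = {}, []
    for intermediate, target in swapped.items():
        sources = candidates.get(intermediate, [])
        if len(sources) == 1:
            restored[sources[0]] = target
        else:
            needs_review.append(intermediate)
    return restored, needs_review


# Hypothetical data: "Save" and "Keep" share one Spanish translation.
english_to_spanish = {"Save": "Guardar", "Keep": "Guardar", "Open": "Abrir"}
swapped_entries = {"Guardar": "TARGET-1", "Abrir": "TARGET-2"}

restored, needs_review = restore_english(swapped_entries, english_to_spanish)
print(restored)
print(needs_review)
```

With the comment-style annotation the English msgid never leaves the
file, so no such reverse step is needed.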

I may be making arguments in favor of the instrans method, but I am
still very interested in feedback from those directly engaged in
projects targeting indigenous language translations.

cjl

