[Localization] stats versus size

Chris Leonard cjlhomeaddress at gmail.com
Fri Feb 11 11:57:27 EST 2011

Dear Pootle developers,

The Sugar Labs / OLPC / eToys localization community faces a somewhat unique
challenge and I am hoping that you might be able to offer some insight or

It is a given that it is extremely useful to be able to track L10n progress
and to use the "untranslated" or "needs attention" review links as a means
of quickly focusing effort on areas where it is needed.  However, there are a
number of terms in our projects that for one reason or another may not
translate well and some language projects prefer to leave them in their
original form.  No feature exists to flag such an entry as "do not
translate".  While it is possible to simply select the "Copy" option (which
would leave it flagged as "unchanged" on the review tabs), this would seem
to have the unintended consequence of bloating the localization files beyond
the minimalist change set needed.

This is a significant issue for OLPC in that builds are made with a select
number of localizations included and the limited space available on the XO
hardware causes the developers to make some difficult choices about whether
to cut languages out of the builds to save space (even a handful of MB makes
a difference when you are limited to 1 GB total storage on the XO).

My question is what is the best way to deal with these conflicting goals,
keeping good stats on strings that have been "human-reviewed" versus those
that are untouched and reducing the size of the localization change set

One option that occurs to me would be to create a "Do not translate" status
flag (similar in concept to "fuzzy") that would count the string as human
reviewed, but not unnecessarily contribute to the file size of the PO as a
"copy" would.  I am open to other solutions for this conundrum, either
technical or administrative, but there is a certain appeal to the technical
solution as we have a highly distributed localization community (over 100
languages or dialects in progress) and establishing manual business rules
for all to follow can be challenging in its own right.
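
A minimal sketch of how such a status flag might behave, assuming a
simplified in-memory model of PO entries.  The flag name "do-not-translate",
the Entry class and both helper functions are hypothetical; neither gettext
nor Pootle defines them.  The sample strings are illustrative only.

```python
from dataclasses import dataclass, field

# Hypothetical flag name; not defined by gettext or Pootle.
DO_NOT_TRANSLATE = "do-not-translate"


@dataclass
class Entry:
    """Simplified stand-in for one PO entry."""
    msgid: str
    msgstr: str = ""
    flags: set = field(default_factory=set)


def is_reviewed(entry):
    """Count a string as human-reviewed if it is translated or flagged."""
    return bool(entry.msgstr) or DO_NOT_TRANSLATE in entry.flags


def review_stats(entries):
    """Return (reviewed, total) for progress tracking."""
    reviewed = sum(1 for e in entries if is_reviewed(e))
    return reviewed, len(entries)


def minimal_change_set(entries):
    """Keep only entries whose translation differs from the source string."""
    return {e.msgid: e.msgstr
            for e in entries
            if e.msgstr and e.msgstr != e.msgid}


entries = [
    Entry("Journal", "Diario"),                # translated
    Entry("Sugar", flags={DO_NOT_TRANSLATE}),  # reviewed, left in original form
    Entry("Neighborhood"),                     # untouched
]

print(review_stats(entries))        # reviewed count versus total
print(minimal_change_set(entries))  # only strings that actually changed
```

In this sketch the flagged entry counts toward the review statistics but adds
nothing to the shipped change set, since gettext already falls back to the
original string when no translation is present.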


A related issue has come up with our Spanish language localizations.
Certain terms are translated differently in different regions, which leads to
well-intentioned, but sometimes disruptive, reversions and changes in the
lang-es project.  The Spanish-speaking Sugar community is reluctant to fork
the lang-es project into multiple locales because of the administrative
overhead of maintaining multiple localizations, but nonetheless would be
interested in a minimalist changeset feature that would allow the core
lang-es to be used except where a local variant (trivial example: frijoles
versus habichuelas when referring to beans) should override.  This is not a
small problem as there are hundreds of thousands of XO laptops deployed to
children throughout Latin America.
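
One way to picture the override idea is a layered lookup, sketched below
under the assumption that each catalog is a plain mapping from source string
to translation.  The names core_es, es_regional, make_catalog and translate
are hypothetical, and the "computer" entry is illustrative only; this is not
how Pootle or gettext currently store catalogs.

```python
from collections import ChainMap

# Core lang-es catalog (illustrative entries).
core_es = {"beans": "frijoles", "computer": "computadora"}

# Hypothetical regional change set: only the strings that differ from core.
es_regional = {"beans": "habichuelas"}


def make_catalog(*layers):
    """Stack catalogs so that earlier layers override later ones."""
    return ChainMap(*layers)


def translate(catalog, msgid):
    """Look up msgid, falling back to the source string if no layer has it."""
    return catalog.get(msgid, msgid)


catalog = make_catalog(es_regional, core_es)

print(translate(catalog, "beans"))     # taken from the regional layer
print(translate(catalog, "computer"))  # inherited from core lang-es
print(translate(catalog, "Journal"))   # untranslated, source string returned
```

The regional layer here holds a single entry, so the per-region maintenance
burden and file size stay proportional to the number of terms that actually
differ.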

Any suggestions you may have as to how to address these issues would be most

volunteer Sugar Labs / OLPC / eToys Pootle admin
