OLPC Software Code Localization - A Few Things I've Noticed
Xavier Alvarez
xavi.alvarez at gmail.com
Fri Oct 26 20:50:41 EDT 2007
On Friday 26 October 2007 16:54, you wrote:
ET> Hi, everyone,
ET>
ET> In response to Xavier Alvarez' request on 10/25 for
ET> translators and coordinators, I decided to get off the
ET> sidelines and take a look at OLPC's new Pootle-based L10N
ET> infrastructure.
ET>
ET> Here are a few things I noticed which I think will be of
ET> general interest and concern:
ET>
ET> (0) CASING/NAMING OF PO FILES PROBLEM:
The 'rule' is quite simple (but not necessarily as intuitive as
may be expected): given that we are bundling several d.l.o
projects into pootle-projects, we need to avoid (or at least
minimize the possibility of) having two POT files with the same
name.
Solution? We prefix whatever filename is used for the POT in
d.l.o with the name of its project...
journal-activity.Journal.po
<--dlo-project->.<filename>
Thus, any 'inconsistencies' are really the product of other
inconsistencies... they just happen to be more evident (and ugly)
within Pootle.
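A minimal sketch of the prefixing rule above, in Python. The function name pootle_po_name is made up for illustration; this is not what Pootle or our import scripts actually run, just the rule written down:

```python
def pootle_po_name(dlo_project, pot_filename, ext="po"):
    """Prefix the POT base name with the name of its d.l.o project.

    Hypothetical helper: the casing of the POT name is kept exactly
    as it is in d.l.o, which is why 'Journal' and 'write' differ.
    """
    base = pot_filename.rsplit(".", 1)[0]  # "Journal.pot" -> "Journal"
    return f"{dlo_project}.{base}.{ext}"


print(pootle_po_name("journal-activity", "Journal.pot"))
print(pootle_po_name("write", "write.pot"))
```

The casing in the result comes straight from the names used in d.l.o, so fixing it would mean renaming the POT files there.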
ET>
ET> (Upper/Lower) Casing of names of po files is
ET> inconsistent: For example, in Core there is
ET> "journal-activity.Journal.po" with upper case "J" for
ET> the 2nd occurrence of "Journal" but then why isn't
ET> "write.write.po" written "write.Write.po"?
ET>
ET> This is a small point, but consistent and intuitive
ET> naming of these PO files will help everyone. Or am I just
ET> failing to understand or intuit what the pattern is supposed
ET> to be here?
ET>
ET> (1) INCONSISTENT NUMBER OF MSGIDs ACROSS DIFFERENT
ET> LANGUAGES:
Yes and no.
The numbers shown in the statistics do not represent quantity of
MSGIDs but WORDS in the file. So I presume that for untranslated
strings it takes the MSGID words, and for translated strings, the
MSGSTR. Thus two languages with all things translated and up to
date may still show different numbers (although conceptually
they are the same). BTW, it does show the number of strings in
other 'statistic levels'.
Yes, I was quite baffled too... translators are more worried about
the word-count than 'lines of code'... ;)
In http://solar.laptop.org:5080/projects/xo_core/
Language             Trans.     Fuzzy     Untrans.    Total
Portuguese (Brazil)  162  42%   4   1%    213  56%    379
Spanish              219  62%   0   0%    132  37%    351
While in each language+project
[pt_BR] 8 files, 162/379 words (42%) translated [118/247 strings]
[es] 8 files, 219/351 words (62%) translated [157/234 strings]
Note that, even so, there's still a difference in the number of
strings... see below.
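To make my presumption about the word counts concrete, here is a rough Python sketch. It is NOT Pootle's code: word_stats, percent and the sample strings are invented, fuzzy entries are ignored, and the truncated percentage is only my reading of the figures above.

```python
def word_stats(entries):
    """entries: list of (msgid, msgstr); an empty msgstr means untranslated.

    Assumed counting rule: translated entries contribute the words of
    their MSGSTR, untranslated ones the words of their MSGID.
    """
    translated_words = total_words = translated_strings = 0
    for msgid, msgstr in entries:
        if msgstr:
            words = len(msgstr.split())
            translated_words += words
            translated_strings += 1
        else:
            words = len(msgid.split())
        total_words += words
    return {
        "translated_words": translated_words,
        "total_words": total_words,
        "translated_strings": translated_strings,
        "total_strings": len(entries),
    }


def percent(part, total):
    """Truncated percentage, e.g. 162 of 379 words gives 42."""
    return 100 * part // total


# Made-up catalogs: same three MSGIDs, same two entries translated.
es = [("Keep", "Guardar"),
      ("Open in new window", "Abrir en una ventana nueva"),
      ("Erase", "")]
pt_br = [("Keep", "Manter"),
         ("Open in new window", "Abrir em nova janela"),
         ("Erase", "")]

print(word_stats(es))
print(word_stats(pt_br))
print(percent(162, 379), percent(219, 351))
```

Both sample catalogs have 2 of 3 strings translated, yet their word totals differ, which is the effect seen in the statistics above.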
ET>
ET> The other day when I looked at write.write.po for
ET> French, there were only 10 messages in the catalog. Today, I
ET> see that there are 36 messages which looks a lot closer to
ET> what I myself get from "xgettext toolbar.py" on the latest
ET> code.
ET> However, when I checked write.write.po for Thai today,
ET> I see that it still has only 10 messages.
ET>
ET> Solution (Or at least A Question Posing As A Possible
ET> Solution):
ET>
ET> Does everyone agree that there needs to be a way that
ET> all of the ".po" files for all languages get updated with the
ET> latest messages extracted via "xgettext" from the latest
ET> codebase (toolbar.py, etc.)?
Yes, there's a problem. Reviewing what you've noted, the problem
appears to be a mix of things. Just for the record, we are
sticking to the POT files found in d.l.o git (not Fedora).
1) the POT in dlo only has 9 strings
http://dev.laptop.org/git?p=projects/write;a=blob_plain;f=po/write.pot;hb=HEAD
2) the POT creation dates have probably been tampered with
externally so it's impossible to determine which one makes sense
without going into the source code:
FR.PO "POT-Creation-Date: 2007-06-21 17:33+0200\n"
DLO POT "POT-Creation-Date: 2007-06-21 17:33+0200\n"
I personally believe that developers should generate the POT file
and make sure that it's in d.l.o git.
Overall, I find these inconsistencies a direct result of the messy
flow we've had with t.fp.o. As a matter of fact, I've been trying
to process the tickets in d.l.o holding PO submissions and things
haven't been very nice. The current situation is:
0) only some projects have been injected into Pootle
(core and bundled activities, with few exceptions like Etoys)
1) d.l.o POT files are being considered the standard
2) d.l.o PO files have been injected but not fully verified
2.1) many have lost their (UTF-8) encoding
2.2) many PO files seem not to correspond to their POT (1)
3) tickets (submitting PO files) seem to have the issues noted in (2)
On top, some of the quirks and particularities of the tools do
seem to get in the way, but I think that most stem from the fact
that we don't have a 'base' POT population.
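As an illustration of the checks in 2.1 and 2.2, something along these lines could flag suspicious files. It is only a sketch: the function names and sample catalogs are made up, and the regex handles single-line MSGIDs only (no multi-line strings, msgctxt or plurals).

```python
import re

# Assumption: every msgid fits on one line; the header's empty msgid is skipped.
MSGID_RE = re.compile(r'^msgid "(.*)"[ \t]*$', re.MULTILINE)


def msgids(catalog_text):
    """Return the set of non-empty single-line MSGIDs in a PO/POT text."""
    return {m for m in MSGID_RE.findall(catalog_text) if m}


def compare_with_pot(pot_text, po_text):
    """List MSGIDs missing from the PO and MSGIDs no longer in the POT."""
    pot, po = msgids(pot_text), msgids(po_text)
    return {"missing_from_po": sorted(pot - po),
            "not_in_pot": sorted(po - pot)}


def is_utf8(raw_bytes):
    """True if the raw file content decodes cleanly as UTF-8."""
    try:
        raw_bytes.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Invented sample data, not taken from the real write.pot.
pot_text = 'msgid ""\nmsgstr ""\n\nmsgid "Bold"\nmsgstr ""\n\nmsgid "Italic"\nmsgstr ""\n'
po_text = 'msgid ""\nmsgstr ""\n\nmsgid "Bold"\nmsgstr "Gras"\n\nmsgid "Font"\nmsgstr "Police"\n'

print(compare_with_pot(pot_text, po_text))
print(is_utf8("Français".encode("latin-1")))
```

For the actual merge of new POT strings into existing PO files, GNU gettext's msgmerge is the usual tool; the sketch only helps to spot which files need attention.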
Still working on it,
Xavier
PS: The issue regarding lists is an interesting one that I think
may be much broader than the XO... :)
...snip...
ET> >
ET> > Questions, suggestions, ideas, etc. are all welcome!
ET> >
ET> >
ET> > Cheers,
ET> > Xavier
ET> >
ET> > [[Localization]] http://wiki.laptop.org/go/Localization
ET> > [[Pootle]] http://wiki.laptop.org/go/Pootle
ET> > [[Pootle/Administration]]
ET> > http://wiki.laptop.org/go/Pootle/Admininstration
ET> > [[Pootle/Glossary]]
ET> > http://wiki.laptop.org/go/Pootle/Glossary
--
XA
=========
Don't Panic! The Answer is 42