[Localization] [IMPORTANT] Getting your formatting strings right

Alexander Dupuy alex.dupuy at mac.com
Tue Feb 5 14:15:37 EST 2008


Sayamindu Dasgupta writes:

> While going through the translations, I found that some of the
> translators are ignoring the formatting strings in the msgids. Eg,
> they are translating "%d Foo" as "Bar" (where Foo->Bar) in their
> language.
> _Please, please, do not do this._ If you do this, the software will in
> many cases crash while working in your language.
>
> To understand the use of these strings, and to know how to deal with
> them, please do go through
> http://www.bengalinux.org/devel_guide/ch03.html#transguide.poanatomy.specialcases
>   

Note that there is another aspect of the formatting strings that is 
often ignored: the positional formatting strings.  To take an example 
from the (otherwise excellent) set of screen shots for Translation for 
XO using Pootle that Sameer Verma just posted, the message 
"Downloading %(1)s from %(2)s" should NOT be translated as 
"Descargando %(1) desde %(2)" or the Hindi text (which I cannot type or 
copy-paste) that looks like "%(1) # %(2) ##### ## # ## #" (where the # 
are Hindi characters).  Without the letter 's' after the closing 
parenthesis, the format string is incomplete and the localized string 
will not be displayed correctly.
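
As a rough illustration of why the trailing letter matters, here is a 
minimal Python sketch.  The values in the dictionary and the helper 
try_format are made up for this example; the real application supplies 
its own data and calls the formatting operator itself.

```python
# Hypothetical values standing in for whatever the application supplies.
values = {"1": "report.txt", "2": "example.org"}

good = "Descargando %(1)s desde %(2)s"
bad = "Descargando %(1) desde %(2)"  # trailing 's' dropped by the translator


def try_format(template, mapping):
    """Return the formatted text, or the name of the error Python raised."""
    try:
        return template % mapping
    except (TypeError, ValueError) as err:
        return "error: %s" % type(err).__name__


print(try_format(good, values))
print(try_format(bad, values))
```

The first call prints the finished sentence; the second reports an 
error instead of a message, which is what an uncaught exception in the 
application would amount to.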

The full formatting string, including the trailing letter (this is 
usually 's', but it could be 'd' or 'f' or possibly some other letter), 
needs to be used.  The numbers in these format strings allow for 
languages where the most natural word order in the target language 
differs from the order used in the English source message.  The letter 
is necessary to indicate whether the data provided to the formatting 
function is a string (s), decimal number (d), or floating-point number 
(f).
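
To make the role of the letters concrete, here is a small Python 
sketch.  The message and the values are invented for illustration and 
do not come from any Sugar .po file.

```python
# Invented message using all three common conversion letters.
template = "%(1)s: %(2)d files, %(3)f MB"
values = {"1": "backup", "2": 12, "3": 3.5}

text = template % values
print(text)  # backup: 12 files, 3.500000 MB

# The letter must match the data: 'd' refuses a value that is not a number.
try:
    "%(2)d files" % {"2": "twelve"}
    mismatch = None
except TypeError as err:
    mismatch = type(err).__name__
print(mismatch)  # TypeError
```

This is why a translator cannot swap one letter for another: the 
letter describes the data the program passes in, not the wording of 
the sentence.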

So for example the message "While saving %(1)s, the device %(2)s was 
removed." could be 'translated' into a more natural word order as 
"Device %(2)s was removed while saving %(1)s."  These sorts of order 
changes aren't necessary very often, but it's well worth knowing that 
you are not obliged to keep the strings in the same order as they are in 
the original English message.
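
A short Python sketch of the reordering described above, using made-up 
values for the file and device names:

```python
# Made-up values; the application would supply the real ones.
values = {"1": "notes.txt", "2": "USB stick"}

source = "While saving %(1)s, the device %(2)s was removed."
reordered = "Device %(2)s was removed while saving %(1)s."

source_text = source % values
reordered_text = reordered % values

print(source_text)     # While saving notes.txt, the device USB stick was removed.
print(reordered_text)  # Device USB stick was removed while saving notes.txt.
```

Each value is looked up by the key in the parentheses, so it lands in 
the right place wherever the translator puts its format string.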

If you're interested in understanding these formatting strings better, 
the full specification can be found at 
http://docs.python.org/lib/typesseq-strings.html - this lists all of the 
possible trailing letters (and other formatting modifiers). It also 
notes that developers need not use numbers in the parentheses: they can 
use a format string like "Version %(filevers)d of file %(filename)s." - 
in such a case, the words in parentheses should NOT be translated, but 
must remain exactly as-is, so a translation might be "Versión 
%(filevers)d de fichero %(filename)s." Needless to say, this gets very 
confusing, and is strongly discouraged (I doubt that there are any 
cases like this in the Sugar .po files, although they may occur in 
other software).
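
For anyone who wants to check translations mechanically, the following 
Python sketch compares the format strings in a msgid against those in 
its translation.  The regular expression is my own approximation of 
the specification linked above, and the function names are 
hypothetical; this is not how Pootle or gettext performs its checks.

```python
import re

# Approximate pattern for Python %-style format strings (an assumption,
# written from the specification rather than taken from any tool).
PLACEHOLDER = re.compile(
    r"%(?:\((?P<key>[^)]*)\))?"  # optional (key) in parentheses
    r"[#0\- +]*"                 # flags
    r"(?:\*|\d+)?"               # minimum width
    r"(?:\.(?:\*|\d+))?"         # precision
    r"[hlL]?"                    # length modifier
    r"(?P<conv>[diouxXeEfFgGcrsa%])"  # trailing conversion letter
)


def placeholders(text):
    """Return a sorted list of (key, letter) pairs found in text."""
    found = []
    for match in PLACEHOLDER.finditer(text):
        if match.group("conv") != "%":  # skip the literal "%%"
            found.append((match.group("key") or "", match.group("conv")))
    return sorted(found)


def same_placeholders(msgid, msgstr):
    """True when the translation keeps every key and letter intact."""
    return placeholders(msgid) == placeholders(msgstr)


msgid = "Version %(filevers)d of file %(filename)s."
print(same_placeholders(msgid, "Versión %(filevers)d de fichero %(filename)s."))
print(same_placeholders(msgid, "Versión %(versión)d de fichero %(filename)s."))
print(same_placeholders(msgid, "Versión %(filevers)d de fichero %(filename)."))
```

The first translation passes; the second fails because a key was 
translated, and the third fails because the trailing 's' is missing.  
Because the pairs are sorted before comparison, a translation that 
only changes the order still passes.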

Note that these are Python positional format strings, and will typically 
be marked as "python-format" in the .po file (I don't think that Pootle 
actually displays this fact though).  There can also be other types of 
format strings, in particular "c-format" for system code written in C - 
this also has a positional formatting notation, but it is different, and 
looks like "Version %1$d of file %2$s" where instead of parentheses, the 
number is followed by a $ symbol.  You won't see anything like this in 
the XO localization, but it's worth knowing if you are working on 
translations for the underlying Fedora OS or other software.
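
If it helps to recognize the C notation, here is a small Python sketch 
that picks the positional format strings out of a "c-format" message.  
The regular expression is a simplified assumption of mine that covers 
common cases only, and c_positions is a hypothetical helper name.

```python
import re

# Simplified pattern for C positional specifiers such as %1$d or %2$s.
C_POSITIONAL = re.compile(
    r"%(?P<pos>\d+)\$"           # argument number followed by $
    r"[-+ #0]*"                  # flags
    r"\d*"                       # minimum width
    r"(?:\.\d+)?"                # precision
    r"(?:hh|h|ll|l|L|j|z|t)?"    # length modifier
    r"(?P<conv>[diouxXeEfFgGcsp])"  # trailing conversion letter
)


def c_positions(text):
    """Return a sorted list of (argument number, letter) pairs."""
    return sorted(
        (int(m.group("pos")), m.group("conv"))
        for m in C_POSITIONAL.finditer(text)
    )


print(c_positions("Version %1$d of file %2$s"))  # [(1, 'd'), (2, 's')]
print(c_positions("File %2$s, version %1$d"))    # [(1, 'd'), (2, 's')]
```

As with the Python notation, the order in the sentence may change, but 
each number must keep its $ and its trailing letter.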

@alex

-- 
mailto:alex.dupuy at mac.com


