[Localization] Trying to get big picture

Chris Leonard cjlhomeaddress at gmail.com
Fri May 2 00:47:49 EDT 2008


Hello,

I'm trying to mentally work through the mechanics of internationalization of
textual content (or HTML content) and I have many questions.  I've read much
of the wiki on the Pootle, i18n, l10n, translating, etc. topics, but they
are all oriented towards code, not textual content, so much is still not
clear to me.

Let's assume that I have some great text in English, starting from
plain-vanilla ASCII, but there may be some words that are going to be better
if they are represented in italics or bold (for emphasis). Let's say I take
care of that by using HTML mark-up, so I now have an English HTML text that I
want to internationalize for Pootle submission.

Most of the HTML is doing stuff behind the scenes (links, font-size, etc.)
which poses no special i18n issues, but some HTML mark up has the effect of
modifying the presentation of text in a way that actually does have an
impact on the text's meaning.

(see pseudo-HTML below).

sentence1 is phrase1a + phrase1b + phrase1c

sentence2 is phrase2a + <bold>phrase2b</bold> + phrase2c

sentence3 is <italics>phrase3a</italics> + phrase3b + phrase3c



sentence1 is no real challenge, it goes straight into a .pot file as a
single string.

But what about sentences 2 or 3?

Do italics and/or bold tags translate into non-Latin alphabets?  (especially,
say, Nepali)

How do you parse this for presentation in Pootle?

Do you put in the full sentence with the HTML tags in it, hoping the
translator mentally interprets and adjusts the HTML tags as needed to
preserve the emphasis desired within the context of the sentence?

Or do you break it into sub-phrases (broken at the HTML tag junctions in
English)?
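
To make the two options concrete, here is a minimal Python sketch of what I
have in mind. The example sentence and both function names are made up for
illustration; this is not how Pootle or gettext actually extracts strings,
just my picture of the choice.

```python
import re

# Hypothetical source sentence with inline emphasis (not from any real file).
sentence = "Wash your hands <b>before</b> you eat."


def as_single_string(html):
    """Option 1: keep the whole sentence, tags included, as one string."""
    return [html]


def split_at_tags(html):
    """Option 2: break the sentence wherever an HTML tag occurs."""
    parts = re.split(r"<[^>]+>", html)
    # Drop empty pieces and surrounding whitespace left by the split.
    return [p.strip() for p in parts if p.strip()]


print(as_single_string(sentence))
print(split_at_tags(sentence))
```

With option 1 the translator sees the tags and has to move them to the right
word; with option 2 the translator gets three fragments with no tags, and
something else would have to reassemble them in an order that may not suit
the target language.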

I can envision the .po files (pseudo versions below are ASCII only),

sentence1
Where is the library?
Donde esta la biblioteca?

sentence2
Let's go to the beach!
Vamos a la playa!

Does the internationalized backbone HTML look like this?

<a href="sentence1">www.laptop.org</a>
<br>
<a href="sentence2">www.google.com</a>

And if so, what takes it and the .po file to produce a localized version?
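
Here is a rough Python sketch of the kind of step I am imagining. Everything
in it is an assumption on my part: the dictionary stands in for a parsed .po
file, the `$sentence1` placeholders are my own invention for the backbone, and
`localize` is a hypothetical helper, not a gettext function. As far as I
understand, real gettext looks strings up by the English source text rather
than by an ID like this.

```python
from string import Template

# Stand-in for a parsed .po file; the keys are hypothetical string IDs.
catalog_es = {
    "sentence1": "Donde esta la biblioteca?",
    "sentence2": "Vamos a la playa!",
}

# Hypothetical "backbone" HTML with placeholders instead of English text.
backbone = Template("<p>$sentence1</p>\n<br>\n<p>$sentence2</p>")


def localize(template, catalog):
    """Fill every placeholder in the template from the catalog."""
    return template.substitute(catalog)


print(localize(backbone, catalog_es))
```

If the real tools work along these lines, I would like to know what plays the
part of `localize` for HTML content.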

Is this done with gettext ignoring the fact that it is not code, just text?


Do gettext tools understand HTML?

I know that is a lot of questions, but I am really hoping to get some health
content prepared/bundled and it would help me very much to understand the
later phases of i18n and l10n so that the entire process can be done in the
most efficient fashion.  I more-or-less grasp the process used for code, but
I'm not sure I understand how or if the same tools work for plain-text or
HTML.

Any guidance (more wiki links, whatever) would be appreciated.

cjl


http://wiki.laptop.org/go/User:Cjl

cjlhomeaddress at gmail.com