[Localization] Omitting format specifiers in plural form translations
Khaled Hosny
khaledhosny at eglug.org
Thu Aug 7 17:02:12 EDT 2008
On Thu, Aug 07, 2008 at 03:23:12PM -0400, Alexander Dupuy wrote:
> However, in a string like the following "There are %d files in the %s
> directory" having a plural form translation "There are a pair of files
> in the %s directory" is likely to cause an application written in C to
> crash. Using the positional format "There are a pair of files in the
> %2$s directory" might work in some cases, but I would not want to depend
> on it, since the printf documentation says:
Yes, we have already run into this in Arabic and haven't found a good
workaround yet. I usually put %Id in brackets after the plural, as in
"There are a pair (%1$d) of files....". I'll try the %.0s trick (it did
work with a simple printf on my system).
>> There may be no gaps in the numbers of arguments specified using '$';
>> for example, if arguments 1 and 3 are specified, argument 2 must also
>> be specified somewhere in the format string.
>
> I don't see any great solution for these sorts of strings; hopefully,
> they are rare, and in the few cases where they occur, the trick that you
> came up with for Python could be used, and might work on at least some
> systems.
It isn't that rare in Arabic translation, actually; I think we have tens
of strings like this.
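
To make the printf rule quoted above concrete, here is a minimal Python sketch that reports gaps in the '$' argument numbers of a format string. The function name missing_positions and the regular expression are my own assumptions for illustration; this is not how printf or msgfmt actually checks it, and it ignores '*m$' widths and precisions.

```python
import re

# Matches the "N$" part of a positional conversion such as "%2$s".
# Hypothetical helper pattern, not taken from any gettext tool.
POSITIONAL = re.compile(r"%(\d+)\$")


def missing_positions(fmt):
    """Return the argument numbers skipped by the positional specifiers."""
    # Drop literal "%%" first so it is not mistaken for a conversion.
    used = {int(n) for n in POSITIONAL.findall(fmt.replace("%%", ""))}
    if not used:
        return []
    return [n for n in range(1, max(used) + 1) if n not in used]


# The translation that drops argument 1 leaves a gap ...
gap = missing_positions("There are a pair of files in the %2$s directory")
# ... while keeping the number in brackets does not.
no_gap = missing_positions(
    "There are a pair (%1$d) of files in the %2$s directory"
)
```

With these inputs, gap lists argument 1 as missing and no_gap is empty, which is why the bracketed form is safe to use with positional formats.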
>> I think this is a bug in Python's gettext implementation, since this is
>> allowed in C.
>>
>
> The issue here is not with gettext itself - either in Python or in C,
> gettext does not interpret or replace the %d format specifier - in C,
> the substitution of %d is done by a call to one of the printf functions;
> in Python, the substitution is performed by the % string formatting
> operator.
Yes, I just realized that.
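
For the record, a small sketch of what goes wrong on the Python side. The strings are assigned directly here as stand-ins; in a real application they would come from gettext, which returns them without touching %d or %s. The variable names are only illustrative.

```python
# Stand-ins for the msgid and a dual-form msgstr that omits the number.
english = "There are %d files in the %s directory"
dual_without_number = "There are a pair of files in the %s directory"

args = (2, "tmp")

# The original string consumes both arguments.
formatted = english % args

# The translation that drops %d leaves one argument unused, and the
# % operator rejects that with a TypeError.
try:
    dual_without_number % args
    error = None
except TypeError as exc:
    error = str(exc)
```

So the failure comes from the % operator being handed more arguments than the translated string consumes, not from gettext.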
>>
>> I tried %.d, which I thought would suppress printing the number, but
>> it made no difference; however, %.s does the trick. Now I'm wondering how
>> bad that is, since msgfmt -c gives "fatal errors" but Python hasn't
>> complained so far.
>>
>
> Python is much more flexible than C when it comes to implicit type
> conversion, so it's quite reasonable to use %.s to print a zero-width
> representation of a number. I would suggest using %.0s to make it more
> explicit that this is what you are doing and that it is intentional.
OK, I'm going to fix the translations to use %.0s instead.
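
A quick sketch of the behaviour discussed above, under the assumption that the application formats with the % operator and a tuple of arguments. The example string is a stand-in, not taken from a real PO file.

```python
args = (2, "tmp")

# "%.0s" converts the number with str() and truncates it to zero
# characters, so the argument is consumed but nothing is printed.
dual = "There are a pair%.0s of files in the %s directory"
result = dual % args

# "%.s" behaves the same way (a missing precision is read as 0) ...
same_as_dot_s = "%.s" % 2
# ... while "%.d" still prints the number.
unchanged = "%.d" % 2
```

Here result reads "There are a pair of files in the tmp directory". This only shows the Python side; it says nothing about how a C printf would treat the same string.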
> It's also probably not a great idea to use %.0s for localizing C
> applications (although it works on my Fedora 7 system) since some
> implementations of printf may cause an application crash when formatting
> a numeric value as if it were a string (even if it is zero-width).
I see, I need to do more testing for this.
> These changes would allow %.0s to be used as a placeholder when omitting
> format specifiers in plural form translations for Python applications,
> without triggering undesired errors from the msgfmt and
> translate-toolkit checking.
Right.
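
To illustrate the kind of relaxed check being proposed, here is a rough Python sketch that compares the conversions in a msgid and a msgstr and lets %.0s (or %.s) stand in for any conversion. This is not the actual msgfmt or translate-toolkit code; the names SPEC, conversions and compatible are hypothetical, and the sketch ignores reordering through positional arguments.

```python
import re

# Simplified pattern for printf-style conversions (an assumption, not the
# grammar used by msgfmt): "%%", or flags, width, precision and a type.
SPEC = re.compile(
    r"%(?:%|"                  # literal percent sign, or ...
    r"(?:\d+\$)?"              # optional positional index
    r"[-+ #0]*"                # flags
    r"\d*"                     # width
    r"(?P<prec>\.\d*)?"        # optional precision
    r"(?P<conv>[diouxXeEfFgGcsr]))"
)


def conversions(fmt):
    """List conversion types in order; '*' marks a zero-width %s."""
    found = []
    for match in SPEC.finditer(fmt):
        conv = match.group("conv")
        if conv is None:  # this was a literal "%%"
            continue
        zero_width = conv == "s" and match.group("prec") in (".", ".0")
        found.append("*" if zero_width else conv)
    return found


def compatible(msgid, msgstr):
    """True if msgstr consumes the same arguments as msgid."""
    expected = conversions(msgid)
    actual = conversions(msgstr)
    if len(expected) != len(actual):
        return False
    # A zero-width placeholder is accepted in place of any conversion.
    return all(a == "*" or a == e for e, a in zip(expected, actual))


msgid = "There are %d files in the %s directory"
accepted = compatible(
    msgid, "There are a pair%.0s of files in the %s directory"
)
rejected = compatible(
    msgid, "There are a pair of files in the %s directory"
)
```

With this rule the %.0s translation is accepted, while the translation that simply drops %d is still rejected.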
Thanks very much for your informative replies.
Regards,
Khaled
>
> @alex
> --
> mailto:alex.dupuy at mac.com
--
Khaled Hosny
Arabic localizer and member of Arabeyes.org team