Investigate issues with bogus localized strings

NEW
Unassigned

7 years ago

People

(Reporter: Dolske, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Description

7 years ago
Followup from bug 680812, wherein we found that strings in .properties files are basically handled as input to a snprintf-like function, and so "%9S" and "%9$S" do very different things.

If I do a regex search for "%[0-9][^\$]" in mozilla-central (http://bit.ly/qEBQ1I), I get a few hits on potentially-suspicious strings...

1) The string from bug 680812

2) toolkit/locales/en-US/chrome/global/headsUpDisplay.properties

  # LOCALIZATION NOTE (timestampFormat): %1$02S = hours (24-hour clock),
  # %2$02S = minutes, %3$02S = seconds, %4$03S = milliseconds.
  timestampFormat=%02S:%02S:%02S.%03S

3) browser/locales/en-US/chrome/browser-region/region.properties

  browser.search.siteSearchURL= (a url with a percent-encoded character)

#1 will be fixed shortly; #2 looks like something we need to fix; and #3 is _probably_ OK, as I bet it never goes through nsIStringBundle's formatStringFromName().
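The search above can be tried on a handful of values. This is a minimal Python sketch using the same regex and sample strings quoted in this bug; `SAMPLES` and `suspicious` are hypothetical names for illustration, and the URL is a made-up stand-in because the real region.properties value is not quoted here.

```python
import re

# Same pattern as the MXR search: "%", one digit, then anything except "$".
SUSPICIOUS = re.compile(r"%[0-9][^$]")

# Sample values. The URL is an invented placeholder, not the real
# browser.search.siteSearchURL value.
SAMPLES = {
    "timestampFormat": "%02S:%02S:%02S.%03S",
    "transferred": "%1SKB ya %2SKB",
    "positional": "%1$S of %2$S",
    "siteSearchURL": "https://example.invalid/search?q=a%20b",
}


def suspicious(value):
    """Return every substring of value that the search pattern flags."""
    return SUSPICIOUS.findall(value)


for key, value in SAMPLES.items():
    print(key, suspicious(value))
```

The positional form is not flagged, while the percent-encoded URL is, which is why hit #3 needs a manual look rather than an automatic fix.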


A similar search on the L10N mxr also finds some "interesting" strings...

1) A number of locales have a tryNewerDriverVersion string with something like "blah blah %1 blah blah". This string has since been removed from m-c, so it should be harmless.

2) A number of hits on the string for bug 680812.

3) Various other mistakes that I haven't checked to see if they're used...

  tn/toolkit/chrome/mozapps/downloads/downloads.properties
  transferred=%1SKB ya %2SKB
  he/calendar/chrome/calendar/calendar.properties 
  tooNewSchemaErrorBoxTitle=נתוני לוח השנה שלך אינם נתמכים בגרסה זו של %1S
  it/browser/chrome/browser/browser.properties
  privateBrowsingMessage = Le schede correnti ... anonima.%0.S

  and likely others, since MXR maxed out with "Too many hits, displaying the first 1000" (most of which are comments or repeats)


In the short term, we should just identify and fix all the broken strings. Do the L10N tools have validators? We probably want to use them to reject anything except "%S" and "%#$S" (and somehow figure out how to let URLs with "%" in them through).
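A rough idea of what such a validator could look like, as a Python sketch only. It is not how any existing L10N tool works: `bad_placeholders` is a hypothetical helper, and treating a value as a URL whenever it contains "://" is an assumption made here to let percent-encoded characters through.

```python
import re

# Allowed forms: %S, positional %N$S, and a literal %%.
ALLOWED = re.compile(r"%(?:S|[0-9]+\$S|%)")
# A percent-encoded byte, accepted only in values that look like URLs.
URL_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def bad_placeholders(value):
    """Return the offsets of "%" sequences that are not in the allowed set."""
    looks_like_url = "://" in value  # crude heuristic, an assumption
    bad = []
    i = 0
    while i < len(value):
        if value[i] != "%":
            i += 1
            continue
        match = ALLOWED.match(value, i)
        if match is None and looks_like_url:
            match = URL_ESCAPE.match(value, i)
        if match is None:
            bad.append(i)
            i += 1
        else:
            i = match.end()
    return bad


print(bad_placeholders("%1SKB ya %2SKB"))
print(bad_placeholders("https://example.invalid/search?q=a%20b"))
```

One limitation of this heuristic: inside a URL, something like "%02S" would be accepted as a percent-encoded byte followed by "S", so URL values would still need a human check.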

In the long term, we probably should just remove the ability to do printf-style formatting in localized strings. (E.g., see bug 288400.)

Comment 1

(In reply to Justin Dolske [:Dolske] from comment #0)
>   privateBrowsingMessage = Le schede correnti ... anonima.%0.S

This is a hack to avoid warnings from compare-locales. To avoid personification, we often remove the application's name as the subject and switch to a passive construction, which results in a warning in compare-locales (missing variable).

Affected strings in Italian
http://bit.ly/qEBQ1I

Original suggestion
http://bit.ly/qXd4jZ

Comment 2

(In reply to flod (Francesco Lodolo) from comment #1)
> Affected strings in Italian
> http://bit.ly/qEBQ1I

Sorry, bad clipboard. Right link: http://bit.ly/nK4uPP

Comment 3

7 years ago
(In reply to Justin Dolske [:Dolske] from comment #0)
> A similar search on the L10N mxr also finds some "interesting" strings...

Probably a case for Axel or someone to add a check to the L10n dashboard if it isn't there already.

(And I can't help but think of how this problem should go away in the future with L20n having more understandable parameter syntax...)

Comment 4

7 years ago
The 2) string (timestampFormat) is correct; the zero-padded width is there to output 1:02 as a time instead of 1:2.

The %1$0.S strings are also right. As flod pointed out, they're a custom hack around the fact that the string formatter crashes if not all arguments are given. Sad, but true.
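The effect of a zero-precision placeholder can be seen by analogy in Python's own printf-style operator. This is only an analogy, not the Mozilla string formatter: Python has no "%1$" positional syntax, so the sketch uses "%.0s", and `app_name` is a hypothetical argument.

```python
# Analogy only: Python's "%" operator, not nsIStringBundle's formatter.
app_name = "Firefox"

# A zero-precision string conversion consumes its argument but prints
# none of its characters, so the application name never appears.
hidden = "Le schede correnti ... anonima.%.0s" % (app_name,)
print(hidden)
```

The argument is still referenced by the string, which is what keeps the missing-variable warning quiet, while the visible text stays unchanged.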

3) The Hebrew string is OK; it just looks odd because the LTR parameter and the RTL text mix badly.

I already have checks for bad printf params in compare-locales: http://hg.mozilla.org/l10n/compare-locales/file/default/lib/Mozilla/Checks.py#l112. They raise an error on any formatting that I understand to crash, and they warn about real bugs that don't crash.

I'm not sure how reliably we can warn or error beyond that. Maybe warn about non-zero sizes in l10n if en-US doesn't have them?
The much more tricky issue is finding out if en-US does what they intended to do, as bug 680812 shows.
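The "warn about non-zero sizes" idea could look roughly like the following Python sketch. It is not the code in Checks.py: `nonzero_widths` and `width_warning` are hypothetical helpers, and the en-US value in the usage line is invented for illustration.

```python
import re

# "%", optional "N$" position, optional width digits, optional
# ".precision", then "S". Group 1 captures the width digits.
SPEC = re.compile(r"%(?:[0-9]+\$)?([0-9]*)(?:\.[0-9]*)?S")


def nonzero_widths(value):
    """Return the non-zero width fields used by %...S placeholders."""
    widths = []
    for match in SPEC.finditer(value):
        digits = match.group(1)
        if digits and int(digits) != 0:
            widths.append(int(digits))
    return widths


def width_warning(en_us, localized):
    """True if the localization uses a non-zero width and en-US uses none."""
    return bool(nonzero_widths(localized)) and not nonzero_widths(en_us)


# Hypothetical en-US value paired with the "tn" string quoted in this bug.
print(width_warning("%1$SKB of %2$SKB", "%1SKB ya %2SKB"))
```

Under this rule the timestampFormat string would not warn when en-US and the localization both use widths, and the "%0.S" hack would pass because its width is zero.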