Closed Bug 288244 Opened 20 years ago Closed 16 years ago

need to ensure that locale codes in their various formats are supported

Categories

(Webtools :: Bouncer, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: chase, Assigned: morgamic)

Details

Our current uses of locales may assume in various places that locales must be of the form:
* \w{2,3}-\w{2,3}
This is a concern since new efforts seek to use the additional (ISO-specified) locale formats of:
* \w{2,3}
* \w{2,3}-\w{2,3}-\w{2,3}
I've been told that the build system supports this new format. Before we rely heavily on it, though, we must ensure that our other systems are capable of supporting it:
* Bouncer / download.mozilla.org
* Content detection on the start page
* App Update Service (aus)
Use this bug to track those issues or, if a significant problem is found along the way, file it separately and set this bug as dependent on that issue.
What ISO spec dictates the format of ISO locale codes? What's the general make-up of a locale code? (e.g., is it language-region-subregion?)
An example of this change is that the "de-DE" and "de-AT" authors are considering merging / have merged. If so, they wish to offer Firefox, Thunderbird, etc. under a locale named "de".
What we actually should support is this format: \w{2,3}(-\w{2}(-\w{3,8})?)? If it has three parts, this is <language>-<region>-<dialect>, basically. This follows the "language tag" RFC 3066, see http://www.faqs.org/rfcs/rfc3066.html for more info.
In theory, every language that's not different for different regions should go with the ISO 639-1/639-2 (2-letter/3-letter) language code alone ("de", "eo", "pl", "cs", etc.), while all where the region does matter should include it (2-letter ISO 3166 code; locale strings look like those we have used until now: "es-ES", "es-AR", "pt-PT", "en-US").
In some rare cases, we might need the dialect as a third part (a 3- to 8-letter, basically freeform part). I can currently imagine two cases there:
1. (real case we have at the moment): there's no ISO 639-2 code for some language that wants to do a localization (Venetian Firefox in our current case). In this case, we can use the generic identifier for the language family (romance: roa) from ISO 639-2 as the language code, and add an identifier for the specific language as the dialect (if one exists, we prefer to use the 3-letter SIL code there, http://www.ethnologue.com/codes/default.asp). In the case of Venetian, we end up with "roa-IT-vec" this way.
2. (hypothetical case): we have a real dialect, e.g. a Bavarian L10n, which would get "de-DE-bavarian" or similar.
The build system works OK with those locales, and we have one or two with language-code-only on L10n trunk already, so we have some stuff to test it with.
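As a rough illustration of the format described above, the following Python sketch checks a locale string against that pattern and splits it into its parts. The names `LOCALE_RE` and `split_locale` are hypothetical and not part of any Mozilla system; the pattern is the one from the comment, written with standard `{m,n}` quantifiers.

```python
import re
from typing import Optional, Tuple

# Pattern from the comment above, with {m,n} quantifiers.
# \w also accepts digits and underscores; a stricter [A-Za-z] class
# might be preferable, but that tightening is an assumption.
LOCALE_RE = re.compile(r"(\w{2,3})(?:-(\w{2})(?:-(\w{3,8}))?)?")


def split_locale(tag: str) -> Optional[Tuple[str, Optional[str], Optional[str]]]:
    """Return (language, region, dialect), or None if the tag does not match."""
    match = LOCALE_RE.fullmatch(tag)
    if match is None:
        return None
    language, region, dialect = match.groups()
    return language, region, dialect


if __name__ == "__main__":
    # Example strings taken from the comment above, plus one malformed tag.
    for tag in ("de", "es-AR", "roa-IT-vec", "de-DE-bavarian", "de-DEU"):
        print(tag, split_locale(tag))
```

Using `fullmatch` rather than `match` means the whole string has to fit the pattern, so a three-letter region such as "de-DEU" is rejected instead of being silently accepted as "de-DE".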
Quoting Mark Tyndall's post on n.p.m.l10n: the build process (for Firefox on the trunk) can cope with
* ab
* ab-CD
* abc-CD
* abc-CD-SIL
* abc-CD-dialect
where
* ab/abc - from ISO 639-1/639-2
* CD - from ISO 3166
* SIL - from the SIL list <http://www.ethnologue.com/codes/>
* dialect - following the rules from RFC 3066
(In reply to comment #0) To the programs that make up Bouncer, the locale is just a string. The string is used for exact-match database lookup without regard to the string's internal structure or semantics. For example, we have a locale 'de-DE', which implies that we have an associated reference to a localized product. If we were to get a request for a version 'de-DE-bavarian', we would be unable to comply because we don't have that exact string in our database. This situation, however, should never come up. The user doesn't get free-form input; they must choose from a selection of locales for which there are associated products. A proper selection is made before the request even gets to Bouncer. Is this the desired behavior? Or should we be prepared to truncate the locale from the right until we do have a match? In the aforementioned example, we'd offer the 'de-DE' because it is a partial match.
(In reply to comment #5)
> Is this the desired behavior? Or should we be prepared to truncate the locale
> from the right until we do have a match? In the aforementioned example, we'd
> offer the 'de-DE' because it is a partial match.
I think that fallback mechanism may be good if there is a chance that people are getting to Bouncer with a "de-DE" string, but we actually only have "de" to offer. I'm not sure if this can happen, but if it can, it might be good to do that.
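The truncate-from-the-right fallback discussed in the last two comments could look roughly like the Python sketch below. It is not Bouncer's actual code: `candidates` and `resolve_locale` are hypothetical names, and a plain set stands in for the database table of locales that Bouncer matches against.

```python
from typing import Iterable, Iterator, Optional


def candidates(locale: str) -> Iterator[str]:
    """Yield the locale itself, then shorter forms with trailing parts dropped."""
    parts = locale.split("-")
    for end in range(len(parts), 0, -1):
        yield "-".join(parts[:end])


def resolve_locale(requested: str, available: Iterable[str]) -> Optional[str]:
    """Return the longest right-truncated form of `requested` found in `available`."""
    known = set(available)  # stand-in for the exact-match database lookup
    for candidate in candidates(requested):
        if candidate in known:
            return candidate
    return None  # nothing matched, not even the bare language code


if __name__ == "__main__":
    print(resolve_locale("de-DE-bavarian", {"de-DE", "en-US"}))
    print(resolve_locale("de-DE", {"de", "en-US"}))
```

The exact match is always tried first, so behavior for locales already in the database is unchanged; the shorter forms only come into play when the full string is missing.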
Justin is now the QA for this component.
QA Contact: myk → justin
Assignee: nobody → morgamic
Component: Server Operations → Bouncer
Product: mozilla.org → Webtools
QA Contact: justin → kveton
QA Contact: kveton → bouncer
Short little update: We recently added "native" support for locales to Bouncer v1, but the assessment Lars made in comment 5 still holds: we do exact matching against the locales in the database, so whatever string is in there is supported.
Target Milestone: --- → Bouncer 1.5
Bouncer 1.0 has had locale support for a while now, and I don't know of any problems with it. If there are any, please file a new bug. Thanks!
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED