Closed Bug 1048050 Opened 9 years ago Closed 8 years ago

Special code for casing of Irish text should be triggered in any document with lang=ga

Product/Component: Core :: Internationalization
Type: defect
Priority: Not set
Severity: normal
Status: RESOLVED FIXED
Target Milestone: mozilla34
Reporter: kscanne
Assigned: jfkthame
Blocks: 1 open bug
Attachments: 2 files
I've been testing the resolution of bug 1014639 (which is now in our Aurora builds), and the special casing rules work perfectly.  This brings me great joy.

I noticed, however, that the rules are only triggered when a lang attribute matches "ga-ie", and not simply "ga", which is the code one would typically expect in Irish-language documents.
Hmm, you're right, this should have been done for plain "ga".

Actually, this highlights a wider problem: we currently won't apply the proper casing rules for other languages if the content is tagged with a lang attribute that *does* include a region subtag, such as "tr-TR" or "nl-NL".

In general, I think we should ignore the region or other subtags, and base the behavior here just on the primary language subtag (unless there's specific, different behavior for a particular region -- but we don't currently have any examples of this).
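
To make the idea above concrete, here is a minimal standalone sketch of keying the casing behavior on the primary language subtag only. It is not the attached patch: it uses std::string instead of Gecko atoms, and the names Casing, primarySubtag, casingFor and demo are hypothetical, chosen for illustration. It also lowercases the subtag, on the assumption that tags should be compared case-insensitively.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the LanguageSpecificCasingBehavior enum.
enum class Casing { None, Dutch, Greek, Irish, Turkish };

// Return the primary language subtag, lowercased: "ga-IE" -> "ga".
// If there is no '-', the whole tag is treated as the primary subtag.
std::string primarySubtag(const std::string& tag) {
  std::string primary = tag.substr(0, tag.find('-'));
  for (char& c : primary) {
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  return primary;
}

// Choose the casing behavior from the primary subtag alone,
// ignoring region and any other subtags.
Casing casingFor(const std::string& tag) {
  const std::string lang = primarySubtag(tag);
  if (lang == "tr" || lang == "az" || lang == "ba" ||
      lang == "crh" || lang == "tt") {
    return Casing::Turkish;
  }
  if (lang == "nl") {
    return Casing::Dutch;
  }
  if (lang == "el") {
    return Casing::Greek;
  }
  if (lang == "ga") {
    return Casing::Irish;
  }
  return Casing::None;
}

// Usage: tags with and without a region subtag give the same answer.
void demo() {
  assert(casingFor("ga") == Casing::Irish);
  assert(casingFor("ga-IE") == Casing::Irish);
  assert(casingFor("tr-TR") == Casing::Turkish);
  assert(casingFor("en-US") == Casing::None);
}
```

A region-specific exception, if one ever turned up, would need an explicit check before the truncation; this sketch has none because no such case is known here.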
Assignee: nobody → jfkthame
OS: Mac OS X → All
Hardware: x86 → All
This should enable us to support language tags with or without region subtags here. Simon, if you agree this is the right behavior to implement, I'll add a couple of testcases.
Attachment #8466990 - Flags: review?(smontagu)
Blocks: 905381
Comment on attachment 8466990 [details] [diff] [review]
ignore region (or other) subtags when checking for language-specific casing behavior.

Review of attachment 8466990 [details] [diff] [review]:
-----------------------------------------------------------------

Seems reasonable (but it would be nice if we had a less ad-hoc way of parsing language tags. I suppose that is bug 556237, or bug 356038)
Attachment #8466990 - Flags: review?(smontagu) → review+
This just duplicates a couple of our existing tests, and then varies the lang tag so that we're testing examples both with and without region subtags.
Attachment #8468404 - Flags: review?(smontagu)
Attachment #8468404 - Flags: review?(smontagu) → review+
https://hg.mozilla.org/mozilla-central/rev/99e040a2e972
https://hg.mozilla.org/mozilla-central/rev/bf19c9da1518
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
(In reply to Simon Montagu :smontagu from comment #3)
> Comment on attachment 8466990 [details] [diff] [review]
> ignore region (or other) subtags when checking for language-specific casing
> behavior.
> 
> Review of attachment 8466990 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Seems reasonable (but it would be nice if we had a less ad-hoc way of
> parsing language tags. I suppose that is bug 556237, or bug 356038)

(Sorry, I'm just seeing this now.)

While we wait on true BCP 47 support, this would be a lot better if it recursively truncated at the last '-' instead of the first one (which is what I understand this to be doing). That way you can support all possible subtag combinations.
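
As a rough illustration of the suggestion above, the following standalone sketch drops the last subtag on each miss, so the longest matching prefix wins. It is an assumption-laden sketch, not Gecko code: Casing, exampleTable, casingForLongestPrefix and demo are hypothetical names, tags are assumed to be lowercased already, and the "tr-x-example" entry is an invented override that exists only to show a more specific tag taking precedence.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the LanguageSpecificCasingBehavior enum.
enum class Casing { None, Dutch, Greek, Irish, Turkish };

// Illustrative lookup table. "tr-x-example" is a made-up entry showing
// how a more specific tag could override its primary language.
const std::map<std::string, Casing>& exampleTable() {
  static const std::map<std::string, Casing> table = {
    {"tr", Casing::Turkish},
    {"nl", Casing::Dutch},
    {"el", Casing::Greek},
    {"ga", Casing::Irish},
    {"tr-x-example", Casing::None},
  };
  return table;
}

// Look up the full tag; on a miss, remove everything from the last '-'
// onward and try again, until a match is found or no '-' remains.
Casing casingForLongestPrefix(std::string tag,
                              const std::map<std::string, Casing>& table) {
  while (!tag.empty()) {
    const auto it = table.find(tag);
    if (it != table.end()) {
      return it->second;
    }
    const std::string::size_type dash = tag.rfind('-');
    if (dash == std::string::npos) {
      break;
    }
    tag.erase(dash);  // drop the last subtag and its leading '-'
  }
  return Casing::None;
}

// Usage: "ga-latg-ie" is tried as "ga-latg-ie", "ga-latg", then "ga".
void demo() {
  const auto& table = exampleTable();
  assert(casingForLongestPrefix("ga-latg-ie", table) == Casing::Irish);
  assert(casingForLongestPrefix("tr-tr", table) == Casing::Turkish);
  assert(casingForLongestPrefix("tr-x-example", table) == Casing::None);
}
```

Compared with cutting at the first '-', this costs a few extra lookups per tag, but it leaves room for entries keyed on script or region subtags.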

Flags: needinfo?(abdulsami03222)

--- a/content/base/src/nsGkAtomList.h
+++ b/content/base/src/nsGkAtomList.h
@@ -2047,17 +2047,17 @@ GK_ATOM(ko_xxx, "ko-xxx")
 GK_ATOM(x_central_euro, "x-central-euro")
 GK_ATOM(x_symbol, "x-symbol")
 
 // additional languages that have special case transformations
 GK_ATOM(az, "az")
 GK_ATOM(ba, "ba")
 GK_ATOM(crh, "crh")
 GK_ATOM(el, "el")
-GK_ATOM(ga_ie, "ga-ie")
+GK_ATOM(ga, "ga")
 GK_ATOM(nl, "nl")
 
 // Names for editor transactions
 GK_ATOM(TypingTxnName, "Typing")
 GK_ATOM(IMETxnName, "IME")
 GK_ATOM(DeleteTxnName, "Deleting")
 
 // IPC stuff
--- a/layout/generic/nsTextRunTransformations.cpp
+++ b/layout/generic/nsTextRunTransformations.cpp
@@ -230,32 +230,45 @@ enum LanguageSpecificCasingBehavior {
   eLSCB_Greek,     // strip accent when uppercasing Greek vowels
   eLSCB_Irish,     // keep prefix letters as lowercase when uppercasing Irish
   eLSCB_Turkish    // preserve dotted/dotless-i distinction in uppercase
 };
 
 static LanguageSpecificCasingBehavior
 GetCasingFor(const nsIAtom* aLang)
 {
+  if (!aLang) {
+    return eLSCB_None;
+  }
   if (aLang == nsGkAtoms::tr ||
       aLang == nsGkAtoms::az ||
       aLang == nsGkAtoms::ba ||
       aLang == nsGkAtoms::crh ||
       aLang == nsGkAtoms::tt) {
     return eLSCB_Turkish;
   }
   if (aLang == nsGkAtoms::nl) {
     return eLSCB_Dutch;
   }
   if (aLang == nsGkAtoms::el) {
     return eLSCB_Greek;
   }
-  if (aLang == nsGkAtoms::ga_ie) {
+  if (aLang == nsGkAtoms::ga) {
     return eLSCB_Irish;
   }
+
+  // Is there a region subtag we should ignore?
+  nsAtomString langStr(const_cast<nsIAtom*>(aLang));
+  int index = langStr.FindChar('-');
+  if (index > 0) {
+    langStr.Truncate(index);
+    nsCOMPtr<nsIAtom> truncatedLang = do_GetAtom(langStr);
+    return GetCasingFor(truncatedLang);
+  }
+
   return eLSCB_None;
 }
 
 bool
 nsCaseTransformTextRunFactory::TransformString(
     const nsAString& aString,
     nsString& aConvertedString,
     bool aAllUppercase,
