Special code for casing of Irish text should be triggered in any document with lang=ga
Categories: Core :: Internationalization, defect
People: Reporter: kscanne, Assigned: jfkthame
References: Blocks 1 open bug
Attachments (2 files):
- 2.42 KB patch (smontagu: review+)
- 3.60 KB patch (smontagu: review+)
I've been testing the resolution of bug 1014639 (which is now in our Aurora builds), and the special casing rules work perfectly. This brings me great joy. I noticed, however, that the rules are only triggered when a lang attribute matches "ga-ie", and not simply "ga", which is the code one would typically expect in Irish-language documents.
Comment 1 • 9 years ago (Assignee)
Hmm, you're right, this should have been done for plain "ga". Actually, this highlights a wider problem, in that we won't currently apply the proper casing rules for other languages if the content is tagged with a lang attribute that *does* include a region subtag, such as "tr-TR" or "nl-NL", etc. In general, I think we should ignore the region or other subtags, and base the behavior here just on the primary language subtag (unless there's specific, different behavior for a particular region -- but we don't currently have any examples of this).
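The idea of keying the behavior on the primary language subtag alone can be sketched in standalone C++. This is only an illustration, not the Gecko code: `Casing`, `PrimarySubtag`, and `CasingFor` are hypothetical names, plain `std::string` stands in for atoms, and lowercasing the tag is an assumption of the sketch. The language list mirrors the one in the patch for this bug.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the language-specific casing behaviors.
enum class Casing { None, Dutch, Greek, Irish, Turkish };

// Return the primary language subtag, lowercased: "tr-TR" -> "tr".
// A tag without any '-' is returned whole (find() yields npos).
std::string PrimarySubtag(const std::string& aLangTag) {
  std::string primary = aLangTag.substr(0, aLangTag.find('-'));
  std::transform(primary.begin(), primary.end(), primary.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  return primary;
}

// Choose a casing behavior from the primary subtag only, ignoring
// region and any other subtags.
Casing CasingFor(const std::string& aLangTag) {
  const std::string lang = PrimarySubtag(aLangTag);
  if (lang == "tr" || lang == "az" || lang == "ba" ||
      lang == "crh" || lang == "tt") {
    return Casing::Turkish;
  }
  if (lang == "nl") {
    return Casing::Dutch;
  }
  if (lang == "el") {
    return Casing::Greek;
  }
  if (lang == "ga") {
    return Casing::Irish;
  }
  return Casing::None;
}
```

With this sketch, `CasingFor("ga")` and `CasingFor("ga-IE")` both select the Irish behavior, and `CasingFor("tr-TR")` selects the Turkish one.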
Comment 2 • 9 years ago (Assignee)
This should enable us to support language tags with or without region subtags here. Simon, if you agree this is the right behavior to implement, I'll add a couple of testcases.
Comment 3 • 9 years ago (Simon Montagu :smontagu)
Comment on attachment 8466990: ignore region (or other) subtags when checking for language-specific casing behavior.

Seems reasonable (but it would be nice if we had a less ad-hoc way of parsing language tags. I suppose that is bug 556237, or bug 356038)
Comment 4 • 9 years ago (Assignee)
This just duplicates a couple of our existing tests, and then varies the lang tag so that we're testing examples both with and without region subtags.
Comment 5 • 9 years ago (Assignee)
https://hg.mozilla.org/integration/mozilla-inbound/rev/99e040a2e972
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf19c9da1518
Comment 6 • 9 years ago
https://hg.mozilla.org/mozilla-central/rev/99e040a2e972
https://hg.mozilla.org/mozilla-central/rev/bf19c9da1518
Comment 7 • 9 years ago (Gordon P. Hemsley [:GPHemsley])
(In reply to Simon Montagu :smontagu from comment #3)
> Comment on attachment 8466990: ignore region (or other) subtags when
> checking for language-specific casing behavior.
>
> Seems reasonable (but it would be nice if we had a less ad-hoc way of
> parsing language tags. I suppose that is bug 556237, or bug 356038)

(Sorry, I'm just seeing this now.)

While we wait on true BCP 47 support, this would be a lot better if it recursively truncated at the last '-' instead of the first one (which is what I understand this to be doing). That way you can support all possible subtag combinations.
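The "truncate at the last '-'" suggestion can be sketched as a lookup that tries the full tag first and drops one trailing subtag per miss. This is a minimal standalone illustration, not the landed patch: `Casing` and `LookupWithFallback` are hypothetical names, the table is supplied by the caller, and tags are assumed to be already lowercased.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the language-specific casing behaviors.
enum class Casing { None, Dutch, Greek, Irish, Turkish };

using CasingTable = std::unordered_map<std::string, Casing>;

// Look up aLangTag in aTable. On a miss, remove the last subtag
// (everything from the final '-') and retry, so that the most
// specific matching entry wins.
Casing LookupWithFallback(const CasingTable& aTable,
                          const std::string& aLangTag) {
  const auto it = aTable.find(aLangTag);
  if (it != aTable.end()) {
    return it->second;
  }
  const std::string::size_type lastHyphen = aLangTag.rfind('-');
  if (lastHyphen == std::string::npos || lastHyphen == 0) {
    // Nothing left to strip (or a malformed leading '-').
    return Casing::None;
  }
  return LookupWithFallback(aTable, aLangTag.substr(0, lastHyphen));
}
```

Compared with truncating at the first '-', this lets a table hold an entry for a longer tag alongside the entry for its primary subtag. With a table containing only "ga", a tag such as "ga-latg-ie" is tried as "ga-latg-ie", then "ga-latg", then "ga", and resolves to the Irish behavior.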
--- a/content/base/src/nsGkAtomList.h
+++ b/content/base/src/nsGkAtomList.h
@@ -2047,17 +2047,17 @@ GK_ATOM(ko_xxx, "ko-xxx")
 GK_ATOM(x_central_euro, "x-central-euro")
 GK_ATOM(x_symbol, "x-symbol")
 
 // additional languages that have special case transformations
 GK_ATOM(az, "az")
 GK_ATOM(ba, "ba")
 GK_ATOM(crh, "crh")
 GK_ATOM(el, "el")
-GK_ATOM(ga_ie, "ga-ie")
+GK_ATOM(ga, "ga")
 GK_ATOM(nl, "nl")
 
 // Names for editor transactions
 GK_ATOM(TypingTxnName, "Typing")
 GK_ATOM(IMETxnName, "IME")
 GK_ATOM(DeleteTxnName, "Deleting")
 
 // IPC stuff
--- a/layout/generic/nsTextRunTransformations.cpp
+++ b/layout/generic/nsTextRunTransformations.cpp
@@ -230,32 +230,45 @@ enum LanguageSpecificCasingBehavior {
   eLSCB_Greek, // strip accent when uppercasing Greek vowels
   eLSCB_Irish, // keep prefix letters as lowercase when uppercasing Irish
   eLSCB_Turkish // preserve dotted/dotless-i distinction in uppercase
 };
 
 static LanguageSpecificCasingBehavior
 GetCasingFor(const nsIAtom* aLang)
 {
+  if (!aLang) {
+    return eLSCB_None;
+  }
   if (aLang == nsGkAtoms::tr ||
       aLang == nsGkAtoms::az ||
       aLang == nsGkAtoms::ba ||
       aLang == nsGkAtoms::crh ||
       aLang == nsGkAtoms::tt) {
     return eLSCB_Turkish;
   }
   if (aLang == nsGkAtoms::nl) {
     return eLSCB_Dutch;
   }
   if (aLang == nsGkAtoms::el) {
     return eLSCB_Greek;
   }
-  if (aLang == nsGkAtoms::ga_ie) {
+  if (aLang == nsGkAtoms::ga) {
     return eLSCB_Irish;
   }
+
+  // Is there a region subtag we should ignore?
+  nsAtomString langStr(const_cast<nsIAtom*>(aLang));
+  int index = langStr.FindChar('-');
+  if (index > 0) {
+    langStr.Truncate(index);
+    nsCOMPtr<nsIAtom> truncatedLang = do_GetAtom(langStr);
+    return GetCasingFor(truncatedLang);
+  }
+
   return eLSCB_None;
 }
 
 bool
 nsCaseTransformTextRunFactory::TransformString(
     const nsAString& aString,
     nsString& aConvertedString,
     bool aAllUppercase,