Closed Bug 2003721 Opened 1 month ago Closed 1 month ago

"lang" attribute should be ascii-case-insensitive

Categories

(Core :: CSS Parsing and Computation, defect, P3)

Firefox 145
defect

Tracking

RESOLVED FIXED
147 Branch
Tracking Status
firefox147 --- fixed

People

(Reporter: e-school, Assigned: jfkthame)

Details

Attachments

(6 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:145.0) Gecko/20100101 Firefox/145.0

Steps to reproduce:

I had a lang attribute <html lang="DE-DE">.

Actual results:

This should actually be recognised, but the CSS declaration ‘hyphens: auto’ does not work in this case. It does work with ‘de-DE’.

Expected results:

Although a notation such as "de-DE" is quite common, lang attributes should be case-insensitive according to RFC5646 and RFC5234.
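For illustration, ASCII-case-insensitive matching of language tags could be sketched as follows in Python. This is not Gecko code; the helper names `ascii_lower` and `lang_tags_equal` are made up for this example. Only the letters A-Z are folded, so non-ASCII characters that a full Unicode lowercasing would map onto ASCII letters are left alone.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only ASCII letters A-Z; leave every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def lang_tags_equal(a: str, b: str) -> bool:
    """Compare two language tags ASCII-case-insensitively (hypothetical helper)."""
    return ascii_lower(a) == ascii_lower(b)


if __name__ == "__main__":
    print(lang_tags_equal("DE-DE", "de-DE"))
    print(lang_tags_equal("de-DE", "de-CH"))
```

With this comparison, "DE-DE" and "de-DE" are the same tag, while "de-DE" and "de-CH" remain distinct.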

Component: Untriaged → CSS Parsing and Computation
Product: Firefox → Core
Summary: Lang-Attributes should be case insensitive → "lang" attribute selector should be case-insensitive
Severity: -- → S3
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: "lang" attribute selector should be case-insensitive → "lang" attribute should be ascii-case-insensitive

Indeed, for hyphens: auto we only seem to recognize the lang in its "canonical" case.

This also affects the language-dependent behavior of text-transform:

data:text/html,<div lang="NL-NL" style="text-transform:capitalize">how to capitalize ijsselmeer?

should result in the digraph "IJ" being capitalized as a unit, but it doesn't.
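As a rough illustration of the expected behavior (a Python sketch, not the Gecko implementation; `capitalize_word` is a hypothetical helper), capitalization that respects the Dutch digraph has to compare the primary language subtag without regard to case:

```python
def capitalize_word(word: str, lang: str) -> str:
    """Capitalize the first letter of a word, treating a leading Dutch 'ij' as one unit.

    The primary language subtag is lowercased before comparison, so
    lang="NL-NL" and lang="nl-NL" behave identically. Assumes an ASCII tag.
    """
    if not word:
        return word
    primary = lang.split("-")[0].lower()
    if primary == "nl" and word[:2].lower() == "ij":
        return "IJ" + word[2:]
    return word[0].upper() + word[1:]


if __name__ == "__main__":
    print(capitalize_word("ijsselmeer", "NL-NL"))
    print(capitalize_word("ijsselmeer", "en"))
```

Under these assumptions the Dutch case yields "IJsselmeer", while the same word tagged as English yields "Ijsselmeer".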

For some other purposes, however, it is correctly treated as case-insensitive; e.g.

data:text/html,<div lang="DE-DE">Does this use <q>German</q> quotes?

For quotes, this is handled in intl::QuotesForLang by parsing the lang as a locale code and canonicalizing the subtags if we don't find an entry for the original lang as provided. But I wonder if we should do the canonicalization earlier (before storing in the mLanguage field of the StyleFont struct)?
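The lookup-then-canonicalize pattern described here can be sketched in Python. Everything below is an assumption made for illustration: the `QUOTES` table, the `crude_canonicalize` stand-in for real locale canonicalization, and the step that drops trailing subtags are not taken from intl::QuotesForLang.

```python
from typing import Optional, Tuple

# Hypothetical table keyed by canonical-case tags; not Gecko's actual data.
QUOTES = {
    "de": ("\u201e", "\u201c"),
    "en": ("\u201c", "\u201d"),
    "de-CH": ("\u00ab", "\u00bb"),
}


def crude_canonicalize(tag: str) -> str:
    """Stand-in for locale canonicalization: lowercase the language subtag
    and uppercase any later two-letter subtag (treated as a region)."""
    parts = tag.split("-")
    out = [parts[0].lower()]
    for p in parts[1:]:
        out.append(p.upper() if len(p) == 2 else p.lower())
    return "-".join(out)


def quotes_for_lang(lang: str) -> Optional[Tuple[str, str]]:
    # Fast path: the tag exactly as the author wrote it.
    if lang in QUOTES:
        return QUOTES[lang]
    # Slow path: canonicalize, then drop trailing subtags until a key matches.
    parts = crude_canonicalize(lang).split("-")
    while parts:
        key = "-".join(parts)
        if key in QUOTES:
            return QUOTES[key]
        parts.pop()
    return None
```

If canonicalization happened once, before the tag is stored, consumers like this could skip the slow path's canonicalization step and rely on exact comparisons.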

Yes, this code could lowercase the atom if needed. There are a few places that use the HTML language directly via nsIContent::GetLang, though...

Yeah, that should work. My first thought was to do it in the GeckoBindings code (in Gecko_nsStyleFont_SetLang), but MapLangAttributeInto is better, as the potential effect on text-emphasis-position there should also be fixed.

In principle I think it seems preferable to use the LocaleService to canonicalize the tag, rather than simply lowercasing the string, so that we can keep using canonical form for things like hyphenation locales (e.g. Swiss German is de-CH, not de-ch) and rely on exact comparisons.
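To show the difference between lowercasing and canonical casing, here is a small Python sketch of the case conventions recommended in RFC 5646 section 2.1.1: subtags are lowercase, except that two-letter subtags not at the start are uppercase and four-letter subtags not at the start are titlecase. The function name `canonical_case` is hypothetical, the sketch assumes an ASCII tag, and it treats everything following a singleton as lowercase. It only fixes case; it does not do the alias replacement that full locale canonicalization performs.

```python
def canonical_case(tag: str) -> str:
    """Apply RFC 5646 section 2.1.1 case conventions to a language tag."""
    out = []
    after_singleton = False
    for i, sub in enumerate(tag.split("-")):
        if i > 0 and not after_singleton and len(sub) == 2:
            out.append(sub.upper())        # region, e.g. "CH"
        elif i > 0 and not after_singleton and len(sub) == 4:
            out.append(sub.capitalize())   # script, e.g. "Hant"
        else:
            out.append(sub.lower())        # language, variants, extensions
        if len(sub) == 1:
            after_singleton = True         # e.g. "x" or "u" starts an extension
    return "-".join(out)


if __name__ == "__main__":
    for tag in ("DE-ch", "ZH-hant-tw", "EN-ca-X-CA"):
        print(tag, "->", canonical_case(tag))
```

Here "DE-ch" becomes "de-CH" rather than "de-ch", so a table of hyphenation locales keyed by canonical tags can still be searched with exact comparisons.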

This does have a couple of side-effects, though: we currently have (Northern) Kurdish hyphenation patterns labelled using their ISO 639-3 code of kmr, but Locale::Canonicalize maps this to ISO 639-1 ku. So we'll need to re-label the patterns (which in the TeX world are tagged as kmr) to ku in order to use them.

The other thing I notice is that we have a couple of tests that explicitly assert that ISO 639-3 codes for Sindhi and Urdu are not recognized as triggering language-dependent behavior, as they are not the canonical tags for these languages. If we start canonicalizing the tag, these examples will change behavior. I think that's an acceptable -- perhaps even desirable -- change, though.
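A minimal Python sketch of these side-effects, under stated assumptions: the alias table below is a tiny hand-written excerpt (kmr to ku as described above, plus the ISO 639-3 codes snd and urd for Sindhi and Urdu mapping to sd and ur), and `HYPHENATION_LOCALES` is a hypothetical set of pattern labels, not Gecko's real list.

```python
# Hand-written excerpt of language aliases, for illustration only.
LANGUAGE_ALIASES = {"kmr": "ku", "snd": "sd", "urd": "ur"}

# Hypothetical set of locales for which hyphenation patterns are available,
# labelled with canonical tags (i.e. after renaming 'kmr' to 'ku').
HYPHENATION_LOCALES = {"ku", "de-CH"}


def canonical_language(tag: str) -> str:
    """Replace an aliased primary language subtag with its canonical form."""
    parts = tag.split("-")
    primary = parts[0].lower()
    parts[0] = LANGUAGE_ALIASES.get(primary, primary)
    return "-".join(parts)


def has_hyphenation(tag: str) -> bool:
    """Look up patterns by the canonicalized tag rather than the raw attribute."""
    return canonical_language(tag) in HYPHENATION_LOCALES


if __name__ == "__main__":
    print(canonical_language("kmr"), has_hyphenation("kmr"))
    print(canonical_language("urd-PK"))
```

In this model a document tagged kmr only finds its patterns if they are labelled ku, and tags such as snd or urd start behaving like sd and ur, which is the behavior change the existing tests would observe.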

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED

This doesn't change behavior, just cleans up the code a bit by using the
nsStyleUtil helper.

Pushed by jkew@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/527055fac7f9
https://hg.mozilla.org/integration/autoland/rev/a4d234158093
Canonicalize lang-subtag case during attribute mapping. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/7a63ee4efdf2
https://hg.mozilla.org/integration/autoland/rev/71d4a1beda0f
Update reftests for language-dependent rendering with non-canonical lang tags. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/7ef07f8cb4bb
https://hg.mozilla.org/integration/autoland/rev/385ee975ce35
Rename 'kmr' hyphenation locale data to 'ku' to match canonicalized Locale code. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/c6b00b95bb76
https://hg.mozilla.org/integration/autoland/rev/0a678995cbfc
Add some tests for case-insensitivity of language tag matching. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/a876179e4551
https://hg.mozilla.org/integration/autoland/rev/6ad3d7a1b8df
Slightly simplify intl::QuotesForLang, now that the lang tag in nsStyleFont is known to be canonicalized. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/33dfc6cf93fb
https://hg.mozilla.org/integration/autoland/rev/0bd496793009
Convert GetCasingFor() in nsTextRunTransformations to use nsStyleUtil::MatchesLanguagePrefix. r=layout-reviewers,emilio
Pushed by imoraru@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/98714f708a64
https://hg.mozilla.org/integration/autoland/rev/58736bf5585a
Revert "Bug 2003721 - Convert GetCasingFor() in nsTextRunTransformations to use nsStyleUtil::MatchesLanguagePrefix. r=layout-reviewers,emilio" for causing multiple failures.
Flags: needinfo?(jfkthame)
Pushed by jkew@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/e4f1380f8cd0
https://hg.mozilla.org/integration/autoland/rev/049c59a915d2
Canonicalize lang-subtag case during attribute mapping. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/997f833bc32e
https://hg.mozilla.org/integration/autoland/rev/56273d257339
Update reftests for language-dependent rendering with non-canonical lang tags. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/1dd24934d214
https://hg.mozilla.org/integration/autoland/rev/574cf09cc0e7
Rename 'kmr' hyphenation locale data to 'ku' to match canonicalized Locale code. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/d279a1a33369
https://hg.mozilla.org/integration/autoland/rev/e957834932b9
Add some tests for case-insensitivity of language tag matching. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/d5b1706ccdd6
https://hg.mozilla.org/integration/autoland/rev/ef4730680824
Slightly simplify intl::QuotesForLang, now that the lang tag in nsStyleFont is known to be canonicalized. r=layout-reviewers,emilio

https://github.com/mozilla-firefox/firefox/commit/35da0b7f0f73
https://hg.mozilla.org/integration/autoland/rev/f3671a301bd5
Convert GetCasingFor() in nsTextRunTransformations to use nsStyleUtil::MatchesLanguagePrefix. r=layout-reviewers,emilio
Flags: needinfo?(jfkthame)
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/56553 for changes under testing/web-platform/tests
Upstream PR merged by moz-wptsync-bot
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/56555 for changes under testing/web-platform/tests
Upstream PR merged by moz-wptsync-bot
QA Whiteboard: [qa-triage-done-c148/b147]