Closed Bug 1448891 Opened 6 years ago Closed 6 years ago

Private-use locale subtags are incorrectly treated as case sensitive

Categories

(Core :: Internationalization, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: jfkthame, Assigned: zbraniecki)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

The recently-added code (from bug 1447189) to support private-use subtags fails to case-fold them, although BCP47 is explicit[1] that even private-use subtags are case insensitive. So "x-foo-BAR" and "x-Foo-Bar" ought to match, but currently won't.

[1] https://tools.ietf.org/html/bcp47#section-2.1.1
Thanks for filing this Jonathan.

I took a look at this and it seems that:

1) We don't use private tags when doing Locale.Match(), so I think this one is not affected
2) In the operator `==` overload we do compare the mPrivateUse arrays.

If I'm not mistakes it means that the only one we want to fix is (2). It's a special use case because we do not canonicalize the input tags here.

Is there any pattern in our C++ to compare two arrays of strings in a case insensitive way?
Flags: needinfo?(jfkthame)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #1)
> Thanks for filing this Jonathan.
> 
> I took a look at this and it seems that:
> 
> 1) We don't use private tags when doing Locale.Match(), so I think this one
> is not affected
> 2) In the operator `==` overload we do compare the mPrivateUse arrays.
> 
> If I'm not mistakes it means that the only one we want to fix is (2). It's a
> special use case because we do not canonicalize the input tags here.
> 
> Is there any pattern in our C++ to compare two arrays of strings in a case
> insensitive way?

Not that I'm aware of, I think we'd have to do it by hand.

However, I was assuming the fix here would be to canonicalize the private-use subtags when adding them to the array, rather than to case-fold at comparison time. https://tools.ietf.org/html/bcp47#section-2.1.1 suggests that by convention, "all subtags, including extension and private use subtags, use lowercase letters..." except for Region and Script, so following that recommendation seems the simplest solution.
Flags: needinfo?(jfkthame)
Ahh, I don't know how I missed it while reading BCP47. Thank you! Patch ready :)
Assignee: nobody → gandalf
Status: NEW → ASSIGNED
Comment on attachment 8962613 [details]
Bug 1448891 - Lowercase private use tags in MozLocale in accordance with BCP47.

https://reviewboard.mozilla.org/r/231454/#review236922

::: intl/locale/MozLocale.cpp:66
(Diff revision 1)
> -      mPrivateUse.AppendElement(subTag);
> +      nsAutoCString lcSubTag(subTag);
> +      ToLowerCase(lcSubTag);
> +      mPrivateUse.AppendElement(lcSubTag);

AppendElement returns a pointer to the newly-added element, so I think you can just do

    ToLowerCase(*mPrivateUse.AppendElement(subTag));

here and skip the additional local variable.
Nice! I switched both cases which can benefit from this.
Comment on attachment 8962613 [details]
Bug 1448891 - Lowercase private use tags in MozLocale in accordance with BCP47.

https://reviewboard.mozilla.org/r/231454/#review237000

::: intl/locale/MozLocale.cpp:91
(Diff revision 2)
>      } else if (position <= 4 && slen >= 3 && slen <= 8) {
>        // we're quirky here because we allow for variant to be 3 char long.
>        // BCP47 requires variants to be 5-8 char long at lest.
>        //
>        // We do this to support the `ja-JP-mac` quirk that we have.
> -      nsAutoCString lcSubTag(subTag);
> +      ToLowerCase(*mVariants.AppendElement(subTag));

Uhh...no, I don't think you want to change this one. We deliberately used InsertElementSorted rather than AppendElement here so that the variants would be sorted and so two locales that have the same collection of variants will compare as equal, regardless of the ordering of the variants in the original tag.

So "en-US-basiceng-unifon" and "en-US-UniFon-BasicEng" should compare as equal, because both case and variant ordering should be normalized here. (Might want to add a testcase to check this!)
Uhh, thank you! You're of course right. I'll fix that :)
Comment on attachment 8962613 [details]
Bug 1448891 - Lowercase private use tags in MozLocale in accordance with BCP47.

https://reviewboard.mozilla.org/r/231454/#review237010
Attachment #8962613 - Flags: review?(jfkthame) → review+
Pushed by zbraniecki@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c828f608bf8a
Lowercase private use tags in MozLocale in accordance with BCP47. r=jfkthame
https://hg.mozilla.org/mozilla-central/rev/c828f608bf8a
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: