Closed Bug 427957 Opened 16 years ago Closed 16 years ago

Mixed direction IDN doesn't work (right-to-left and left-to-right characters)

Categories

(Core :: Networking, defect, P1)

defect

Tracking

()

VERIFIED FIXED

People

(Reporter: dev, Assigned: smontagu)

References

(Blocks 1 open bug, )

Details

(Keywords: intl, regression, rtl)

Attachments

(1 file)

Works in Safari & FF2, but not in FF3. Thus, Regression from FF2.

Trying to load http://www.מיץפטל.com/ produces an "Address Not Found" page.

Expected result:
Should load the page (forward URL to http://www.xn--debbkf5af.com/).
Flags: blocking-firefox3?
Version: unspecified → Trunk
-> Core::Internationalization
Assignee: nobody → smontagu
Component: Location Bar and Autocomplete → Internationalization
Flags: blocking-firefox3?
Keywords: regression
Product: Firefox → Core
QA Contact: location.bar → i18n
The testcases at http://idn.icann.org/#The_example.test_names still work, including Hebrew and Arabic script, so the problem is more specific.

Regression is between 2007-12-03-02 and 2007-12-04-02, which suggests bug 402008.
Assignee: smontagu → nobody
Blocks: 402008
Component: Internationalization → Networking
Keywords: intl
OS: Mac OS X → All
QA Contact: i18n → networking
Hardware: Macintosh → All
The problem seems to be that we are validating the domain name as a whole according to RFC 3454, not the individual labels as RFC 3490 specifies.

The bidi restrictions in RFC 3454 prohibit mixing right-to-left and left-to-right characters in the same string, so מיץפטל is valid, but מיץפטל.com is not.
Flags: blocking1.9?
Summary: Hebrew IDN doesn't work → Mixed direction IDN doesn't work (right-to-left and left-to-right characters)
Keywords: rtl
Attached patch PatchSplinter Review
Assignee: nobody → smontagu
Status: NEW → ASSIGNED
Attachment #314935 - Flags: superreview?(cbiesinger)
Attachment #314935 - Flags: review?(cbiesinger)
Comment on attachment 314935 [details] [diff] [review]
Patch

nice catch - although i'm curious why bug 402008 regressed this, since it just refactored code and didn't really change behavior. anyway, thanks for patching :)

>Index: netwerk/dns/src/nsIDNService.cpp
>===================================================================
>+  PRUint32 len = 0, offset = 0;
>+  nsresult rv;
>+  nsAString::const_iterator start, end;
>+  inUTF16.BeginReading(start);
>+  inUTF16.EndReading(end);
>+  outUTF16.Truncate();

fwiw that .Truncate() isn't necessary, the string was just declared above.
(In reply to comment #6)
> i'm curious why bug 402008 regressed this, since it just
> refactored code and didn't really change behavior.

I suppose it exposed it rather than causing it, but fwiw the crucial difference is the change from:

-    if (gShowPunycode || NS_FAILED(gIDN->Normalize(host, temp)))
-        return gIDN->ConvertUTF8toACE(host, result);

to:

+    rv = Normalize(input, _retval);
+    if (NS_FAILED(rv)) return rv;

...since Normalize() had this bug and ConvertUTF8ToACE() didn't.
and I guess that explains bug 355156, too :)
Blocks: 355156
Per blocking request, can you please give a few sentences explaining the impact
of this bug?  Sounds pretty bad, but we're asking ourselves whether or not
individual bugs would stop ship at this point.
Comment on attachment 314935 [details] [diff] [review]
Patch

+    if (*start++ == (PRUnichar)'.') {

PRUnichar('.')

+      outUTF16.Append((PRUnichar)'.');

here too
Attachment #314935 - Flags: superreview?(cbiesinger)
Attachment #314935 - Flags: superreview+
Attachment #314935 - Flags: review?(cbiesinger)
Attachment #314935 - Flags: review+
(In reply to comment #9)
> Per blocking request, can you please give a few sentences explaining the impact
> of this bug?

It's a standards compliance issue (RFC 3490), and a regression, and it has the effect of wiping whole countries off the IDN map. I don't know how many bidirectional IDNs are actually out there, Behnam may have some data about that from his work at nic.ir, but the number is only going to grow. I really don't think we can ship with this bug.

And we do have a patch in hand with reviews and unit tests :)

Attachment #314935 - Flags: approval1.9?
Well, at nic.ir, we have more than 3'000 domains under .ایران.ir (.IRAN.ir) [1] which all are affected by this bug.

Beside that, GoDaddy has recently started selling IDN domains under .com and .net, which the domain in all Arabic, Persia, Urdu, and Hebrew languages would be unusable because of this bug. [2]

Also this bug do affect all IDN sub-domains under every LTR TLD, which might cause some Arabic/Hebrew dictionaries/encyclopedias stop working.

1: http://www.nic.ir/Statistics
2: Example: http://www.ایران.com/ (IRAN.com)
(In reply to comment #11)
> It's a standards compliance issue (RFC 3490), and a regression, and it has the
> effect of wiping whole countries off the IDN map. I don't know how many

Convincing!
Flags: blocking1.9? → blocking1.9+
Priority: -- → P1
Comment on attachment 314935 [details] [diff] [review]
Patch

a1.9=beltzner
Attachment #314935 - Flags: approval1.9? → approval1.9+
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Blocks: Persian
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: