Closed Bug 534809 Opened 15 years ago Closed 15 years ago

Non-punycoded IDNs show up in the "Ask" cookie sheet as gibberish

Categories

(Camino Graveyard :: General, defect)

All
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: alqahira, Assigned: alqahira)

References

()

Details

Attachments

(3 files)

STR:

1) Enable asking in the Privacy prefs
2) Visit http://例子.测试/首页 or http://简体中文.idn.icann.org/首页
3) Observe the cookie sheet

ER: Non-Roman characters show up properly, like in Firefox
AR: Non-Roman characters show up as gibberish

I assume we have the same problem with single-byte (European) IDN domains, but I haven’t been able to find any that 1) aren't redirects and 2) are under TLDs where homographs are disallowed (so that punycode is not enabled).

In the cookies list and exceptions list, these domains show up in punycode (which is really unhelpful when actually hunting cookies/exceptions, but that's another bug).
So, I'm a little confused here (well, more than a little, which is directly attributable to my lack of non-AppleScript programming knowledge).

Here's the Core cookie prompt service, which says hostname is an nsACString (which is 8 bits):

http://mxr.mozilla.org/mozilla/source/extensions/cookie/nsCookiePromptService.cpp#60

And here's where something is doing something to the hostname that involves UTF-8 to UTF-16 conversion: 

http://mxr.mozilla.org/mozilla/source/extensions/cookie/nsCookiePromptService.cpp#77

Here's where we grab the hostname (as PromiseFlatCString, which is 8-bits):

http://mxr.mozilla.org/mozilla/source/camino/src/browser/CocoaPromptService.mm#601

(For reference, here's where toolkit grabs the string for their awful dialogues: http://mxr.mozilla.org/mozilla/source/toolkit/components/cookie/content/cookieAcceptDialog.js#113)

So, everyone's taking a UTF-16 string and passing it around as 8-bits?  How is toolkit getting non-gibberish, then?  Where am I messed up reading these string conversions?
Attached patch Hack, v1.1Splinter Review
A couple of weeks of NSLogging and a much-bloodied head later, I've figured out the bug, and I even have *a* fix.  I'm not sure that it's the right fix, though.

I eventually determined, via |NSLog(@"Hostname as stringWith_nsACString: %@", [NSString stringWith_nsACString:hostname]);|, that we were getting a non-gibberish hostname from Core and that turning the nsACString into an NSString was fine.  I then NSLogged dialogText before we sent it off to be PRUnichared and discovered dialogText had the same gibberish as the final sheet, so it was creating the formatted string that somehow screwed things up.

When cl and I thought that it was just an encoding problem, I ran a set of tests with that IDN hostname masquerading as "Remember this decision?" in the strings file; they came back fine, but I noticed the format specifier in CookieText was %s, and when I finally got around to NSLogging hostname, I realized maybe it was the format specifier mismatching.  

Turns out |PromiseFlatCString(hostname).get()| isn't %s, %S, %c, or %C; I have no idea what it is.  But I could take the hostname, stringWith_nsACString it, and have an NSString that would make everything happy with %@ as the specifier (and, apparently according to the Gecko string guide and bug 394161, we shouldn't be using PromiseFlatCString anyway).

So this patch does that.  However, since this code is not only not AppleScript, not even simple Cocoa, but Gecko strings stuff that no-one understands, I certainly don't know if there are issues with lifetimes and null-termination and who-knows-what-else that I don't even know that I need to think about here.  For that reason I'm tagging smorgan and pink, since hopefully one of you will be able to pick up on anything of that nature.
Assignee: nobody → alqahira
Status: NEW → ASSIGNED
Attachment #419726 - Flags: superreview?(mikepinkerton)
Attachment #419726 - Flags: review?(stuart.morgan+bugzilla)
Comment on attachment 419726 [details] [diff] [review]
Hack, v1.1

I don't know anything about PromiseFlatCString, so I don't know why the old code was wrong (maybe it gives some legacy encoding back rather than UTF-8?), but the new code certainly looks good: stringWith_nsACString won't have any termination problems, and it's autoreleased so the lifetimes work out. Nice find!

r=smorgan
Attachment #419726 - Flags: review?(stuart.morgan+bugzilla) → review+
Comment on attachment 419726 [details] [diff] [review]
Hack, v1.1

sr=pink.

promiseFlatString is almost certainly the wrong thing for anything displayed to the user.
Attachment #419726 - Flags: superreview?(mikepinkerton) → superreview+
Landed on cvs trunk.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
This needs to go on the l10n release checklist for 2.1, too.
(In reply to comment #7)
> This needs to go on the l10n release checklist for 2.1, too.

CCing Marcello so that hopefully one of us will remember to mention the %s to %@ change in the string on the l10n list once Alpha 1 is out.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: