Closed Bug 259987 Opened 21 years ago Closed 10 years ago

cookies can't use unicode

Categories

(Core :: Networking: Cookies, defect)

x86
Windows XP
defect
Not set
minor

RESOLVED WONTFIX

People

(Reporter: philip.nilsson, Unassigned)

Details

(Keywords: intl)

Attachments

(1 file, 1 obsolete file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.3) Gecko/20040916 Firefox/0.10 (MOOX M3)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.3) Gecko/20040916 Firefox/0.10 (MOOX M3)

This is the scenario: I have a page that's UTF-8 encoded. I enter something like 'テスト', submit the form, and choose to remember it. The next time I see the page, 'ãã¹ã' is filled in. What I entered is submitted to the page and shows up as it should. I might be able to write up a small testcase.

Reproducible: Always

Steps to Reproduce:
I change my mind, this bug isn't about the form/password manager at all. It's about cookies/stuff. Which might not be a bug at all, I haven't read the spec for cookies. The javascript in the source sets the form fields to values stored in cookies. It's quite clear that it is what's causing it. I should have looked this up first.
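The garbled value in the report has the shape you would get if the UTF-8 bytes of 'テスト' were read back one byte per character, as an ISO-8859-1 reader would. A minimal sketch of that effect, assuming a runtime that provides `TextEncoder` and `TextDecoder`; the helper names are made up for illustration and are not Mozilla code:

```javascript
// Hypothetical helpers, for illustration only (not Mozilla code).

// Encode a string as UTF-8, then treat every byte as one character,
// the way an ISO-8859-1 reader would.
function misreadUtf8AsLatin1(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Reverse the misreading: map each character back to a byte, decode as UTF-8.
function repairLatin1AsUtf8(garbled) {
  const bytes = Uint8Array.from(garbled, (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes);
}

const garbled = misreadUtf8AsLatin1("テスト");
// Several of the resulting characters are invisible C1 controls (U+0080..U+009F);
// dropping them leaves only the part a form field would visibly show.
const visible = garbled.replace(/[\u0080-\u009f]/g, "");
const repaired = repairLatin1AsUtf8(garbled);
```

This only shows that the symptom is consistent with a UTF-8 to single-byte misreading; it does not identify where in the browser the misreading happens.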
Severity: normal → minor
Component: Form Manager → Browser-General
Assignee: dveditz → darin
Component: Browser-General → Networking: Cookies
QA Contact: core.networking.cookies
Attachment #159183 - Attachment mime type: text/plain → text/html
Summary: forms unable to remember unicode data correctly → cookies can't use unicode
Attached file testcase
Attaching testcase.

A point of trouble: http://lxr.mozilla.org/seamonkey/source/content/html/document/src/nsHTMLDocument.cpp#1852
NS_LossyConvertUTF16toASCII will surely lose data here.
Attachment #159183 - Attachment is obsolete: true
Content sink converts to UTF8 instead: http://lxr.mozilla.org/seamonkey/source/content/base/src/nsContentSink.cpp#394

Not clear what plugins are using: http://lxr.mozilla.org/seamonkey/source/modules/plugin/base/src/nsPluginHostImpl.cpp#6050

nsICookieService is not technically marked frozen, could we just change the interface to AUTF8String instead of string?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hmm... do any of the cookie specs talk about unicode support? Is there any kind of special escaping formula? Do we limit our non-ASCII support to the DOM APIs for cookie manipulation? What about HTTP headers and/or HTTP-EQUIV? How do we translate a Unicode cookie value into something that can be inserted into a "Cookie:" request header? We should test IE to find out what it does, and also find out whether there is any standard for this.
IE 6.0 rendering of the testcase: setting unicode: foo=テスト
I put up a small testcase for cookie headers at http://yellow.exedo.nl/~michiel/setcookie.php
It works using: header("Set-Cookie: TestCookie=テストb");
According to bernd, IE shows the unicode.
the php manual about SetCookie says: "Note that the value portion of the cookie will automatically be urlencoded when you send the cookie" But ethereal tells me it actually sends raw utf8. I'm confused.
(In reply to comment #7)
> the php manual about SetCookie says:
> "Note that the value portion of the cookie will automatically be urlencoded when
> you send the cookie"
> But ethereal tells me it actually sends raw utf8. I'm confused.

That applies to the setcookie() function only. With setrawcookie() or header(), the raw data passed by the user's PHP script is sent to the client unchanged. When you use setrawcookie() or header(), it is up to you whether or not you violate the rules defined by the HTTP 1.1 specification.
Assignee: darin → nobody
Keywords: intl
HTTP headers can't contain UTF-8; anything that's not ISO-8859-1 has to be encoded in some fashion. Since the content of cookies is (supposed to be) opaque to the client, it shouldn't matter what encoding scheme a server uses, as long as they use one.
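One encoding scheme that keeps a cookie value ASCII-only is percent-encoding of its UTF-8 bytes, which is what JavaScript's `encodeURIComponent` produces. A sketch under that assumption; `isAsciiOnly` and `buildSetCookieHeader` are hypothetical helpers, and nothing in this bug says which scheme any particular server actually uses:

```javascript
// Hypothetical helpers, for illustration only.

// True when every character is in the 7-bit ASCII range.
function isAsciiOnly(s) {
  return /^[\x00-\x7f]*$/.test(s);
}

// Build a Set-Cookie header line whose value is percent-encoded UTF-8.
function buildSetCookieHeader(name, value) {
  return "Set-Cookie: " + name + "=" + encodeURIComponent(value);
}

const header = buildSetCookieHeader("TestCookie", "テストb");
// header is now: Set-Cookie: TestCookie=%E3%83%86%E3%82%B9%E3%83%88b

// The receiving side reverses the encoding to get the original text back.
const encodedValue = header.slice(header.indexOf("=") + 1);
const roundTripped = decodeURIComponent(encodedValue);
```

Because the client treats the value as opaque, the decoding step belongs to whichever script reads the cookie back, not to the browser's cookie storage.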
-> me for investigation
Assignee: nobody → dwitte
Reassigning to nobody. If anyone wants to work on this, feel free!
Assignee: dwitte → nobody
I see that this was reported years ago and has only one recent comment. Perhaps this will be of some help.

I have observed a problem with unicode characters in cookies saved locally in FF5 under Vista Home Premium. If one writes a cookie with data such as <<abc¶§½xyz>>, then after the browser is shut down and internal caches are cleared, the recovered value shows <<abc¶§½xyz>>. I first noticed this a few weeks ago with version 4.x (I think) and now with version 5. (I did not have the problem with version 3!)

Looking at the distorted recovered data, it appears as if the unicode data was in an iso-1251-1 file, suggesting to me (who is greatly ignorant of the details) that the cookies are not being saved in a unicode file. The repeatability is very high but not certain, so please try it a couple of times if necessary.

A simple program for testing:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>cookie test</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<script type="text/javascript">
function wrt() {
  var date1 = new Date() ;
  date1.setTime(date1.getTime()+24*60*60*1000) ;
  document.cookie="testCookie="+document.getElementById("dat").value+"; expires="+date1.toGMTString()+"; " ;
}
function rd() {
  xxx=document.cookie+';';
  j=xxx.indexOf('testCookie') ;
  document.getElementById("xrd").innerHTML=xxx.substring(j+11,xxx.indexOf(';',j)) ;
}
</script>
</head>
<body>
<p><button onclick="wrt()">Write cookie</button>&nbsp;&nbsp;
<button onclick="rd()">Read cookie</button>&nbsp;&nbsp;</p>
<p>Field to put in cookie <input type="text" value="abc¶§½xyz" id="dat"></p>
<p>Recovered value = &nbsp; <span id="xrd"> </span></p>
</body>
</html>

To test: copy the program in unicode. Run. Check the recovered cookie data. Shut down FF and do some other stuff to clear caches. Reload the file and check the cookie. You can also inspect the cookie with the cookiemanager addon.
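If the garbling comes from non-ASCII characters in the stored value, one possible workaround for a test page like the one above is to percent-encode the value before assigning it to `document.cookie` and decode it after reading. A sketch with hypothetical helpers `makeCookieString` and `readCookieValue`; they operate on plain strings so they can be tried outside a browser, and whether this avoids the behavior reported for Firefox 4 and 5 has not been verified here:

```javascript
// Hypothetical helpers, for illustration only.

// Build the string to assign to document.cookie, with an ASCII-only value.
function makeCookieString(name, value, expires) {
  return name + "=" + encodeURIComponent(value) + "; expires=" + expires.toUTCString();
}

// Find one cookie by exact name in a "a=1; b=2" style string and decode it.
function readCookieValue(cookieHeader, name) {
  for (const pair of cookieHeader.split(";")) {
    const trimmed = pair.trim();
    const eq = trimmed.indexOf("=");
    if (eq !== -1 && trimmed.slice(0, eq) === name) {
      return decodeURIComponent(trimmed.slice(eq + 1));
    }
  }
  return null; // no cookie with that name
}

const expires = new Date(Date.UTC(2030, 0, 1)); // arbitrary future date
const written = makeCookieString("testCookie", "abc¶§½xyz", expires);
// Reading document.cookie returns only name=value pairs, without attributes.
const stored = written.split(";")[0];
const recovered = readCookieValue("other=1; " + stored, "testCookie");
```

In the page above, `wrt()` would assign the result of `makeCookieString` to `document.cookie`, and `rd()` would pass `document.cookie` to `readCookieValue`. Matching on the exact name also avoids the substring search in `rd()`, which can pick up a different cookie whose name merely contains `testCookie`.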
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX