Closed Bug 380383 Opened 13 years ago Closed 13 years ago
[FIX]about:blank encoding is not consistent
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; lt; rv:18.104.22.168) Gecko/20070309 Firefox/22.214.171.124
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; lt; rv:126.96.36.199) Gecko/20070309 Firefox/188.8.131.52

My page opens a new window using "window.open", then creates a form using "document.write", then submits the data using "document.forms.submit".

Problem:
On IE6, the data received on the server is encoded as UTF-8.
On FF1.5, the data received on the server is encoded as UTF-8.
On FF2.0, the data received on the server is encoded as UTF-16.

Reproducible: Always

Steps to Reproduce:
0. Create a page using UTF-8 encoding.
1. Open a new window using "window.open".
2. Create a form using "document.write".
3. Submit the data using "document.forms.submit".

Actual Results: the data received on the server is encoded as UTF-16.
Expected Results: the data received on the server is encoded as UTF-8.
Is this better on trunk now that bug 255820 is fixed? You'll need to download a nightly to test, because that bug was fixed very recently.
I installed http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2007-05-13-03-mozilla1.8/firefox-184.108.40.206pre.en-US.win32.installer.exe but the bug is still there.
That's a 2.0.0.x nightly. You need to test a trunk nightly.
I tested http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2007-05-14-05-trunk/firefox-3.0a5pre.en-US.win32.installer.exe (If this is not the correct version, please send me the link. Thank you.) This version does not use UTF-16, but it still does not use the correct encoding. My page is in UTF-8, and if I open a window using "window.open", I expect the window to be in UTF-8, but the version I tested opens the window in ISO-8859-1. I cannot "document.write" Lithuanian letters. FF1.5 and IE6 do not have this problem.
Assignee: nobody → general
Component: General → DOM: Level 0
Product: Firefox → Core
QA Contact: general → ian
> i cannot "document.write" using lithuanian letters. Testcase, please? A testcase for the original bug here would be very helpful also.
I assume that's the testcase for this bug, right? Note that I don't have a server, so I can't do step 3. Is there another way to tell what encoding the form is using? e.g. doing a GET instead of a POST and looking at the URI? Is there another bug on the Lithuanian problem you mentioned?
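One way to read the charset off a GET URI, as suggested above, is to compare the percent-escapes with what each candidate encoding would produce for a known character. The following is a minimal sketch in Python, not part of the original testcase. The helper name `encoded_query_value` and the sample letter "ą" are illustrative assumptions, and so is the rule that characters a charset cannot represent are replaced with numeric character references before escaping.

```python
from urllib.parse import quote


def encoded_query_value(text: str, charset: str) -> str:
    """Percent-encode text the way a form field value might appear in a GET URI.

    Assumption: characters the charset cannot represent are replaced with
    numeric character references (&#NNN;) before percent-encoding.
    """
    raw = text.encode(charset, errors="xmlcharrefreplace")
    # safe="" escapes every byte that is not an unreserved URI character.
    return quote(raw, safe="")


if __name__ == "__main__":
    # Print the escapes for one Lithuanian letter under each candidate charset.
    for charset in ("utf-8", "utf-16-le", "iso-8859-1"):
        print(charset, encoded_query_value("ą", charset))
```

Under these assumptions, "ą" shows up as `%C4%85` for UTF-8, `%05%01` for little-endian UTF-16, and `%26%23261%3B` (an escaped `&#261;`) for ISO-8859-1, so the three cases are easy to tell apart in the address bar.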
Also, what are the expected results for comment 6? That the charset used for the newly opened window is that of the document that |o| points to?
This shows UTF-8 for the child document for both trunk and 1.8 branch over here... which is not surprising given that about:blank is loaded as UTF-8. Do you see something different in your setup on this testcase? If you replace the written-out content with a form as in your example, does it submit with the same charset as what n.characterSet returns?
I tested my testcase with http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2007-05-16-04-trunk/firefox-3.0a5pre.en-US.win32.installer.exe Results are similar, except that after I come back I get "iso-8859-1". I also tested on FF 220.127.116.11. It opened the new window in "UTF-8", and all data was transferred to the server correctly encoded. Anyway, the new window is opened with a different encoding from the parent window, and data is posted incorrectly encoded. So, if data is posted with FF 2+, on the server side I get the POST data without the Lithuanian letters.
Another question. If you document.write into an existing window with a page loaded in it (after that page has finished loading), what charset does IE end up using? UTF-8, the charset of the page doing the writing, or the charset of the page that was loaded in the window the write is being done into?
So in IE6, document.charset seems to be "unicode" (aka UTF-16) for any document created via document.write, no matter what the source and target document charsets were. Same thing for window.open()ed documents in general. And I seem to recall that IE encodes form submissions as UTF-8 if the document is UTF-16, or something like that. Of course I have no idea whether document.charset reflects anything about form submission in this case. vytis, if you have any information that would shed light on the questions in comment 13 and comment 14, I'd love to hear it.
1. There should be no difference between "window.open" opening in a new _TAB_ and in a new _WINDOW_. Whether to use tabs or windows is the user's decision, and it must not affect the program. As a web developer, I cannot control the FF settings on users' computers, and I cannot check on the server side whether data was submitted from a window or a tab. Such checking is nonsense... 2. "window.open" in FF 2 opens the page in a new _TAB_. Lithuanian letters are corrupted when sent via POST. "window.open" in FF 1.5 opens the page in a new _WINDOW_. Lithuanian letters are successfully sent via POST. I tried to test this, but I failed to force FF2 to open "window.open" in a new _WINDOW_. I posted https://bugzilla.mozilla.org/show_bug.cgi?id=381140
I'm not saying there _should_ be a difference. I'm saying there _is_. That's a bug, but it might explain why we're seeing different results in some of our tests above. If the tab/window thing changed between 1.5 and 2 that would explain things, indeed. The only remaining question (as all along) is what IE does. I'm happy to change this code to do the same thing, but I need to know what "the same thing" is first... Do you see the same thing I do in comment 15? Thank you very much for helping sort this out!
Yes, in IE6 I see the same: on the client side "Unicode" is used, but the POST data is sent UTF-8 encoded.
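For anyone checking this from the server side, a strict UTF-8 parse of the urlencoded body makes a wrong submission charset fail loudly instead of silently producing garbage. This is only a sketch in Python, assuming the body arrives as an `application/x-www-form-urlencoded` string. The helper name `decode_form_body` and the field name `name` are hypothetical and do not come from the testcase.

```python
from urllib.parse import parse_qs


def decode_form_body(body: str) -> dict:
    """Parse a urlencoded form body, requiring valid UTF-8 percent-escapes.

    With errors="strict", escapes that are not valid UTF-8 raise
    UnicodeDecodeError instead of being replaced.
    """
    return parse_qs(body, keep_blank_values=True, encoding="utf-8", errors="strict")


if __name__ == "__main__":
    # "ąžuolas" submitted as UTF-8 decodes cleanly.
    print(decode_form_body("name=%C4%85%C5%BEuolas"))
```

A body whose escapes are not valid UTF-8, such as `name=%FF`, raises `UnicodeDecodeError` here, which is one way to notice that a browser submitted in some other charset.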
We should probably make the document created by CreateAboutBlankContentViewer be UTF-8 (like the real about:blank), and make document.open reset the charset to UTF-16 or something. We seem to have the same "submit utf-8 for utf-16" behavior. I'll try to do this when I get back into town, I guess. The about:blank part we might want on the 1.8 branch too. Ian, do you want to add the "set charset to utf-16 on document.open" to the spec?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: document created using document.write uses "UTF-16" encoding for forms (instead of encoding of "parent") → document created using document.write uses ISO-8859-1 encoding instead of UTF-16
I'll look into it.
So I tried to do comment 19. Changing the document.open to set charset to UTF-16 breaks the testcases in bug 255820. Does IE not use the document character set for linked stylesheets? In any case, sounds like we don't want to change that; just changing CreateAboutBlankContentViewer to give UTF-8 the same way that about:blank does will restore the behavior we used to have.
OS: Windows 2000 → All
Hardware: PC → All
Summary: document created using document.write uses ISO-8859-1 encoding instead of UTF-16 → [FIX]document created using document.write uses ISO-8859-1 encoding instead of UTF-16
Checked in. Fixed to the state we had on 1.7 branch, though not per initial description. I don't think we want to make these documents UTF-16, since that will break stylesheets due to the differences with IE in how stylesheet charsets are determined.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Summary: [FIX]document created using document.write uses ISO-8859-1 encoding instead of UTF-16 → [FIX]about:blank encoding is not consistent
http://lxr.mozilla.org/mozilla/source/content/html/document/test/test_bug380383.html Feel free to flip back if server-side-y stuff needs to be tested, or open a new bug, CC me, and I'll make sure to deal with it when the HTTP server has the necessary functionality to deal (POST request support, perhaps?).
I think we're good for now. We'll need POST to test other bugs related to form submission, but that test tests what I checked in pretty well.
Why would this be a branch blocker, rather than just a bug? It's got a regression keyword, but not what it regressed from.
It regressed from bug 323810, I think -- that's what changed how the forcing into a new tab worked. This probably doesn't need to block, but I do think we should take the simple (and fairly safe, imo) fix on the branch.
Attachment #267111 - Flags: approval18.104.22.168?
Comment on attachment 267111 [details] [diff] [review] Proposed fix approved for 22.214.171.124, a=dveditz
Attachment #267111 - Flags: approval126.96.36.199? → approval188.8.131.52+
fixed for 184.108.40.206
Verified FIXED using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:220.127.116.11pre) Gecko/20070710 BonEcho/18.104.22.168pre. Using a build from 2007-06-14 without the patch and the "Simple testcase", the opened pages had the (incorrect) ISO-8859-1 encoding, but using a build with the patch, the pages are opened with UTF-8 encoding.
(In reply to comment #19)
> Ian, do you want to add the "set charset to utf-16 on document.open" to the spec?
Done. I've also made about:blank explicitly UTF-8. I'm not sure what to do about the style sheet encoding issue, that seems like a CSS thing. I haven't made the document's character encoding affect the submission encoding, I'll do that when WF2 is integrated into HTML5.
Done that too now; the HTML5 spec now completely agrees with comment 19.