Closed
Bug 1405696
Opened 7 years ago
Closed 7 years ago
XMLHttpRequest using UTF-8 as input to the URL parser causes compatibility issues
Categories
(Core :: DOM: Core & HTML, defect, P2)
Tracking
()
VERIFIED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox56 | --- | unaffected |
firefox57 | + | fixed |
firefox58 | --- | fixed |
People
(Reporter: yfdyh000, Assigned: wisniewskit)
References
Details
(Keywords: regression)
Attachments
(2 files)
7.87 KB,
text/plain
|
Details | |
59 bytes,
text/x-review-board-request
|
bkelly
:
review+
ritu
:
approval-mozilla-beta+
|
Details |
STR:
1. Open http://xukezheng.cbrc.gov.cn/ilicence/
2. Input the "安" to "机构名称:" field.
3. Click the "查询" button located on the upper right corner.
Actual results:
Empty result on the page, as fullName=%E5%AE%89 instead of fullName=%B0%B2.
Expected results:
Some results with name contain "安".
No broken for sites. Chrome 61 work fine.
Regression range:
https://hg.mozilla.org/integration/autoland/rev/8ff32bdb356d
Bug 1322874 - Remove nsIURI::originCharset
Comment 1•7 years ago
|
||
The site uses XHR to POST a request. The document encoding isn't supposed to affect XHR POST requests, but apparently it is web-incompatible.
Flags: needinfo?(annevk)
Comment 3•7 years ago
|
||
(In reply to Anne (:annevk) from comment #2)
> What exactly is passed to send()?
Minimized STR:
1. Go to http://xukezheng.cbrc.gov.cn/ilicence/ (or any other gbk-encoded pages)
2. Open web console.
3. Type `
xhr=new XMLHttpRequest();xhr.open("POST", "getLicence.do?useState=3&organNo=&fatherOrganNo=&province=&orgAddress=&organType=&branchType=&fullName=安&address=&flowNo=", true);xhr.send("start=0&limit=10");
` into the console.
4. Switch to Network tab and see the fullName parameter.
AR: fullName: 安
ER: fullName: %B0%B2
Flags: needinfo?(annevk)
Comment 4•7 years ago
|
||
Thanks, so the problem is that we use UTF-8 for the URL parser invoked from XMLHttpRequest rather than the encoding of the document.
Given the results for XMLHttpRequest/open-url-encoding.htm in web-platform-tests it would probably be more expedient for Firefox and the standard to change behavior rather than hope everyone else will. :-(
Thomas, is this something you want to take on?
Flags: needinfo?(annevk) → needinfo?(wisniewskit)
Updated•7 years ago
|
Component: Networking → DOM
Priority: -- → P2
Summary: Form field has been sent as utf-8 instead of originCharset (gbk) resulting in empty results on specific site → XMLHttpRequest using UTF-8 as input to the URL parser causes compatibility issues
Assignee | ||
Comment 5•7 years ago
|
||
I'm interested yes, but also on PTO next week and swamped this week, so I'll understand if someone wants to beat me to it.
Flags: needinfo?(wisniewskit)
Comment 6•7 years ago
|
||
Hey Thomas, this is currently tracking 57. would you consider this a blocker?
Flags: needinfo?(wisniewskit)
Assignee | ||
Comment 7•7 years ago
|
||
I'll tentatively say yes, but I'm not at all sure how widespread the issues are, so I can't really be sure. Do we know if this affects more sites, or if xukezheng.cbrc.gov.cn is a site we would block release for on its own?
Flags: needinfo?(wisniewskit) → needinfo?(jmathies)
Updated•7 years ago
|
OS: Unspecified → All
Hardware: Unspecified → All
Comment 9•7 years ago
|
||
This is now tracked by https://github.com/whatwg/xhr/issues/159. I can land the specification change once we update Gecko and "correct" the corresponding web-platform-tests test.
(I found that other browsers do the exact same broken thing for fetch(). I wrote a test for that and filed bugs against the other browsers hoping we can still keep the correct behavior there.)
Comment 10•7 years ago
|
||
Should we consider backing Bug 1322874 out, or is it important for other things in 57?
Flags: needinfo?(annevk)
Flags: needinfo?(VYV03354)
Updated•7 years ago
|
Flags: needinfo?(annevk) → needinfo?(valentin.gosu)
Comment 11•7 years ago
|
||
(In reply to Mike Taylor [:miketaylr] (58 Regression Engineering Owner) from comment #10)
> Should we consider backing Bug 1322874 out, or is it important for other things in 57?
We should probably back it out. The issue is here is that we no longer inherit the charset of the baseURI.
We can fix _this_ issue in XMLHttpRequestMainThread.cpp [1] by passing the charset of the document into NS_NewURI.
But there might be other cases where the charset causes a problem.
[1] http://searchfox.org/mozilla-central/rev/31606bbabc50b08895d843b9f5f3da938ccdfbbf/dom/xhr/XMLHttpRequestMainThread.cpp#1568
Flags: needinfo?(valentin.gosu)
Comment 12•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #11)
> We should probably back it out. The issue is here is that we no longer
> inherit the charset of the baseURI.
> We can fix _this_ issue in XMLHttpRequestMainThread.cpp [1] by passing the
> charset of the document into NS_NewURI.
> But there might be other cases where the charset causes a problem.
If we backout, we should update the URL spec to introduce the charset concept because that "problem" is expected per the current spec (e.g. fetch incompatibility). I'm not sure we should do it unless more actual web compat problems are found.
Flags: needinfo?(VYV03354)
Comment 13•7 years ago
|
||
If other browsers don't use utf-8 there, is there a reason we want to take the compatibility risk in 57?
Comment 14•7 years ago
|
||
emk, the URL Standard does have an encoding concept: https://url.spec.whatwg.org/#url-parsing. Using that concept for XMLHttpRequest is https://github.com/whatwg/xhr/issues/159 as I pointed out above. (It's also used throughout the HTML Standard quite a bit, albeit indirectly.)
I filed bugs against other browsers not using UTF-8 for fetch(). Chrome has already fixed their bug on that: https://bugs.chromium.org/p/chromium/issues/detail?id=772390.
Comment 15•7 years ago
|
||
(In reply to Anne (:annevk) from comment #14)
> emk, the URL Standard does have an encoding concept:
> https://url.spec.whatwg.org/#url-parsing. Using that concept for
> XMLHttpRequest is https://github.com/whatwg/xhr/issues/159 as I pointed out
> above. (It's also used throughout the HTML Standard quite a bit, albeit
> indirectly.)
>
> I filed bugs against other browsers not using UTF-8 for fetch(). Chrome has
> already fixed their bug on that:
> https://bugs.chromium.org/p/chromium/issues/detail?id=772390.
Yeah, therefore I support a fix that is limited to XHR rather than a full backout that will introduce an incorrect (non-UTF-8) behavior to fetch.
Comment 16•7 years ago
|
||
(In reply to Masatoshi Kimura [:emk] from comment #15)
> (In reply to Anne (:annevk) from comment #14)
> > emk, the URL Standard does have an encoding concept:
> > https://url.spec.whatwg.org/#url-parsing. Using that concept for
> > XMLHttpRequest is https://github.com/whatwg/xhr/issues/159 as I pointed out
> > above. (It's also used throughout the HTML Standard quite a bit, albeit
> > indirectly.)
> >
> > I filed bugs against other browsers not using UTF-8 for fetch(). Chrome has
> > already fixed their bug on that:
> > https://bugs.chromium.org/p/chromium/issues/detail?id=772390.
>
> Yeah, therefore I support a fix that is limited to XHR rather than a full
> backout that will introduce an incorrect (non-UTF-8) behavior to fetch.
Are you able to take on such a targetted XHR fix, Thomas?
Flags: needinfo?(twisniewski)
Assignee | ||
Comment 17•7 years ago
|
||
I believe so. Have we decided to block 57 on this?
Flags: needinfo?(twisniewski) → needinfo?(overholt)
Comment 18•7 years ago
|
||
(In reply to Thomas Wisniewski from comment #17)
> I believe so. Have we decided to block 57 on this?
Awesome! Yes, we did. Blame miket.
Flags: needinfo?(overholt)
Assignee | ||
Comment 19•7 years ago
|
||
Anne, I have the obvious fix ready (the one you suggested in comment 4), but there's still a compat problem: for the "lone surrogate" test in test_url_encoding.html, we would now emit this: "&%2365533;" while Chrome/Webkit emit "%26%2355357%3B" and Edge emits "?". So who is correct? Should we also be escaping the ampersand and semicolon?
Flags: needinfo?(annevk)
Assignee | ||
Comment 20•7 years ago
|
||
Since it may be worth landing my partial fix (even if the "lone surrogate" test is still failing), I'm doing a try run for what I have so far: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0fa3fc699e27ee9ed8b4a33121de1f4afc49636
Comment hidden (mozreview-request) |
Assignee | ||
Comment 22•7 years ago
|
||
Alright, since the only worrying test failure in that try run isn't actually related (the wpt10 WebRTC failure; it fails locally for me with or without my patch), I think the patch is ready for a review (and a decision on whether we need to fix remaining issue I pointed out in comment 19, or whether that is better-addressed in a follow-up).
Comment 23•7 years ago
|
||
(In reply to Thomas Wisniewski from comment #19)
> Anne, I have the obvious fix ready (the one you suggested in comment 4), but
> there's still a compat problem: for the "lone surrogate" test in
> test_url_encoding.html, we would now emit this: "&%2365533;" while
> Chrome/Webkit emit "%26%2355357%3B" and Edge emits "?". So who is correct?
> Should we also be escaping the ampersand and semicolon?
Please file a separate bug. It does not have to do with this encoding problem.
Comment 24•7 years ago
|
||
mozreview-review |
Comment on attachment 8919921 [details]
Bug 1405696 - pass the document encoding to NS_NewURI for XMLHttpRequest;
https://reviewboard.mozilla.org/r/190856/#review196352
::: dom/xhr/XMLHttpRequestMainThread.cpp:1554
(Diff revision 1)
> baseURI = mBaseURI;
> } else if (responsibleDocument) {
> baseURI = responsibleDocument->GetBaseURI();
> }
> +
> + mozilla::NotNull<const Encoding*> originCharset = UTF_8_ENCODING;
nit: I believe this code is already in the mozilla namespace, so I think you can just write `NotNull<>`.
::: testing/web-platform/meta/XMLHttpRequest/open-url-encoding.htm.ini:4
(Diff revision 1)
> +[open-url-encoding.htm]
> + type: testharness
> + [lone surrogate]
> + expected: FAIL
Can you add a `bug: <link>` line here to reference the follow-up bug to investigate this?
::: testing/web-platform/tests/XMLHttpRequest/open-url-encoding.htm:23
(Diff revision 1)
> }, "percent encode characters");
> test(function() {
> var client = new XMLHttpRequest()
> client.open("GET", "resources/content.py?\uD83D", false)
> client.send()
> - assert_equals(client.getResponseHeader("x-request-query"), "%EF%BF%BD")
> + assert_equals(client.getResponseHeader("x-request-query"), "%26%2355357%3B")
I can't claim to be an expert on encodings, so I'm trusting these values are correct.
Attachment #8919921 -
Flags: review?(bkelly) → review+
Assignee | ||
Updated•7 years ago
|
Comment hidden (mozreview-request) |
Assignee | ||
Comment 26•7 years ago
|
||
I've created the follow-up bug, so this is ready for check-in.
Keywords: checkin-needed
Updated•7 years ago
|
Assignee: nobody → wisniewskit
Comment 27•7 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/4f95cfe243f3
pass the document encoding to NS_NewURI for XMLHttpRequest; r=bkelly
Keywords: checkin-needed
Comment 28•7 years ago
|
||
Backed out for leaks detected by Linux x64 asan:
https://hg.mozilla.org/integration/autoland/rev/cb2a30c6c7f6c738bbf3dfaa7b9d3332ad17e729
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=4f95cfe243f3ba509f536fdcf2d0c44b38d4a62a&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=138300279&repo=autoland
> SUMMARY: AddressSanitizer: 22920 byte(s) leaked in 704 allocation(s)
Flags: needinfo?(wisniewskit)
Assignee | ||
Comment 29•7 years ago
|
||
I honestly have no clue what could make this patch leaky (and why other code that does the same elsewhere, like in the HTML parser, isn't leaky).
bkelly, emk, would you know who could help me find out?
Flags: needinfo?(wisniewskit)
Flags: needinfo?(bkelly)
Flags: needinfo?(VYV03354)
Comment 30•7 years ago
|
||
This cThis also causes web-platform failures, see the wpt10 in https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=faf8d2ab8dcaff4d78eb722c94ba7727ea20f67b&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=138319145&repo=autoland
> TEST-UNEXPECTED-FAIL | /workers/semantics/encodings/003.html | URL encoding, dedicated worker - assert_true: expected true got false
Comment 31•7 years ago
|
||
I guess the leak is a false positive, because Linux64 asan bc8 did not fail in the next changeset:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=faf8d2ab8dcaff4d78eb722c94ba7727ea20f67b
Servo servo PR #18809 would be the actual cause.
Flags: needinfo?(VYV03354)
Assignee | ||
Comment 32•7 years ago
|
||
Ah, that makes sense, thanks emk.
aryx, I can't imagine those wpt10 failures being from my patch, as I get the same failures in local tests whether my patch is applied or not. Based on bug 1404733, they have been happening for a while. I also can't see why my patch would cause them to fail more often, because the traces I see in the log have nothing to do with XMLHttpRequest (as far as I can tell).
Flags: needinfo?(bkelly)
Comment 33•7 years ago
|
||
Thomas, per the algorithm in https://url.spec.whatwg.org/#query-state Firefox is correct (&%2365533;). However, judging from Ben's review comments it seems you aligned the test with Chrome/Safari?
Flags: needinfo?(annevk)
Assignee | ||
Comment 34•7 years ago
|
||
Yes, I did. But since Firefox is correct I will update my patch tomorrow and re-request check-in with the correct test expectation (and I'll close bug 1410139, since we're doing the right thing).
Comment 35•7 years ago
|
||
The leaks look unrelated to this bug, they also showed up two pushes after the backout https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=627e043ba274a983ffe9a22bdf842f76c47719ec&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
But then they vanished. Sorry about that.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 37•7 years ago
|
||
Okay, I've updated the patch's test expectation to reflect that we're correct in the "lone surrogate" case.
Since it doesn't seem that this patch is the one causing any of the failures we saw, I'm re-requesting checkin.
Keywords: checkin-needed
Comment 38•7 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/2dd370ac9599
pass the document encoding to NS_NewURI for XMLHttpRequest; r=bkelly
Keywords: checkin-needed
Comment 39•7 years ago
|
||
Backed out for failing web-platform-test /workers/semantics/encodings/003.html like in in comment 30:
https://hg.mozilla.org/integration/autoland/rev/6f058c57e3d0a8d9474cda1b60a679f487c72ed1
Push which ran many failing tests: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=328f5e2f6939516b6519a45d0341b863a91747de&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=138573796&repo=autoland
> TEST-UNEXPECTED-FAIL | /workers/semantics/encodings/003.html | URL encoding, dedicated worker - assert_true: expected true got false
Flags: needinfo?(wisniewskit)
Assignee | ||
Comment 40•7 years ago
|
||
jgraham, do you have any idea why my patch would be triggering bug 1404733, or do you think it's not related? I can't quite tell what the problem is causing those WebRTC failures, nor why it would be affected by my XMLHttpRequest patch.
Flags: needinfo?(wisniewskit) → needinfo?(james)
Comment 41•7 years ago
|
||
bug 1404733 is not related. If you click on the failure and then on 'Similar Jobs' at the bottom followed by a click on a green job, the panel on the right will also show these "failures" - they are not fatal.
Assignee | ||
Comment 42•7 years ago
|
||
Ah, thanks aryx, I see it now.
The problem is that I should not be using the responsible document's encoding for dedicated workers, but rather I should just fall back on UTF-8 (a trivial change to the patch).
I'll do one more try-run before I try landing again.
Flags: needinfo?(james)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 44•7 years ago
|
||
Alright, based on this try-run I think we're good to go this time: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1d96616db5a9f8e7b9a4857d5ffbcb8da7a1b470
Re-requesting check-in.
Keywords: checkin-needed
Comment 45•7 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/cb92520aacde
pass the document encoding to NS_NewURI for XMLHttpRequest; r=bkelly
Keywords: checkin-needed
Comment 46•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment 47•7 years ago
|
||
New regression in 57 that includes a test - please request Beta approval on this when you're comfortable doing so.
Flags: needinfo?(wisniewskit)
Flags: in-testsuite+
Assignee | ||
Comment 48•7 years ago
|
||
Yang, could you confirm if the site if now working properly for you again in the latest nightly builds? Thanks!
Flags: needinfo?(wisniewskit) → needinfo?(yfdyh000)
Assignee | ||
Comment 50•7 years ago
|
||
Comment on attachment 8919921 [details]
Bug 1405696 - pass the document encoding to NS_NewURI for XMLHttpRequest;
Approval Request Comment
[Feature/Bug causing the regression]: regressing bug #1322874 (inadvertently changed the URL-parsing behavior of XMLHttpRequest slightly, breaking web-compat).
[User impact if declined]: one Chinese government site is confirmed to be affected by this issue so far (bug 1405696), breaking the site's search functionality. It is presumed that other sites will also be affected.
[Is this code covered by automated tests?]: yes, by the web platform tests (updated by this patch).
[Has the fix been verified in Nightly?]: yes, by myself and the reporter.
[Needs manual test from QE? If yes, steps to reproduce]: no.
[List of other uplifts needed for the feature/fix]: none.
[Is the change risky?]: low risk.
[Why is the change risky/not risky?]: low because it reverts XMLHttpRequest behavior back to what it was before the regressing bug broke compatibility.
[String changes made/needed]: none.
Attachment #8919921 -
Flags: approval-mozilla-beta?
Comment on attachment 8919921 [details]
Bug 1405696 - pass the document encoding to NS_NewURI for XMLHttpRequest;
Recent regression, must fix, beta57+
Attachment #8919921 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 52•7 years ago
|
||
bugherder uplift |
Comment 53•7 years ago
|
||
(In reply to Thomas Wisniewski from comment #50)
> [Is this code covered by automated tests?]: yes, by the web platform tests
> (updated by this patch).
> [Has the fix been verified in Nightly?]: yes, by myself and the reporter.
> [Needs manual test from QE? If yes, steps to reproduce]: no.
Setting qe-verify- based on Thomas's assessment on manual testing needs and the fact that this fix has automated coverage.
Flags: qe-verify-
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•