Closed Bug 918719 Opened 6 years ago Closed 3 years ago

[XHR2] Too many readystatechange events in LOADING state (readyState 3)

Categories

(Core :: DOM: Core & HTML, defect)

x86
Linux
defect
Not set

Tracking

()

RESOLVED INVALID
Tracking Status
firefox51 --- wontfix
firefox52 --- wontfix

People

(Reporter: hsteen, Assigned: wisniewskit)

References

()

Details

Attachments

(1 file, 6 obsolete files)

This is an old Gecko featured that allowed implementing "progressive" data load with XHR. Now it's made obsolete by progress events, and should be removed.

WARNING: this may cause compat issues. I know GMail was using this feature at some point, and other scripts and libraries may also rely on it. If it's possible to first implement a warning when Gecko fires more than one readyState 3 readystatechange event to a listener, we should consider doing so.

Test cases:
http://w3c-test.org/web-platform-tests/master/XMLHttpRequest/event-readystatechange-loaded.htm
Are webcompat issues still a concern here? If so, is there a way to tell if a script is checking for readyState=3 in onReadyStateChange, so we can distinguish for the purposes of telemetry or warning the users?
Flags: needinfo?(bzbarsky)
Flags: needinfo?(annevk)
I'm not sure, I thought other browsers might have removed this already, but it probably wouldn't hurt to verify that. (I don't think you can check if a script looks for readyState being 3.)
Flags: needinfo?(annevk)
Chrome now passes the web platform test, but Edge doesn't (neither 13 nor 14 preview). I don't have Safari handy to perform a test with it.
http://w3c-test.org/XMLHttpRequest/event-readystatechange-loaded.htm fails in Safari Technology Preview (and in stable too). Both dispatch 10 events.

Given that Chrome passes the risk is probably smaller, but we should probably be wary of regressions still and have a way to revert to the old behavior easily for at least one release.
It's not clear to me why we have to change all, or all but one, UAs here, instead of changing the spec to match reality....
Flags: needinfo?(bzbarsky)
My gut instinct says that unless compat proves to be an issue, it would be best to get rid of this feature altogether. The spec already has a better way to check for XHR progress updates that is supported by all major vendors, after all.

I'm certainly not against guarding this with an about:config flag, though. I could see it being used for progress bars in old router firmware and intranet apps, and there's always an outside chance that it would break more than progress bars.
The main reason to remove it is to make readystatechange actually reflect its name. On the other hand, new features go into fetch() so I suppose we could leave this part of XMLHttpRequest partly broken. I'm not opposed to adding this back into the specification and apologizing to Chrome (who did make this change and will have to revert).
I'd still prefer trying to remove it entirely if Chrome hasn't run into any problems (since if nobody is using the feature anymore, it hardly matters how long UAs take to remove it).

But I won't push any further. Even it's not much effort to remove it, it's still effort, especially if old addons end up breaking and we have to consider special-casing it for system XHRs and the like.

Ultimately I'll let you guys decide, since you have more experience in these matters and this really isn't a big deal in the grand scheme of things.
Okay, let's see if any of the above swayed bz.
Flags: needinfo?(bzbarsky)
Worth trying, I suppose, but I'd do it behind a pref at first so we can revert easily....
Flags: needinfo?(bzbarsky)
Since bz's away, would you mind taking a look at this instead, smaug? If you agree that it's worth trying this out, here's a patch (including a fallback preference).

A try run seems fine: https://treeherder.mozilla.org/#/jobs?repo=try&revision=45c771afa367
Assignee: nobody → wisniewskit
Status: NEW → ASSIGNED
Attachment #8780872 - Flags: review?(bugs)
Comment on attachment 8780872 [details] [diff] [review]
918719-send_only_one_loading_readystatechange_unless_pref_is_set.diff

>diff --git a/b2g/app/b2g.js b/b2g/app/b2g.js
...
>+// Whether to send more than one "loading" readystatechange during XHRs to
>+// simulate progress events for sites still not using modern progress events.
>+pref("dom.fire_extra_xhr_loading_readystatechanges", false);
Why you need this when the pref is in all.js already.

>@@ -0,0 +1,13 @@
>+Components.utils.import("resource://gre/modules/Timer.jsm");
>+
>+function handleRequest(request, response)
>+{
>+  response.processAsync();
>+  response.setHeader("Content-Type", "text/plain", false);
>+  response.setHeader("Content-Length", "10", false);
>+  response.write("chunk", 5);
>+  setTimeout(() => {
>+    response.write("chunk", 5);
>+    response.finish();
>+  }, 250);




>+runTest().then(function(count) {
>+  ok(count === 1, "Only one loading readystatechange events should have been fired with the pref off.");
>+}).then(function() {
>+  return prefChangePromise({"set": [["dom.fire_extra_xhr_loading_readystatechanges", true]]});
>+}).then(function() {
>+  return runTest();
>+}).then(function(count) {
>+  ok(count > 1, "Multiple loading readystatechange events should have been fired with the pref on.");
Hmm, this is racy. What guarantees browser doesn't get both chunks at once?
I think >= should be good enough here.
Attachment #8780872 - Flags: review?(bugs) → review+
>Hmm, this is racy. What guarantees browser doesn't get both chunks at once? I think >= should be good enough here.

Given that the point of the test is making sure that >1 event is fired (once for each OnDataAvailable, basically), maybe it would be better to send three chunks in the test instead? (or increase the timeout?)


>Why you need this when the pref is in all.js already.

Ah, didn't realize it could just be in all.js... makes sense now that I type it out again :)
Flags: needinfo?(bugs)
(In reply to Thomas Wisniewski from comment #13)
> >Hmm, this is racy. What guarantees browser doesn't get both chunks at once? I think >= should be good enough here.
> 
> Given that the point of the test is making sure that >1 event is fired (once
> for each OnDataAvailable, basically), maybe it would be better to send three
> chunks in the test instead? (or increase the timeout?)

Still racy. Can we use some explicit http chunk stuff to ensure we get separate OnDataAvailable calls? You'd need to ask some Necko folks or look at the code.
Perhaps Transfer-Encoding  ?
Flags: needinfo?(bugs)
Unfortunately, transfer-encoding wasn't doing the trick, and an IRC conversation makes it seem like the best we can do is keep the response plain and make it larger than 32k:

> < tomwis> is there a way to ensure (for a test) that OnDataAvailable will be called more than once in a 
>           non-racy way?
> < mcmanus> there is no API that guarantees that
> < mcmanus> but in practice, if you write more than 32KB, that will bridge buffers and I *think* result in multiple ODA
> < mcmanus> not 100% sure, as I said that's not a guarantee of the API
> < mcmanus> there could be something in there that gets clever and coalesces it.
> < mcmanus> but as far as I know, coalescing shouldn't happen in the plain case - especially if you're not using an 
>            encoded body (like gzip).. so I would think a larger than 32KB would be more reliable than timeout and I 
>            would write a well commented test around that if it seemed stable on try

Using the 32k-size trick seems to be working at least as well as the timeout method I was using, so the theory seems to check out. A try run just shows the usual smattering of unrelated intermittents: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a7a0897d210

Do you think that'll suffice? I've left comments in both test files to note what we're doing any why, and that it may not be enough to avoid intermittent failures.
Attachment #8780872 - Attachment is obsolete: true
Flags: needinfo?(bugs)
Sounds good, but could you perhaps do something like
var data = (new Array(1 << 16)).join("A");
and then send that, maybe in two chunks.
Flags: needinfo?(bugs)
Sure, here's a version which just sends the data in one big whack using that line of code instead (since the lower layers split it up for us anyhow).

Should I let this patch simmer for a bit longer, or do you think it's landable with that fix?
Attachment #8783058 - Attachment is obsolete: true
Flags: needinfo?(bugs)
Is (new Array(1 << 15)).join("A") enough? Don't we want at least two 32k chunks, or something like that.
So either send 1 << 16 or 1 << 15 twice? Perhaps latter is simpler
Flags: needinfo?(bugs)
Oops, good catch, I didn't realize I had left a 15 in there instead of a 16.

It shouldn't matter if I send it in two chunks or not, so I just changed that one line in this version.
Attachment #8783119 - Attachment is obsolete: true
Comment on attachment 8783128 [details] [diff] [review]
918719-send_only_one_loading_readystatechange_unless_pref_is_set.diff

Guess I probably should have r?'d this again, huh? :)

I only changed the 1<<15 to 1<<16 in the latest version, which should do the trick.
Attachment #8783128 - Flags: review?(bugs)
Comment on attachment 8783128 [details] [diff] [review]
918719-send_only_one_loading_readystatechange_unless_pref_is_set.diff

>exporting patch:
># HG changeset patch
># User Thomas Wisniewski <wisniewskit@gmail.com>
># Date 1471650080 14400
>#      Fri Aug 19 19:41:20 2016 -0400
># Node ID 2a3f48cf78a5d811dd4672671d9bcaec3fb6d19a
># Parent  646321aa345ce4f43e40edf0b847153e2522fa8a
>Bug 918719 - Only fire one loading readystatechange per XHR, but keep the old behavior available behind the preference dom.send_multiple_xhr_loading_readystatechanges. r=smaug
>
>diff --git a/b2g/app/b2g.js b/b2g/app/b2g.js
>--- a/b2g/app/b2g.js
>+++ b/b2g/app/b2g.js
>@@ -1018,16 +1018,20 @@ pref("identity.fxaccounts.remote.oauth.u
> pref("identity.fxaccounts.remote.profile.uri", "https://profile.accounts.firefox.com/v1");
> 
> // Disable Firefox Accounts device registration until bug 1238895 is fixed.
> pref("identity.fxaccounts.skipDeviceRegistration", true);
> 
> // Enable mapped array buffer.
> pref("dom.mapped_arraybuffer.enabled", true);
> 
>+// Whether to send more than one "loading" readystatechange during XHRs to
>+// simulate progress events for sites still not using modern progress events.
>+pref("dom.fire_extra_xhr_loading_readystatechanges", false);
>+
You don't need this change since you have the pref in all.js already
Attachment #8783128 - Flags: review?(bugs) → review+
>You don't need this change since you have the pref in all.js already

Thanks for the reminder about that; this patch removes that pointless change.

Did you have another comment about the first chunk of the patch, which didn't come through in comment 21?

A fresh try run is still fine: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c2d37e13a889
Attachment #8783128 - Attachment is obsolete: true
Flags: needinfo?(bugs)
Comment on attachment 8785124 [details] [diff] [review]
918719-send_only_one_loading_readystatechange_unless_pref_is_set.diff

Just a friendly ping to make sure that comment 21 only had the one issue to address, rather than a change to the commit message or something (which didn't come through in the comment).
Flags: needinfo?(bugs)
Attachment #8785124 - Flags: review?(bugs)
No, I didn't have other comments.
Attachment #8785124 - Flags: review?(bugs) → review+
Alright, thanks!

Here's a trivially rebased version. A fresh try run still seems fine, just a TPBL hiccup and a couple of unrelated failures: https://treeherder.mozilla.org/#/jobs?repo=try&revision=13b868a6e941

Carrying over r+ and requesting check-in.
Attachment #8785124 - Attachment is obsolete: true
Keywords: checkin-needed
Blocks: 1301484
Pushed by kwierso@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ec584a62ec26
Only fire one loading readystatechange per XHR, but keep the old behavior available behind the preference dom.send_multiple_xhr_loading_readystatechanges. r=smaug a=smaug
Keywords: checkin-needed
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/af1a3c41d8c1 for Android "test_bug918719.html | Multiple loading readystatechange events should have been fired with the pref on" failures, https://treeherder.mozilla.org/logviewer.html#?job_id=35589818&repo=mozilla-inbound
Hmm. That's not good. I'm doubtful that this would be Android-specific, which means that we really can't rely on necko splitting up requests over 32k in size after all (to make sure that OnDataAvailable is definitely called more than once for a given XHR).

Patrick, you wouldn't have any other ideas that we can try here to test this case reliably?
Flags: needinfo?(mcmanus)
Yeah, not as "fails on Android" as I thought, https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&tochange=af1a3c41d8c139f070a553f1125b106cea3d596d&fromchange=cfa4bfef55aa1f78bd6c51faa46cbc23d6643715&filter-searchStr=android%20mochitest-9 fits well with "when we're (slow|memory constrained|something) in a way we only are on Android, mostly Android debug, in automation."
(In reply to Thomas Wisniewski from comment #29)
> Hmm. That's not good. I'm doubtful that this would be Android-specific,
> which means that we really can't rely on necko splitting up requests over
> 32k in size after all (to make sure that OnDataAvailable is definitely
> called more than once for a given XHR).
> 
> Patrick, you wouldn't have any other ideas that we can try here to test this
> case reliably?

not anything great.. the api really doesn't say anything about the granularity.

if you'd like to put something into the http channel that honors a pref about max ODA size I would be ok with that.. in the wild the problem would be that we don't really know what the terminal stream listener is (i.e. the channel might not be delivering directly to the XHR listener - the interface allows stuff to shim in between.. addons obviously, but also things like compression decoders, etc..).. but for a test you could probably make it work.

otoh if your current approach works "most of the time" that might be sufficient as a regression test (assuming you change things not to fail when the coverage is not fully executed) - we run these tests a lot so probability would have us covered.
Flags: needinfo?(mcmanus)
Phil, Olli, what do you think? Would it be sufficient to just let the intermittent failures happen, marking them as a known "expected" failure for reference?

Note that I don't mind figuring out how to change the API to try adding some more determinism here for tests, if you suspect that's worth the time and effort.
Flags: needinfo?(philringnalda)
Flags: needinfo?(bugs)
(In reply to Thomas Wisniewski from comment #32)
> Phil, Olli, what do you think? Would it be sufficient to just let the
> intermittent failures happen, marking them as a known "expected" failure for
> reference?
> 

that wasn't quite my suggestion - I think you do need to adjust the test to avoid fails in tbpl.

Perhaps I misunderstand, imo you would need to find a way to have the test pass when the precondition wasn't meant (i.e. when it didn't meaningfully test anything - but not when it did test something and it got the unexpected result.) rather than having expected failures in tbpl. I don't know enough about the test to know if that's feasible.
>I don't know enough about the test to know if that's feasible.

I see. Basically, XHRs currently send more than one "loading" readystatechange event, once per OnDataAvailable call, but with this bugfix they're going to follow the XHR standard, which ensures that only one "loading" event is fired per request.

And so the test must check that only one such loading event happens, even if OnDataAvailable is called more than once. The way the failing test is trying to do that is to simply have the server send more than 32k of plain text, on the assumption that would be enough to have the data split into multiple frames (and so make multiple OnDataAvailable calls). Since that isn't consistently happening though, we're seeing intermittent failures.

I guess the only thing that can be done, then, is to add a way to artificially force such a frame split. Perhaps a special type of response flush command which forces OnDataAvailable to be called, before the server sends some more data?
Flags: needinfo?(philringnalda)
Flags: needinfo?(bugs)
Could the test just send more data? Or send first x bytes and then wait for some time and then send x bytes again?
Perhaps call response.bodyOutputStream.flush(); after sending the data first time.
>Or send first x bytes and then wait for some time and then send x bytes again?

That's what I was trying initially, but in comment 12 you suggested it would be racy. We can try it though, and see if it isn't a problem after all? (Though bz also suggested the timeout method might be racy in another bug, so I'm guessing not).

>Perhaps call response.bodyOutputStream.flush(); after sending the data first time.

That was what I tried first, actually, but it doesn't end up triggering an OnDataAvailable call. It *does* trigger the following warning, but I'm not yet familiar enough with the code to know how to get it to (optionally) trigger another OnDataAvailable event when that case is hit:

    [Parent 12301] WARNING: Partial transfer, incomplete HTTP response received: file /media/ssd/mozilla/central/mozilla-central/netwerk/protocol/http/nsHttpTransaction.cpp, line 1047
Flags: needinfo?(philringnalda)
Flags: needinfo?(bugs)
The initial patch was sending very tiny bit of data, certainly less than the 32k.
Flags: needinfo?(bugs)
The only info I'm going to be able to provide is "sorry, no, you cannot just intermittently fail."
Flags: needinfo?(philringnalda)
>The initial patch was sending very tiny bit of data, certainly less than the 32k.

Ah, right, I didn't realize you meant doing both tricks at once for some reason. Yes, I'm certainly willing to try it out (say, sending two big chunks >32k each, waiting a second or two between them). But given Phil's reminder that intentional intermittents are verboten, since we can't sure it won't intermittently fail I'm not sure that will still be good enough (I'm still willing to try it out in case we get lucky).

It sounds like the only reliable thing to do here would be to figure out how to make a flush command that ensures that OnDataAvailable is triggered (though it will likely take me a while to figure out how to do that).
Flags: needinfo?(bugs)
well we don't of course control whatever the OS level network layer is doing. It might always merge packages, or split them.
But it is unclear to me why more than 32k data isn't enough here. Necko people should be able to answer to that.
Flags: needinfo?(bugs)
Patrick, is there any more light you can shed on this? I meant to ni you for comment 34, but apparently my wires got my wires somehow.
Flags: needinfo?(mcmanus)
Actually, nevermind. I managed to find another method using multiple XHRs to manipulate the main in-flight one, which should guarantee that OnDataAvailable is called more than one.

Since I'm just changing the test, I'll carry over r+ and just do another try-run before re-requesting check-in: https://treeherder.mozilla.org/#/jobs?repo=try&revision=36464374cb36
Attachment #8789506 - Attachment is obsolete: true
Flags: needinfo?(mcmanus)
The try run in comment 42 seems fine, just the usual assortment of unrelated intermittents.

Requesting check-in.
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4bdbbae12cb3
Only fire one loading readystatechange per XHR, but keep the old behavior available behind the preference dom.send_multiple_xhr_loading_readystatechanges. r=smaug
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/4bdbbae12cb3
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Depends on: 1304067
Houston, we have a problem.

After some tedious debugging with Etherpad c/o bug 1304067, it turns out that Chrome actually *does* send multiple loading events, which is corroborated by their sourcecode: https://chromium.googlesource.com/chromium/blink/+/master/Source/core/xmlhttprequest/XMLHttpRequest.cpp#478

And yet, they somehow fool the web platform test into letting them pass as well. I'm too tired to figure out how that happens right now, but it's looking like this is now a spec issue, thanks to legacy apps that still rely on this feature (which, funny enough, includes Google Hangouts).

Anne, do you have any insight to offer here, or am I correct in presuming this means we'll have to revert this patch and adjust the spec to fire multiple loading events for legacy apps?
Status: RESOLVED → REOPENED
Flags: needinfo?(annevk)
Resolution: FIXED → ---
No, I guess we just have to accept that readystatechange doesn't mean that readyState changed.
Flags: needinfo?(annevk)
Ryan, Tomcat, it seems like the best course of action here is to simply back out this changeset on inbound and central (it doesn't seem like it has made it any further yet). I'll later mark this bug as INVALID.

Is there a more formal process I should follow for such backout requests?
Flags: needinfo?(ryanvm)
Flags: needinfo?(cbook)
https://hg.mozilla.org/integration/mozilla-inbound/rev/8aee3d476c9b

Looks like this'll need to land on Aurora too.
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Flags: needinfo?(ryanvm)
Flags: needinfo?(cbook)
Resolution: --- → INVALID
Target Milestone: mozilla51 → ---
Comment on attachment 8790534 [details] [diff] [review]
918719-send_only_one_loading_readystatechange_unless_pref_is_set.diff

Approval Request Comment
[Feature/regressing bug #]: This is a request to backout this patch from Aurora since it landed prior to the last uplift.
[User impact if declined]: Known webcompat issues, such as bug 1304067.
[Describe test coverage new/current, TreeHerder]: N/A, it's reverting a change we made back to a previous known-good state.
[Risks and why]: Low-risk, it's what we're already shipping in Fx <51.
[String/UUID change made/needed]: None
Attachment #8790534 - Flags: approval-mozilla-aurora?
I've just filed spec and web platform test bugs for this issue, in case anyone's following along here:

Spec: https://github.com/whatwg/xhr/issues/92
WPT:  https://github.com/w3c/web-platform-tests/issues/3834
Comment on attachment 8790534 [details] [diff] [review]
918719-send_only_one_loading_readystatechange_unless_pref_is_set.diff

Backout, Aurora51+
Attachment #8790534 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Thanks guys!
This was merged to m-c today. Should be in tomorrow's nightlies.
https://hg.mozilla.org/mozilla-central/rev/8aee3d476c9b
You need to log in before you can comment on or make changes to this bug.