Open Bug 896666 Opened 11 years ago Updated 2 days ago

Need to not put WebSocket into loadgroups or something

Categories

(Core :: Networking: WebSockets, defect, P2)

x86
macOS
defect

People

(Reporter: bzbarsky, Assigned: jesup)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged][necko-priority-queue])

Attachments

(1 file)

See bug 858538 comment 21.  Per spec, these things are NOT supposed to be canceled when a document is aborted (which corresponds to cancel on the loadgroup for us), but are specially canceled explicitly at document teardown.
Is this viewed as a defect? Is there any plan for its resolution on the roadmap?
Yes, and not yet respectively.
Whiteboard: [necko-backlog]
This bug causes WebSocket connections to be closed when the user clicks a link that downloads a file and the link does not have the download attribute.  I'm adding this comment in the hope that it helps others, because this bug was hard to find when searching on Google.
That still happens with Firefox 53.0.3 (Mac and Windows at least).

I've set up some code to reproduce the issue: https://github.com/anthonydahanne/firefox-ws-and-download
And, more importantly, a live demo of the issue: http://peaceful-sierra-45424.herokuapp.com/

This does not happen with Chrome or Safari.
Component: Networking → Networking: WebSockets
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Thanks for the live demo, Anthony.

I can confirm this is still broken in 59.0.2.

Works in Edge, IE 11, Safari, Chrome

Only fails in Firefox.
Priority: P3 → P2
See Also: → 712329

Thanks for the demo, Anthony.
Using that live demo, I can also confirm that the connection is dropped in Firefox 70.0.1 (Linux).
The connection is not dropped in Chrome Version 79.0.3945.36 (Official Build) beta (64-bit).

Can someone from the DOM team have a look at this bug please?

Flags: needinfo?(htsai)

Talking to my team. Will get back to you, Nhi. :)

Flags: needinfo?(htsai)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #0)

See bug 858538 comment 21. Per spec, these things are NOT supposed to be
canceled when a document is aborted (which corresponds to cancel on the
loadgroup for us), but are specially canceled explicitly at document
teardown.

Hmm, really?
There is for example
"cancelation of the fetch algorithm by the user agent (e.g. in response to window.stop() or the user canceling the network connection manually) must cause the user agent to fail the connection."

Has the spec changed, or am I missing something?

(we could definitely have a separate list of fetch-y objects which aren't bound to loadgroup and then cancel those objects only when unloading the document or so)

Flags: needinfo?(bzbarsky)

Hmm. So that text comes from the processing model for EventSource, right?

It looks like EventSource does in fact do a fetch (see https://html.spec.whatwg.org/#dom-eventsource step 13), and aborting a document should cancel all fetches. So that fetch would be canceled, and the relevant text would take effect. The EventSource may, of course, "reestablish the connection" later. So it looks like I was wrong about EventSource; the original issue was definitely about WebSocket.

WebSocket does not use fetch in the spec.

Flags: needinfo?(bzbarsky)
Summary: Need to not put EventSource and WebSocket into loadgroups or something → Need to not put WebSocket into loadgroups or something

thanks, I wasn't hallucinating then :)

annevk, do you recall whether the difference between EventSource and WebSocket behavior at the spec level is there on purpose?

Flags: needinfo?(annevk)

Also worth testing what other browsers do with EventSource in practice...

https://html.spec.whatwg.org/#unloading-document-cleanup-steps cleans up both EventSource and WebSocket, and it is run at step 11 of https://html.spec.whatwg.org/#unload-a-document. EventSource does use fetch though, so it's also covered by the aborting steps. I also vaguely recall some other bugs and there not being agreement between browsers on when to terminate fetches. I suspect that's the bigger problem if that's still around (and also something that's likely wrong in the specification and not looked into much).

(Also, WebSocket also uses fetch a bit for its handshake.)

Flags: needinfo?(annevk)

The main compat question is whether https://html.spec.whatwg.org/#navigating-across-documents:abort-a-document-2 as part of the navigation algorithm should affect EventSource and WebSocket. This can happen without the document actually getting unloaded.

Just dropping in to report that this is still an issue with version 83.0, and to give an update on how this compares to other mainstream browsers.

All of the following browsers behave the same as tested with http://peaceful-sierra-45424.herokuapp.com/

Chrome 86
Edge 86
Safari 14.0.1
IE 11

Currently only Firefox behaves differently.

Just to provide some context, which I think is needed, because this bug hasn't been fixed for 8 years now: complex webapps keep open websocket connections to provide real-time updates when the underlying data changes. I'm the author of an ERP/MRP app for electronics inventory and production and this bug affects my users, because the websocket connection gets killed whenever they download a (generated) PDF document.

I like Firefox, so I implemented browser detection code and a special workaround exclusively for Firefox, trying to open the download in a new window, but I'm afraid I'll be affected by popup blocking sooner or later.
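
A minimal sketch of the kind of workaround described above, assuming a user-agent check is acceptable and that opening the download in a separate window keeps the original page's WebSocket alive. The helper names (`isFirefox`, `pickDownloadTarget`, `startDownload`) and the `WindowLike` interface are hypothetical and not taken from the commenter's app; as noted, popup blocking may defeat the `_blank` path.

```typescript
// Hypothetical helpers; none of these names come from the bug or from any library.
type DownloadTarget = "_blank" | "_self";

/** Rough Firefox check based on the user-agent string (UA sniffing is fragile). */
function isFirefox(userAgent: string): boolean {
  return /\bFirefox\/\d+/.test(userAgent);
}

/** Firefox gets a separate window; every other browser navigates in place. */
function pickDownloadTarget(userAgent: string): DownloadTarget {
  return isFirefox(userAgent) ? "_blank" : "_self";
}

/** The slice of `window` that is needed, so the logic can run without a browser. */
interface WindowLike {
  navigator: { userAgent: string };
  open(url: string, target: string): unknown;
}

/** Starts the download via window.open and returns the target that was used. */
function startDownload(win: WindowLike, url: string): DownloadTarget {
  const target = pickDownloadTarget(win.navigator.userAgent);
  win.open(url, target);
  return target;
}
```

In a page this would be called as `startDownload(window, "/report.pdf")`, where the URL is a placeholder. Injecting the window object keeps the decision logic testable outside a browser.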

Thanks for the testcase Cody. I wonder if that also happens for long-lived fetch() or <img> calls or if the difference between browsers is purely about EventSource and WebSocket (which would be a bit weird, especially for the former). Another way of testing this would be to initiate these fetches and then navigate to a 204 response (which should keep the current document alive, but would run the abort algorithm, similar to downloads).
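
The test suggested above could be driven by a small probe like the following sketch. It only records which long-lived connections report a drop after a navigation is triggered; the function name `trackDrops`, the `ConnectionLike` interface, and the URLs in the usage comment are assumptions made for illustration.

```typescript
// Hypothetical probe; the names and URLs here are illustrative, not from the bug.

/** The small part of WebSocket / EventSource that the probe relies on. */
interface ConnectionLike {
  addEventListener(type: "close" | "error", listener: () => void): void;
}

/**
 * Watches the given connections and returns a function that reports,
 * in first-seen order, which of them have fired "close" or "error" so far.
 */
function trackDrops(connections: { [name: string]: ConnectionLike }): () => string[] {
  const dropped: string[] = [];
  const mark = (name: string): void => {
    if (dropped.indexOf(name) === -1) dropped.push(name);
  };
  for (const name of Object.keys(connections)) {
    const conn = connections[name];
    if (!conn) continue;
    // WebSocket fires "close"; EventSource has no "close" event and fires "error" instead.
    conn.addEventListener("close", () => mark(name));
    conn.addEventListener("error", () => mark(name));
  }
  return () => dropped.slice();
}

// In a page, usage might look like this (URLs are placeholders):
//   const report = trackDrops({
//     ws: new WebSocket("wss://example.test/socket"),
//     es: new EventSource("/events"),
//   });
//   location.href = "/no-content";   // endpoint assumed to answer with 204
//   setTimeout(() => console.log(report()), 2000);
```

Comparing the reported names across browsers would show whether the difference is limited to EventSource and WebSocket or extends to other long-lived requests.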

Valentin, can someone in the networking team investigate this?

Flags: needinfo?(valentin.gosu)

(In reply to Anne (:annevk) from comment #23)

Thanks for the testcase Cody. I wonder if that also happens for long-lived fetch() or <img> calls or if the difference between browsers is purely about EventSource and WebSocket (which would be a bit weird, especially for the former). Another way of testing this would be to initiate these fetches and then navigate to a 204 response (which should keep the current document alive, but would run the abort algorithm, similar to downloads).

Valentin, can someone in the networking team investigate this?

This seems rather similar to bug 1683409, which we recently encountered.
There, too, a navigation that triggers a download cancels resources currently loading on the page.

See Also: → 1683409
See Also: → 1699925
See Also: → 1671309

Nika, Olli, do you have an idea how to best fix this?

Flags: needinfo?(nika)
Flags: needinfo?(bugs)
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 7 duplicates, 11 votes and 6 See Also bugs.
:kershaw, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(kershaw)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(kershaw)

(In reply to Release mgmt bot (nomail) [:suhaib / :marco/ :calixte] from comment #30)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

It's still relevant. I still hit this: https://bugzilla.mozilla.org/show_bug.cgi?id=712329 which is closed for some reason and points to this bug...

https://github.com/ctm/mb2-doc/issues/827#issuecomment-1026296986

This is still an issue with a PWA setting using Nuxt or Next.

This would require having a dummy LoadGroup for WebSocket (similarly to sendBeacon) and then canceling that when the document goes away.
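
The intended behavior can be illustrated with a toy model. This is not Gecko code and does not use any real Gecko API: `ToyLoadGroup`, `ToyDocument`, and `Cancelable` are hypothetical names, and the sketch only assumes the split discussed in this bug, where aborting a document cancels the ordinary load group while the separate WebSocket group is canceled only when the document goes away.

```typescript
// Toy model only; all names are hypothetical and unrelated to real Gecko classes.

/** Anything that can be canceled; stands in for a channel or connection. */
interface Cancelable {
  cancel(reason: string): void;
}

class ToyLoadGroup {
  private members: Cancelable[] = [];

  add(member: Cancelable): void {
    this.members.push(member);
  }

  /** Cancels every current member and empties the group. */
  cancelAll(reason: string): void {
    const current = this.members;
    this.members = [];
    for (const member of current) member.cancel(reason);
  }
}

class ToyDocument {
  /** Ordinary loads: canceled on abort (e.g. a navigation that ends in a download). */
  readonly loadGroup = new ToyLoadGroup();
  /** Separate "dummy" group for WebSockets: canceled only at unload. */
  readonly webSocketGroup = new ToyLoadGroup();

  abort(): void {
    this.loadGroup.cancelAll("abort");
  }

  unload(): void {
    this.loadGroup.cancelAll("unload");
    this.webSocketGroup.cancelAll("unload");
  }
}
```

With this split, a download-triggering navigation that aborts the document but never unloads it leaves members of `webSocketGroup` untouched.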

Flags: needinfo?(smaug)
See Also: → 1803431
Whiteboard: [necko-backlog] → [necko-triaged][necko-priority-new]
Flags: needinfo?(nika)

Putting this into priority-review because the steps to implement it are laid out in comment 33. We are apparently not following the spec, so it is important for us to resolve this bug.

Whiteboard: [necko-triaged][necko-priority-new] → [necko-triaged][necko-priority-review]
Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged][necko-priority-next]
Whiteboard: [necko-triaged][necko-priority-next] → [necko-triaged][necko-priority-queue]
Assignee: nobody → smayya

Good day, Sunil. I see you've been assigned this bug. I just checked and verified that it still affects my (closed source) software. My development stack is a bit esoteric (actix-web for the server, actix-web-actors for the server WebSocket implementation and the Rust crate wasm-sockets for the client WebSocket implementation) and I know remarkably little about browsers and JavaScript, otherwise I'd volunteer to write a tiny test case.

The way I verified that this bug is still there: I made a connection to my site, closed my laptop, restarted my DSL router, and re-opened my laptop. Instead of the WebSocket informing my client via the error callback that something bad had happened, a useful message was printed on the JavaScript console, and my client then produced a useless error message coming from the artifact created by, I believe, the yarn package manager.

Feel free to ask questions. I'd love to support Firefox.

Unassigning myself as I don't have cycles to work on this.
Will resume after a couple of releases.

Assignee: smayya → nobody

(In reply to clifford.t.matthews from comment #35)

Hey Clifford,
Thank you for verifying this.
I won't be working on this for the next couple of months.
We will get back to you in case we need further help with reproduction.

Assignee: nobody → rjesup