Closed Bug 791011 Opened 12 years ago Closed 12 years ago

When hitting the back or forward button socket connections are being dropped

Categories

(Core :: DOM: Navigation, defect)

16 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla18
Tracking Status
firefox16 + verified
firefox17 + verified
firefox18 + verified
firefox-esr10 16+ fixed

People

(Reporter: zachaller, Assigned: justin.lebar+bug)

References

Details

(Keywords: regression, Whiteboard: [qa-])

Attachments

(1 file)

We have a single page web app using Director (https://github.com/flatiron/director#client-side) for client side routing along with NodeJS and NowJS (https://github.com/Flotype/now) which uses socket.io (https://github.com/LearnBoost/socket.io) and when we hit the back or forward buttons its dropping the socket connection which in turn makes our NowJS RPC calls fail this works as expected in FF15 but is not working in FF16
Summary: When hitting the back on forward button socket connections are being dropped → When hitting the back or forward button socket connections are being dropped
Olli, Patrick, can one of you look at this?
zach do you have a url where I can see the decsribed behavior?
Ok was able to get this setup for you its related to the forum only so just click around in there a little then try using the back button

username: moz
password: moz

also this is using some software running on my dev box so it might not be up/stable all the time.

http://dev.asn.und.edu/learn.aero.und.edu/users.asp?Tools=7&url=/learn.aero.und.edu/forum.asp?ForumID=61
websockets!

jason is probably best prepared to comment on the changes in FF16

Timestamp: 09/14/2012 03:16:45 PM
Error: The connection to wss://node1.aero.und.edu/socket.io/1/websocket/rFcQwRmPQR_2Wvt2nhvP was interrupted while the page was loading.
Source File: https://node1.aero.und.edu/socket.io/socket.io.js
Line: 2371

there is no similar error on ff14 (and probly 15, but i didn't have it handy to check.. and indeed the log above is from ff18 which I also had handy, but the reporter cites 16 as the bisect point.)
Product: Firefox → Core
Component: General → Networking: WebSockets
worth pointing out that socket.io is a very common way of deploying websockets, so if we've got a bug late in beta this should be a highish priority to look at.
Thanks for bringing this to our attention Patrick.  Jason do you have enough to go on here or is there any other help QA could provide to give you more visibility into the issue?
Assignee: nobody → jduell.mcbugs
Setting this to NEW based on comment 4.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee: jduell.mcbugs → mcmanus
Smaug or Andrea - can you look into this?

We should really resolve this before Firefox 16 ships and we think it might be fallout from new bindings, or a DOM issue in general.
Yes--given that the DOM bindings for websockets have completely changed recently, and we haven't changed a thing AFAIK in the /netwerk code, I think the DOM folks are the first logical people to look into this one.
Assignee: mcmanus → amarchesini
Why do you think it's related to new bindings?  Bug 775368 is only on mozilla-central.
Zach:

I see this error already when I *first* connect to the URI in comment 3:

  The connection to wss://node1.aero.und.edu/socket.io/1/websocket/MOix9j6U-H2Ze30Vnhwq was interrupted while the page was loading. @ https://node1.aero.und.edu/socket.io/socket.io.js:2371

I should be seeing that only once I hit back/forward, right?

I'm also a bit unclear about the exact bug here.  Hitting back/forward is *supposed* to cancel existing websockets on a page, IIUC.  So is the bug that the websockets on the page being navigated *to* are not working?
Were there other DOM changes from 15-16?  Perhaps changes to the events or ordering - websockets drops connections on navigation on purpose, and has a bunch of code to watch for 'ghost' connections that are opened while navigating and close them.  (jduell and smaug were involved in that)
Jason:

hmm... yea the initial load error didn't not see before had to turn on persist errors to catch that one so not really sure what that is about, however the sockets do work fine until back and forward are pressed on the "FORUM" pages only sockets are not used else where. You mentioned that its suppose to drop connection on navigation and I would agree with that generally however the form navigation never performs a for lack of better word refresh its just hash tag changes using client side routing that is why I don't think it should drop the socket connection.
We need regression range here.
Zach, do you think you could try to find the last Nightly build which works and the first one
which doesn't and report the changesets. 
about:buildconfig tells the changeset.

http://harthur.github.com/mozregression/ can be useful tool, but finding the regression range shouldn't
be hard even without it (may just take some time)
Last good nightly: 2012-08-25
First bad nightly: 2012-08-26

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f077de66e52d&tochange=b3cce81fef1a

if i did everything correct
Oh, I bet that's right.  Here's the cset, for those who can't see the bug:

  https://hg.mozilla.org/mozilla-central/rev/b431f498a9ba

Sigh.

bz, should we be passing STOP_CONTENT, or is that the wrong thing?
STOP_CONTENT won't stop network loads.

What we should really do is one of two things, perhaps.  Either only call Stop() if we already have a full-page navigation in flight, or perhaps even better just cancel the full-page navigation in question and nothing else...
So this sounds like some change in the navigation that didn't use to send Cancel() to websockets for navigation between anchors on the same page, but now does.  Moving component...

BTW The "ghost connection" fix landed in FF 13 (see bug 696085), so while it's possible that it regressed something it doesn't seem likely.
Assignee: amarchesini → nobody
Component: Networking: WebSockets → Document Navigation
hot potato!
Assignee: nobody → justin.lebar+bug
Attached patch Patch, v1Splinter Review
Like so?
Attachment #663092 - Flags: review?(bzbarsky)
Comment on attachment 663092 [details] [diff] [review]
Patch, v1

Oh, nice.  This is even simpler than I thought!

r=me
Attachment #663092 - Flags: review?(bzbarsky) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/882794d0063c

I'm pretty sure this will fix your problem, Zach, but if you could use a nightly build and verify that the fix, that would be very much appreciated!

This change will make it in to tomorrow's or the day after's nightly build, depending on when it gets merged into mozilla-central.
Comment on attachment 663092 [details] [diff] [review]
Patch, v1

[Approval Request Comment]
Regression caused by (bug #): bug 775009
User impact if declined: This bug.
Testing completed (on m-c, etc.): Pushed to m-c.
Risk to taking this patch (and alternatives if risky): This could regress the security fix bug 775009 in some way.  But we should either take this plus bug 775009 or neither patch, because this is a serious regression.
Attachment #663092 - Flags: approval-mozilla-esr10?
Attachment #663092 - Flags: approval-mozilla-beta?
Attachment #663092 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/882794d0063c
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Yup seems to be working fine again.
Comment on attachment 663092 [details] [diff] [review]
Patch, v1

[Triage Comment]
Given the positive verification, approving for all branches in support of keeping bug 775009 in the build.
Attachment #663092 - Flags: approval-mozilla-esr10?
Attachment #663092 - Flags: approval-mozilla-esr10+
Attachment #663092 - Flags: approval-mozilla-beta?
Attachment #663092 - Flags: approval-mozilla-beta+
Attachment #663092 - Flags: approval-mozilla-aurora?
Attachment #663092 - Flags: approval-mozilla-aurora+
Zach, can you check to see if this is now fixed for you? It should be working in Firefox 16.0.1, 17.0b1, 18.0a2, and 10.0.9esr. Thanks.
Whiteboard: [qa-]
> Zach, can you check to see if this is now fixed for you?

Comment 28?  But I'm not going to ask Zach to verify in four different builds, unless he wants to.
Was able to test 16.0.1, 17 (what ever version is on the beta channel), and 18.0a2 (aurora), it worked on all of those, didn't have a 10.0.9esr to test on however.
No longer depends on: CVE-2014-1530
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: