Websocket reload page timeout issue resulting from server bug

RESOLVED FIXED

Status

()

Core
General
--
minor
RESOLVED FIXED
8 years ago
6 years ago

People

(Reporter: David Lindauer, Assigned: smaug)

Tracking

(Depends on: 1 bug)

Trunk
x86
Windows XP
Points:
---

Firefox Tracking Flags

(blocking2.0 final+)

Details

(Whiteboard: [sg:moderate?])

Attachments

(5 attachments, 1 obsolete attachment)

(Reporter)

Description

8 years ago
User-Agent:       Mozilla/5.0 (Windows; Windows NT 5.1; rv:2.0b2pre) Gecko/20100720 Minefield/4.0b2pre ( .NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; Windows NT 5.1; rv:2.0b2pre) Gecko/20100720 Minefield/4.0b2pre ( .NET CLR 3.5.30729)

If I make a bug in my websocket server so that it ignores the protocol level \xff\x00 sequence (close request) then firefox goes into a timeout scenario, which has some issues as described in the detail below.



Reproducible: Always

Steps to Reproduce:
1.  Take a websocket server and modify it to ignore the request to close message (\xff\x00) in such a way that it will cause the end-of-connection timeout scenario on firefox to trigger.
2.  Run a web page that opens a websocket connection as part of page load, and let it connect to the bogus server
3.  Hit 'Reload' on the browser

Actual Results:  
firefox starts a timeout to see if the server will ever respond, but reloads the page during the timeout.  After the timeout, the socket is finally closed but the close event for the websocket on the old page instance gets sent to the new page instance.  If I then close the connection on the new page instance, I get a second close event on the new page instance.


Expected Results:  
I wasn't expecting the close event from the old page instance to be routed to the new page instance; probably the page should not reload during this timeout period but at least there shouldn't be confusing signals to the new page instance.

Addressing the server side issue where the server didn't honor the closing message resulted in better behavior.  In this case there was no timeout triggered.  Firefox closed the socket and sent the close event to the original page instance pretty much immediately and before reloading the page.
Component: General → General
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → Trunk
Yeah, sending messages to the wrong page is _definitely_ wrong...

David, any chance of a public testcase for this?
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Whiteboard: [sg:?]
Depends on: 571013
I'm not sure, but I think that it happens because nsDOMEventTargetHelper::CheckInnerWindowCorrectness doesn't work as expected. See bug 571013.
I'm not aware of any problems in nsDOMEventTargetHelper::CheckInnerWindowCorrectness (it is used heavily
for example by XHR), but perhaps there is something to tweak how
WebSocket uses it.
(Reporter)

Comment 4

8 years ago
Created attachment 460875 [details]
test server & javascript which exhibits problem

Sorry I can't share source for the server; it is proprietary...  but I was okayed on making a minimal binary distribution that exhibits the problem.  It also has javascript.  There is a readme file included with some comments.

install this to its default location as it has hard-coded paths.  This is a
web server and some javascript that exhibits the problem.  The web server has a deliberate bug as outlined in the report text.  The setup program is for WIN32.
(Reporter)

Comment 5

8 years ago
well I ran into some problems making that demo attachment...  just from my POV the issue now seems two-part, first the websocket throws an exception on a page that has been unloaded, and second my usage of javascript (specifically the inner functions) seems to 'hold onto' the old page's javascript but somehow it is allowed to get DIV id's from the new page.  Well that is what it seems like anyway, I don't know much about the firefox source code...

BTW while I was making the demo I noted that I was also getting 'ondata' events from the old page...  I didn't notice that before.
This patches reproduces the steps to this issue. It opens a popup
which connect to a ws connection that never closes. After connecting, it
reload the popup. I've tested this on linux and the issue didn't happened.
David, can you see any difference between this test and yours?
Created attachment 461612 [details] [diff] [review]
tracing patch

David, can you upload your client javascript/html files with the issue, 
please? Also, if possible, can you build your Firefox with this
tracing patch, and provide me the console output results, please?
(Reporter)

Comment 9

8 years ago
the html/javascript is in the install directory for the test case "c:\program files\webserverwithclosingbugtest"; it is a single html file with embedded javascript.

I'll get back to you with the rest early next week...
Created attachment 461640 [details] [diff] [review]
issue testcase

Hmm, this testcase gets the issue. I will take a look at this problem ASAP. Well, David, I don't need that output results anymore :) Thanks.
Assignee: nobody → wfernandom2004
Attachment #461608 - Attachment is obsolete: true
Attachment #461612 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #461608 - Attachment is obsolete: false
Comment on attachment 461640 [details] [diff] [review]
issue testcase

This test isn't right. The previous one addresses the steps to reproduce accordingly.
Attachment #461640 - Attachment is obsolete: true
Attachment #461612 - Attachment is obsolete: false
Sorry, I can't reproduce this issue. If someone can, please apply the tracing patch and send me the output results.
Status: ASSIGNED → UNCONFIRMED
Ever confirmed: false

Updated

8 years ago
Assignee: wfernandom2004 → Olli.Pettay
blocking2.0: ? → final+
David, any chance to get the source code of the test page, so that I could
look at it on non-Windows OS?
(Reporter)

Comment 14

8 years ago
Created attachment 472653 [details]
HTML page that shows bug...

As asked here is the HTML/javascript for the bug.  You may have to change addressing info in the page.  Note that this HTML page will likely not show any problems when connected to a properly designed websockets server.   To see the problem you need to use it to attach to a websockets server that does not honor the websocket 'close' message and instead leaves the connection dangling open until FireFox times out and does a forced close of the socket.  Unfortunately I cannot share suitably broken server code with you.
Uh, bugzilla never sent me email about comment 14.
(In reply to comment #5)
> BTW while I was making the demo I noted that I was also getting 'ondata' events
What ondata events?
(In reply to comment #3)
> I'm not aware of any problems in
> nsDOMEventTargetHelper::CheckInnerWindowCorrectness (it is used heavily
> for example by XHR), but perhaps there is something to tweak how
> WebSocket uses it.

This is the case here.
Created attachment 476809 [details] [diff] [review]
patch

This should do it. I don't yet have a mochitest for this, although
I can reproduce the problem when I test this manually.
The problem is that mOwner is actually always null, even if
aOwnerWindow was passed to Init method.

XHR created in web pages doesn't have this problem, since it sets
mOwner here http://hg.mozilla.org/mozilla-central/annotate/b439e73f33b7/content/base/src/nsXMLHttpRequest.cpp 

The ::Disconnect change prevents a deadlock during shutdown.
I could post it in another bug, if really needed.
Attachment #476809 - Flags: review?(jst)
Attachment #476809 - Flags: feedback?(wfernandom2004)
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 476809 [details] [diff] [review]
patch

Hmm, good. Thanks Olli.
Attachment #476809 - Flags: feedback?(wfernandom2004) → feedback+

Updated

8 years ago
Attachment #476809 - Flags: review?(jst) → review+
http://hg.mozilla.org/mozilla-central/rev/ffb3d22c1f10
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
David, could you perhaps try a nightly build tomorrow or so and confirm
that the bug you saw has been fixed?
(Reporter)

Comment 22

8 years ago
That seems good.  Thanks guys!
Whiteboard: [sg:?] → [sg:moderate?]
You need to log in before you can comment on or make changes to this bug.