Last Comment Bug 669819 - [TI] Permaorange on WinXP debug, frequent on 10.5 debug and Win7 debug: test_websocket_basic.html | Test timed out. and test_websocket_hello.html | Connection closed cleanly
: [TI] Permaorange on WinXP debug, frequent on 10.5 debug and Win7 debug: test_...
Status: RESOLVED FIXED
: intermittent-failure
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86 All
: -- normal (vote)
: ---
Assigned To: general
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 537787
Blocks: 736325 438871 LandTIOnMC
  Show dependency treegraph
 
Reported: 2011-07-06 22:11 PDT by Phil Ringnalda (:philor)
Modified: 2012-11-25 19:31 PST (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch v1 (1.82 KB, patch)
2011-07-07 08:52 PDT, Patrick McManus [:mcmanus]
josh: review+
Details | Diff | Splinter Review

Description Phil Ringnalda (:philor) 2011-07-06 22:11:43 PDT
Came in on a TM -> JM merge, needs to go out before we can merge TM -> m-c.

The assertion stacks in the Windows logs are clearly crap; if the Mac ones are, it's not so obvious.

http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310004908.1310008532.27157.gz

39252 INFO TEST-PASS | /tests/content/base/test/test_websocket_basic.html | Wrong message - "test message100" should equal "test message100"
39253 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39254 INFO TEST-PASS | /tests/content/base/test/test_websocket_basic.html | Didn't receive all the messages! - 100 should equal 100
39255 INFO TEST-PASS | /tests/content/base/test/test_websocket_basic.html | Connection closed cleanly
39256 INFO TEST-END | /tests/content/base/test/test_websocket_basic.html | finished in 300987ms

39257 INFO TEST-START | /tests/content/base/test/test_websocket_hello.html
39258 INFO TEST-PASS | /tests/content/base/test/test_websocket_hello.html | 'this' should point to the WebSocket. (2) - [object WebSocket @ 0x27acca60 (native @ 0x29f67110)] should equal [object WebSocket @ 0x27acca60 (native @ 0x29f67110)]
39259 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
WARNING: NS_ENSURE_TRUE(mState == STATE_TRANSFERRING) failed: file /builds/slave/tm-osx-dbg/build/netwerk/base/src/nsSocketTransport2.cpp, line 1889
before 991232, after 987136, break 07b34000
++DOMWINDOW == 297 (0x257040c8) [serial = 827] [outer = 0x1f6176b0]
39260 INFO TEST-PASS | /tests/content/base/test/test_websocket_hello.html | Wrong data - "Hello world!" should equal "Hello world!"
###!!! ASSERTION: nsLoadGroup not thread-safe: '_mOwningThread.GetThread() == PR_GetCurrentThread()', file /builds/slave/tm-osx-dbg/build/netwerk/base/src/nsLoadGroup.cpp, line 213
nsLoadGroup::Internal::Release [netwerk/base/src/nsLoadGroup.cpp:213]
nsLoadGroup::Release [netwerk/base/src/nsLoadGroup.cpp:213]
nsCOMPtr<nsILoadGroup>::~nsCOMPtr [nsCOMPtr.h:533]
mozilla::net::BaseWebSocketChannel::~BaseWebSocketChannel [netwerk/protocol/websocket/BaseWebSocketChannel.h:57]
mozilla::net::WebSocketChannel::~WebSocketChannel [netwerk/protocol/websocket/WebSocketChannel.cpp:571]
mozilla::net::WebSocketChannel::Release [netwerk/protocol/websocket/WebSocketChannel.cpp:81]
nsCOMPtr<nsIInputStreamCallback>::assign_assuming_AddRef [nsCOMPtr.h:518]
nsCOMPtr<nsIInputStreamCallback>::assign_with_AddRef [nsCOMPtr.h:1204]
nsCOMPtr<nsIInputStreamCallback>::operator= [nsCOMPtr.h:664]
nsInputStreamReadyEvent::Run [xpcom/io/nsStreamUtils.cpp:117]
nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:617]
NS_ProcessNextEvent_P [obj-firefox/xpcom/build/nsThreadUtils.cpp:245]
nsSocketTransportService::Run [netwerk/base/src/nsSocketTransportService2.cpp:639]
nsThread::ProcessNextEvent [xpcom/threads/nsThread.cpp:617]
NS_ProcessNextEvent_P [obj-firefox/xpcom/build/nsThreadUtils.cpp:245]
nsThread::ThreadFunc [xpcom/threads/nsThread.cpp:271]
_pt_root [nsprpub/pr/src/pthreads/ptthread.c:190]
libSystem.B.dylib + 0x32155
libSystem.B.dylib + 0x32012
39261 INFO TEST-END | /tests/content/base/test/test_websocket_hello.html | finished in 2135ms
Comment 1 Treeherder Robot 2011-07-06 22:12:31 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310009986.1310013665.16109.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitests-1/5 on 2011/07/06 20:39:46

s: talos-r3-leopard-038
39276 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39282 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 2 Treeherder Robot 2011-07-06 22:48:47 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310013920.1310017528.31431.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitests-1/5 on 2011/07/06 21:45:20

s: talos-r3-leopard-035
39261 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39267 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 3 Treeherder Robot 2011-07-06 23:07:37 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310014698.1310018443.2072.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitests-1/5 on 2011/07/06 21:58:18

s: talos-r3-leopard-057
39274 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39280 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 4 Treeherder Robot 2011-07-06 23:19:26 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310014820.1310019213.5728.gz
Rev3 WINNT 6.1 tracemonkey debug test mochitests-1/5 on 2011/07/06 22:00:20

s: talos-r3-w7-033
39299 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39305 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 5 Treeherder Robot 2011-07-06 23:21:51 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310014818.1310019372.6311.gz
Rev3 WINNT 5.1 tracemonkey debug test mochitests-1/5 on 2011/07/06 22:00:18

s: talos-r3-xp-011
39266 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39272 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 6 Patrick McManus [:mcmanus] 2011-07-07 07:19:44 PDT
it appears WebSocketChannel::mLoadGroup need to be proxy released on the main thread.

I'll want to checkout tracemonkey and run the patch through (at least os x) try though.. so it will take some time.
Comment 7 Patrick McManus [:mcmanus] 2011-07-07 08:42:52 PDT
the real cause is a regression from 537787, mLoadGroup was supposed to be moved to a base class but was instead duplicated in both the base and derived classes.

It should also be proxy released as I mentioned in comment 6, but that's apparently only relevant on a corner case that isn't being tested.

I'll attach a patch in a minute.
Comment 8 Patrick McManus [:mcmanus] 2011-07-07 08:52:01 PDT
Created attachment 544505 [details] [diff] [review]
patch v1
Comment 9 Treeherder Robot 2011-07-07 09:30:30 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310051303.1310054955.29548.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitests-1/5 on 2011/07/07 08:08:23

s: talos-r3-leopard-004
39261 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39267 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 10 Josh Matthews [:jdm] (on vacation until Dec 5) 2011-07-07 10:22:54 PDT
Comment on attachment 544505 [details] [diff] [review]
patch v1

Review of attachment 544505 [details] [diff] [review]:
-----------------------------------------------------------------
Comment 11 Jason Duell [:jduell] (needinfo me) 2011-07-07 10:59:37 PDT
Comment on attachment 544505 [details] [diff] [review]
patch v1

The e10s patch landed in aurora, and this fixes a relatively important yet simple bug in it.
Comment 12 Patrick McManus [:mcmanus] 2011-07-07 11:45:48 PDT
crud - this patch didn't fix the orange with try. (it did clear up all those assertions, which are legit problems so I'll push the patch.)
Comment 13 Patrick McManus [:mcmanus] 2011-07-07 11:50:04 PDT
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-1761e7290d98/try-macosx-debug/try_leopard-o-debug_test-mochitests-1-build414.txt.gz

JS team - the portion of the test that is failing deals with a very large (225KB) message. Are there any diffs between TM and M-C that might be relevant to that?
Comment 14 Brian Hackett (:bhackett) 2011-07-07 12:11:31 PDT
Can you attach the message or link to it?  TM recently pulled the TypeInference project (bug 608741) from the JM repo, which has a large effect on how JS is processed.
Comment 15 Patrick McManus [:mcmanus] 2011-07-07 12:28:22 PDT
I was thinking it was the large message in test part 4 (with the huge message, which is both sent and recvd through ws) of this test:

http://mxr.mozilla.org/mozilla-central/source/content/base/test
/test_websocket_basic.html?force=1#22

But it actually looks like part 3 of the same test didn't quite complete.. it moved all 100 of its messages but doesn't move onto the close until a timeout kicks in. Note all those calls to forcegc(). [the test is just to an echo server, so each time a message is sent one is recvd].

hth.
Comment 16 Patrick McManus [:mcmanus] 2011-07-07 12:33:48 PDT
I've moved the ASSERTION(loadgroup not threadsafe) patch to a different bug 669975 as it is just a distraction from the TM issue.
Comment 17 Brian Hackett (:bhackett) 2011-07-07 13:34:56 PDT
I've tried to repro this on both OSX debug and Win7 debug and they both work fine.
Comment 18 Brian Hackett (:bhackett) 2011-07-07 13:41:46 PDT
Have to wonder if this test is doing several hundred GCs and just hitting a timeout before it finishes.  Will reduce the testCount and tryserver that.
Comment 19 Treeherder Robot 2011-07-07 15:12:44 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=Jaegermonkey/1309933587.1309937474.16722.gz
Rev3 MacOSX Leopard 10.5.8 jaegermonkey debug test mochitests-1/5 on 2011/07/05 23:26:27

s: talos-r3-leopard-035
39258 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39264 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 20 Treeherder Robot 2011-07-07 15:14:40 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=Jaegermonkey/1309935934.1309940463.31441.gz
Rev3 WINNT 6.1 jaegermonkey debug test mochitests-1/5 on 2011/07/06 00:05:34

s: talos-r3-w7-012
39287 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39293 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 21 Treeherder Robot 2011-07-07 15:15:05 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=Jaegermonkey/1309935916.1309940614.32394.gz
Rev3 WINNT 5.1 jaegermonkey debug test mochitests-1/5 on 2011/07/06 00:05:16

s: talos-r3-xp-022
39273 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39279 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 22 Treeherder Robot 2011-07-07 22:21:06 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310096896.1310101285.19018.gz
Rev3 WINNT 6.1 tracemonkey debug test mochitests-1/5 on 2011/07/07 20:48:16

s: talos-r3-w7-047
39288 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39294 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 23 Treeherder Robot 2011-07-07 22:42:14 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310096891.1310101617.20645.gz
Rev3 WINNT 5.1 tracemonkey debug test mochitests-1/5 on 2011/07/07 20:48:11

s: talos-r3-xp-008
39266 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39272 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 24 Brian Hackett (:bhackett) 2011-07-08 06:01:08 PDT
Reduced the testCount to 50 on tryserver and this test was green on OSX, Win7 and WinXP.  I don't think it gets stuck, it just takes longer and usually hits the timeout.
Comment 25 Olli Pettay [:smaug] 2011-07-08 06:08:18 PDT
Well, if it hits the timeout when testCount isn't modified, there is probably 
some significant performance regression which should be fixed.
Comment 26 Brian Hackett (:bhackett) 2011-07-08 06:24:08 PDT
How do we determine that?  It seems like this test has been the cause of random timeout orange in the past, and it may have been close to the timeout threshold already.  There are GC performance regressions we're working on (bug 669815), but if fixing those does not fix this then I'd say the problem is with the test and/or timeout value.
Comment 27 Patrick McManus [:mcmanus] 2011-07-08 06:46:15 PDT
(In reply to comment #26)
> How do we determine that?  It seems like this test has been the cause of
> random timeout orange in the past, and it may have been close to the timeout
> threshold already. 

I'm not a fan of the tests, which certainly rely too much on timeouts and assumptions about processing speeds. but afaik it has not had an appreciable level of timeout flakiness in the last few months at all.

perhaps a measurement of time to completion before and after the merge would be a useful data point.
Comment 28 Treeherder Robot 2011-07-08 08:02:32 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310108603.1310112617.5976.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitests-1/5 on 2011/07/08 00:03:23

s: talos-r3-leopard-037
39259 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39265 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 29 Treeherder Robot 2011-07-08 08:03:17 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310110878.1310115232.16603.gz
Rev3 WINNT 6.1 tracemonkey debug test mochitests-1/5 on 2011/07/08 00:41:18

s: talos-r3-w7-054
39279 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39285 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 30 Treeherder Robot 2011-07-08 08:04:19 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310110879.1310115505.17825.gz
Rev3 WINNT 5.1 tracemonkey debug test mochitests-1/5 on 2011/07/08 00:41:19

s: talos-r3-xp-011
39279 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39285 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 31 Treeherder Robot 2011-07-08 14:32:07 PDT
philor
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1310156638.1310160288.17444.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitests-1/5 on 2011/07/08 13:23:58

s: talos-r3-leopard-027
39282 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39288 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 32 Treeherder Robot 2011-07-12 14:20:57 PDT
Jan de Mooij
http://tinderbox.mozilla.org/showlog.cgi?log=Jaegermonkey/1310498410.1310503144.4956.gz
Rev3 WINNT 5.1 jaegermonkey debug test mochitests-1/5 on 2011/07/12 12:20:10

s: talos-r3-xp-034
39274 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39280 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 33 Treeherder Robot 2011-07-12 14:21:08 PDT
Jan de Mooij
http://tinderbox.mozilla.org/showlog.cgi?log=Jaegermonkey/1310498414.1310502955.3863.gz
Rev3 WINNT 6.1 jaegermonkey debug test mochitests-1/5 on 2011/07/12 12:20:14

s: talos-r3-w7-018
39269 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
39275 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly
Comment 34 Brian Hackett (:bhackett) 2011-07-19 11:57:15 PDT
This hasn't happened on any platform in the JM branch since the patch for bug 670185 went in (which was, in large part, cutting out work done during GC).  How should this get marked as resolved?
Comment 35 Phil Ringnalda (:philor) 2011-08-01 00:44:39 PDT
JM's the only place it was broken, so if it's fixed there it's FIXED.
Comment 36 Treeherder Robot 2011-12-12 18:41:16 PST
mbrubeck%mozilla.com
https://tbpl.mozilla.org/php/getParsedLog.php?id=7896105&tree=Mozilla-Beta
Rev3 Fedora 12x64 mozilla-beta debug test mochitests-1/5 on 2011-12-12 17:50:57

40258 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_basic.html | Test timed out.
40262 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_websocket_hello.html | Connection closed cleanly

Note You need to log in before you can comment on or make changes to this bug.