ABORT: syntimer not null: '!mSynTimer', file /builds/slave/m-in-osx-dbg/build/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 1330

RESOLVED FIXED in mozilla10

Status

()

Core
Networking: HTTP
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: rnewman, Assigned: mayhemer)

Tracking

(Blocks: 1 bug)

Trunk
mozilla10
x86_64
Mac OS X
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

6 years ago
We're getting a crash after HTTP operations in Sync tests, following on from our fix in Bug 690170.

It looks like this (and see blocked bugs):

TEST-PASS | (xpcshell/head.js) | 27 (+ 0) check(s) passed

TEST-INFO | (xpcshell/head.js) | 0 check(s) todo
WARNING: nsExceptionService ignoring thread destruction after shutdown: file /builds/slave/m-in-osx-dbg/build/xpcom/base/nsExceptionService.cpp, line 197
###!!! ABORT: syntimer not null: '!mSynTimer', file /builds/slave/m-in-osx-dbg/build/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 1330
catch_exception_raise+0x00118AEC [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/XUL +0x001567DC]
catch_exception_raise+0x00118E2B [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/XUL +0x00156B1B]
XRE_AddStaticComponent+0x0001341B [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/XUL +0x01D9435B]
XRE_AddStaticComponent+0x00010802 [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/XUL +0x01D91742]
NS_GetTraceRefcnt_P+0x00001C6E [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/XUL +0x01D0F48E]
start+0x00005AF9 [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/xpcshell +0x00006B79]
start+0x00000036 [/Users/cltbld/talos-slave/test/build/NightlyDebug.app/Contents/MacOS/xpcshell +0x000010B6]
###!!! ABORT: syntimer not null: '!mSynTimer', file /builds/slave/m-in-osx-dbg/build/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 1330
<<<<<<<
PROCESS-CRASH | /Users/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_syncstoragerequest.js | application crashed (minidump found)
(Assignee)

Comment 1

6 years ago
With high probability related to bug 684893 that changed the code around the timer and landed just recently.
Blocks: 684893
(Assignee)

Comment 2

6 years ago
Theory: 

this is a race between nsHttpConnectionMgr::Shutdown and nsSocketTransportService::Shutdown, that both get called in that order (connection manager first) while both are called on the main thread, and socket transport state notification on the network thread

In detail:

- nsHttpConnectionMgr::Shutdown calls Abandone on all half open sockets, what doesn't ensure to drop the last reference on those instance, the transport still keeps ref to the half open socket instance via mCallbacks (we may want to drop it too)
- nsSocketTransportService::Shutdown stops the socket polling and disallows any state notifications from sockets
- however, in time between those two operation the net socket still runs and may invoke CONNECTING_TO notification from the socket (via callbacks) on an already abandoned half open socket
- this will keep another (later the last) ref to the half open socket
- Abandone, that cancels and releases the timer will never be called again

-> assertion fails when the timer is destroyed (see the call stack)

Solutions:

- drop all refs to half open socket in Abandon (not really a 100% solution)
- do not allow creation of the timer if a half open socket instance that has already been abandoned
(In reply to Honza Bambas (:mayhemer) from comment #2)

> - do not allow creation of the timer if a half open socket instance that has
> already been abandoned

when mEnt == null?

makes sense to me.
(Assignee)

Comment 4

6 years ago
Created attachment 572638 [details] [diff] [review]
v1 [Commit comment 8]

As proposed.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Attachment #572638 - Flags: review?(mcmanus)
Attachment #572638 - Flags: review?(mcmanus) → review+
(Assignee)

Comment 5

6 years ago
There might be more work to do, I just got following crash (w/o the patch applied):

 	xul.dll!nsHttpConnectionMgr::AtActiveConnectionLimit(nsHttpConnectionMgr::nsConnectionEntry * ent=0x00000000, unsigned char caps=')')  Line 644 + 0x3 bytes	
>	xul.dll!nsHttpConnectionMgr::nsHalfOpenSocket::Notify(nsITimer * timer=0x0a513d38)  Line 1519 + 0x2f bytes	
 	xul.dll!nsTimerImpl::Fire()  Line 429	
 	xul.dll!nsTimerEvent::Run()  Line 523	

mEnt is null in nsHalfOpenSocket.

Comment 6

6 years ago
I've seen this somewhat randomly on most Windows 7, a couple of Windows XP and one Linux on the following urls:

http://tankionline.com/battle-ru1.html#/battle/000000007fffe07b
http://www.freelotto.com/LottoPicker/play.asp?xyz=8605884
http://types.yuzeli.com/survey/gender30/10179009/
http://veiculo.mercadolivre.com.br/MLB-206929353-yamaha-wave-raider-700-98-carr-rodo-2007-docs-ok-_JM
http://smotri.com/live/9911223/
http://hn.24h.com.vn/phim/sao-hoa-quy-van-sexy-de-so-c74a413856i62660.html
http://webalta.ru/
http://www.wolframalpha.com/input/?i=8069424%26lk=1
http://www.elpais.com/articulo/espana/Alimentar/palomas/calle/1500/euros/multa/elpepunac/20111007elpepinac_19/Tes

I never could reproduce locally but if it requires a really fast connection to see the race, then that is perhaps why. I have seen the crash @ nsHttpConnectionMgr::AtActiveConnectionLimit on mostly Windows 7 and some Windows XP with a pseudo stack

nsHttpConnectionMgr::AtActiveConnectionLimit(nsHttpConnectionMgr::nsConnectionEntry*, unsigned char) nsHttpConnectionMgr::nsHalfOpenSocket::Notify(nsITimer*) nsTimerImpl::Fire() nsTimerEvent::Run() nsThread::ProcessNextEvent(bool, bool*)

at 

http://types.yuzeli.com/survey/gender30/10179009/
http://www.163.com/
http://jd2008.360buy.com/JdHome/OrderList.aspx
http://download.pchome.net/development/c/detail-10255.html
http://www.tiantiandy.com/daludianying/20110929/21646.html

plus a couple of others. I couldn't reproduce though.
Blocks: 532972
(Assignee)

Comment 7

6 years ago
Bob, do you see this only during shutdown or also during normal run?

(In reply to Honza Bambas (:mayhemer) from comment #5)
> There might be more work to do, I just got following crash (w/o the patch
> applied):
> mEnt is null in nsHalfOpenSocket.

If my theory is correct, then no more changes are needed.  We never reach this code while mEnt is null with the r+'ed patch.
(Assignee)

Comment 8

6 years ago
Comment on attachment 572638 [details] [diff] [review]
v1 [Commit comment 8]

https://hg.mozilla.org/integration/mozilla-inbound/rev/59cf0c12f975
Attachment #572638 - Attachment description: v1 → v1 [Commit comment 8]
Attachment #572638 - Flags: checkin+

Comment 9

6 years ago
(In reply to Honza Bambas (:mayhemer) from comment #7)
> Bob, do you see this only during shutdown or also during normal run?

shutdown.
https://hg.mozilla.org/mozilla-central/rev/59cf0c12f975
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
You need to log in before you can comment on or make changes to this bug.