Closed Bug 753663 Opened 13 years ago Closed 13 years ago

crash in mozilla::net::SpdyStream::UpdateTransportReadEvents

Categories

(Core :: Networking: HTTP, defect)

13 Branch
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
firefox13 --- wontfix
firefox14 --- verified
firefox15 --- verified

People

(Reporter: scoobidiver, Assigned: mcmanus)

References

Details

(Keywords: crash, regression, Whiteboard: [spdy])

Crash Data

It first appeared in 15.0a1/20120507. The regression range might be: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=94ce5f33a9ea&tochange=448f554f6acb Signature mozilla::net::SpdyStream::UpdateTransportReadEvents(unsigned int) More Reports Search UUID 698c8deb-7e5e-4ced-9f6a-6e06f2120509 Date Processed 2012-05-09 20:07:37 Uptime 9248 Last Crash 1.2 weeks before submission Install Age 4.2 hours since version was first installed. Install Time 2012-05-09 15:54:21 Product Firefox Version 15.0a1 Build ID 20120509030514 Release Channel nightly OS Windows NT OS Version 6.1.7601 Service Pack 1 Build Architecture x86 Build Architecture Info GenuineIntel family 6 model 37 stepping 5 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0xa4 App Notes AdapterVendorID: 0x10de, AdapterDeviceID: 0x1086, AdapterSubsysID: 15793842, AdapterDriverVersion: 8.17.13.124 D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ EMCheckCompatibility True Total Virtual Memory 4294836224 Available Virtual Memory 3511443456 System Memory Use Percentage 26 Available Page File 14222860288 Available Physical Memory 6223749120 Frame Module Signature Source 0 xul.dll mozilla::net::SpdyStream::UpdateTransportReadEvents netwerk/protocol/http/SpdyStream.cpp:497 1 xul.dll mozilla::net::SpdySession::HandleSynReplyForValidStream netwerk/protocol/http/SpdySession.cpp:1075 2 xul.dll mozilla::net::SpdySession::HandleSynReply netwerk/protocol/http/SpdySession.cpp:1007 3 xul.dll mozilla::net::SpdySession::WriteSegments netwerk/protocol/http/SpdySession.cpp:1725 4 xul.dll nsHttpConnection::OnSocketReadable netwerk/protocol/http/nsHttpConnection.cpp:1386 5 xul.dll nsHttpConnection::OnInputStreamReady netwerk/protocol/http/nsHttpConnection.cpp:1506 6 xul.dll nsSocketInputStream::OnSocketReady netwerk/base/src/nsSocketTransport2.cpp:256 7 xul.dll nsSocketTransport::OnSocketReady netwerk/base/src/nsSocketTransport2.cpp:1575 8 xul.dll nsSocketTransportService::DoPollIteration netwerk/base/src/nsSocketTransportService2.cpp:762 9 xul.dll nsSocketTransportService::Run netwerk/base/src/nsSocketTransportService2.cpp:645 10 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:656 11 xul.dll nsThread::ThreadFunc xpcom/threads/nsThread.cpp:289 12 nspr4.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:426 13 nspr4.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:122 14 msvcr100.dll _callthreadstartex f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c:314 15 msvcr100.dll _threadstartex f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c:292 16 kernel32.dll BaseThreadInitThunk 17 ntdll.dll __RtlUserThreadStart 18 ntdll.dll _RtlUserThreadStart More reports at: https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Anet%3A%3ASpdyStream%3A%3AUpdateTransportReadEvents%28unsigned+int%29
Assignee: nobody → mcmanus
Whiteboard: [spdy]
the first build this happens with is 050703 - which is built from http://hg.mozilla.org/mozilla-central/rev/448f554f6acb that means it is very likely a regression from summary: bug 726366 - support tunneling spdy through proxies or summary: bug 751647 - speculative connect needs to check spdy ip pooling table seems most likely to be 726366
disregard comment 1. Neither of those commits were in the first instance of this crash report. I just read the logs wrong.
this is weird. hozna if you have thoughts they would be appreciated. The stack is always death on mTransaction->OnTransportStatus() in 0 xul.dll mozilla::net::SpdyStream::UpdateTransportReadEvents netwerk/protocol/http/SpdyStream.cpp:497 1 xul.dll mozilla::net::SpdySession::HandleSynReplyForValidStream netwerk/protocol/http/SpdySession.cpp:1075 2 xul.dll mozilla::net::SpdySession::HandleSynReply netwerk/protocol/http/SpdySession.cpp:1007 SpdyStream::mTransaction is a hard reference held for the life of the object. It should be valid. Because this crash is on the receipt of spdy-syn-reply the same transaction has already been used for creating the request so its very unlikely that it was an invalid ptr from the start. Further, this means it has done mTransaction->OnTransportStatus() from UpdateTransprortSendEvents previously without crashing. SpdyStream itself is probably valid because it survived the line above in UpdateTransportReadEvents as well as the get/set fully open logic.
Depends on: 739522
crud, this has spread to ff14 and ff13.. that means it is almost certainly 739522 (the only spdy thing that has been ported to ff13). its a low volume crash, and the fix was for an also uncommmon but serious issue. (how common is hard to know as it won't generate crash data). understanding the crash would make it easier to decide what to do.
Blocks: 739522
No longer depends on: 739522
Version: 15 Branch → 13 Branch
Crash Signature: [@ mozilla::net::SpdyStream::UpdateTransportReadEvents(unsigned int)] → [@ mozilla::net::SpdyStream::UpdateTransportReadEvents(unsigned int)] [@ mozilla::net::SpdySession::HandleSynReplyForValidStream()] [@ mozilla::net::SpdySession::HandleSynReply(mozilla::net::SpdySession*)] [@ mozilla::net::SpdySession::WriteSegments(ns…
Depends on: 755467
172 total data points.. uptime #-datapoints ----- ------------ 0 1 1-113 0 114-180 5 181-999 103 >999 63 That gap below 180 makes me think this is a session shutdown issue of some kind.. its hard to know when that will happen but most existing spdy services try and give you at least ~120 seconds and our idle timer is 180. obviously the clock doesn't really start at uptime=0, it starts when you make a connection.. and its an idle clock not an elapsed time clock, but I still find the correlation consistent - I mean 2/3 of the data points are less than 1000 seconds of uptime but essentially none of them are before 114 seconds. That could bode well for the patch in bug 755467 which basically reverts the recent changes to RestrictConnection() only in the case where there is a spdy session closing down (or not reusable in some way). That patch is in nightly (ff15) 0516-03 .. we'll see if it helps the crash stats.
(In reply to Patrick McManus [:mcmanus] from comment #7) > > > That could bode well for the patch in bug 755467 which basically reverts the > recent changes to RestrictConnection() only in the case where there is a > spdy session closing down (or not reusable in some way). That patch is in > nightly (ff15) 0516-03 .. we'll see if it helps the crash stats. crud - negative result. https://crash-stats.mozilla.com/report/index/0faa6270-564b-4ba7-99c2-3e92e2120517 is a crash of this signature with 051603
Crash Signature: mozilla::net::SpdySession::WriteSegments(nsAHttpSegmentWriter*, unsigned int, unsigned int*)] → mozilla::net::SpdySession::WriteSegments(nsAHttpSegmentWriter*, unsigned int, unsigned int*)] [@ mozilla::net::SpdyStream::UpdateTransportReadEvents]
OS: Windows 7 → All
Hardware: x86 → All
It seems likely that the SpdyStream being used is invalid. Perhaps deleted? or deleted twice? This is a scenario that would match the data: 1 :: new stream 1 = ptr foo 2 :: delete {1, foo} 3 :: new stream 3 = ptr foo [foo could be reused by the allocator] 4 :: delete {1, foo} [bug!] 5 :: lookup stream 3.. get foo, deref foo and see crash The delete of {1, foo} is really two steps.. mStreamIDHash.Remove(1) is a nop because 1 isn't in the ID hash anymore, and mTransactionHash.Remove(foo), which succeeds by accidentally removing (and dtoring) stream 3. The subsequent lookup for stream 3 gets a ptr that has been destroyed..
Good theory. I have to confirm that Windows system very much likes to reuse memory for a class object allocation that has just been freed for an object of that same class. It might be specific to jemalloc or to Win directly. (I was experiencing a bug that was hidden on my local machine because of a different mem allocation strategy of debug builds). Based on the last sentence: did you try to reproduce merely on Windows with a jemalloc optimized build? Best with nightly build.
(In reply to Honza Bambas (:mayhemer) from comment #11) > > Based on the last sentence: did you try to reproduce merely on Windows with > a jemalloc optimized build? Best with nightly build. I have tried windows nightlies and haven't repro'd. But the crash isn't all that common compared to the number of users so without any kind of STR I'm not surprised. it is interesting that there are ~260 crashes and they are 100% windows. Obviously a strong windows skew is expected, but at that point you'd expect OS X to make an appearance or two. fwiw (and its not much) there haven't been any nightly crashes yet from the may-17 build which had those other 3 fixes in it. I'm going to do a patch that basically looks for index inconsistencies when doing deletes or looking up by stream id. That might at least keep us on the rails while we work out the underlying issue.
(In reply to Patrick McManus [:mcmanus] from comment #12) > but at that point you'd expect OS X to make an appearance or two. OS X is proud to crash like Windows: https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Anet%3A%3ASpdyStream%3A%3AUpdateTransportReadEvents
Crash Signature: mozilla::net::SpdySession::WriteSegments(nsAHttpSegmentWriter*, unsigned int, unsigned int*)] [@ mozilla::net::SpdyStream::UpdateTransportReadEvents] → mozilla::net::SpdySession::WriteSegments(nsAHttpSegmentWriter*, unsigned int, unsigned int*)] [@ mozilla::net::SpdyStream::WriteSegments(nsAHttpSegmentWriter*, unsigned int unsigned int*)] [@ mozilla::net::SpdyStream::Close(unsigned int)] [@ mozilla::n…
(In reply to Scoobidiver from comment #13) > (In reply to Patrick McManus [:mcmanus] from comment #12) > > but at that point you'd expect OS X to make an appearance or two. > OS X is proud to crash like Windows: oddly comforting!
Depends on: 756551
a slighty less common variant makes it pretty clear that stream is bogus, probably already freed. 0 xul.dll mozilla::net::SpdyStream::Close netwerk/protocol/http/SpdyStream.cpp:789 1 xul.dll mozilla::net::SpdySession::CleanupStream netwerk/protocol/http/SpdySession.cpp:909 2 xul.dll mozilla::net::SpdySession::HandleSynReply netwerk/protocol/http/SpdySession.cpp:1011
bug 756551 has resolved the crash. I talked with alex via email and he indicated that patch wouldn't be accepted for ff13 at this time due to the late stage of that release.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.