Closed Bug 1369509 Opened 7 years ago Closed 7 years ago

Crash in RefPtr<T>::assign_with_AddRef | mozilla::net::nsHttpConnection::Init

Categories

(Core :: Networking: HTTP, defect)

Unspecified
Windows 10
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- disabled
firefox56 --- fixed

People

(Reporter: mccr8, Assigned: dragana)

References

Details

(Keywords: crash, Whiteboard: [necko-active])

Crash Data

Attachments

(1 file, 3 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-cf223a27-e7e5-4471-8d1d-530300170531.
=============================================================

5 crashes in the 5-31 Nightly build. I see two crashes on older versions, but maybe this is a new regression.
Assignee: nobody → dd.mozilla
Blocks: 1188435
Status: NEW → ASSIGNED
Whiteboard: [necko-active]
The problem here is that mEnt in HalfOpenSocket is freed, the pointer has value e5e5e5e5. mEnt is a raw pointer not ref count, but that should not be a problem. nsHttpConnectionEntry (mEnt is a nsHttpConnectionEntry) holds a list of HalfOpenSocket. A nsHttpConnectionEntry is destroyed only if HalfOpenSocket list is empty.

In TFO case HalfOpenSocket are removed from this list but there is a corresponding activeConn that also prevents a nsHttpConnectionEntry to be destroyed. (during TFO setup HalfOpenSocket is in mHalfOpenFastOpenBackups and in the same time period there must be an ActiveConn). The only time when a HalfOpenSocket is not in one of the list during call to functions: StartFastOpen, OnOutputStreamReady and SetFastOpenConnected. Looking through the code I do not see how it is possible to that the corresponding nsHttpConnectionEntry is removed, i.e. I do not think that we call any of the functions that would remove it. The functions that remove nsHttpConnectionEntry are: ClearConnectionHistory, OnMsgShutdown(in this crashes we are not in shutdown), OnMsgPruneDeadConnections.

The crash are always connected with the backupStream. There are in: nsHttpConnection::Init (this bug - backupSocket timed out and SocketTransport calls OnSocketReady), nsHalfOpenSocket::SetupStreams (bug 1375030 - backup socket timer has fired), nsHalfOpenSocket::OnTransportStatus(bug 1369205). Before removing HalfOpenSocket for the list the backup timer is canceled and callbacks for transport status events and outputStream are removed! 

The same crash is happening with older versions of firefox and TFO makes this crash appear more often.

Bug 1369205 is the most frequent one and it is only happening with TFO, not with older versions.
Attached patch bug_1369509_v1.patch (obsolete) — Splinter Review
One way to mitigate these crashes is to make HalfOpenSock hold a WeakRef to nsConnectionEntry. I decided not to do it, because I want to find out what really cause these crashes.
Attachment #8880348 - Flags: review?(mcmanus)
Attached patch bug_1369509_v1.patch (obsolete) — Splinter Review
I have seen some old crashes at the same place and I have decided to make another small change.
Attachment #8880348 - Attachment is obsolete: true
Attachment #8880348 - Flags: review?(mcmanus)
Attachment #8880358 - Flags: review?(mcmanus)
Attachment #8880358 - Flags: review?(mcmanus) → review+
Pushed by dd.mozilla@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/50673cd78a02
Let nHttpConnectionEntry be a weak reference in HalfOpenSock. r=mcmanus
Backed out for crashing xpcshell test_http2.js with mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::OnOutputStreamReady:

https://hg.mozilla.org/integration/mozilla-inbound/rev/2e283997cabd5b098623506dd68db4751e5800d1

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=50673cd78a02cf26570e5469f537a307634241a1&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=109724832&repo=mozilla-inbound

[task 2017-06-24T04:59:14.933219Z] 04:59:14     INFO -  PID 9842 | starting test number 12
[task 2017-06-24T04:59:14.934918Z] 04:59:14     INFO -  (xpcshell/head.js) | test pending (2)
[task 2017-06-24T04:59:14.936615Z] 04:59:14     INFO -  (xpcshell/head.js) | test finished (2)
[task 2017-06-24T04:59:14.938370Z] 04:59:14     INFO -  TEST-PASS | netwerk/test/unit/test_http2.js | test_onStartR - [test_onStartR : 604] 0 == 0
[task 2017-06-24T04:59:14.940059Z] 04:59:14     INFO -  PID 9842 | /altsvc1 not over h2 yet - retry
[task 2017-06-24T04:59:14.941862Z] 04:59:14     INFO -  TEST-PASS | netwerk/test/unit/test_http2.js | test_onStartR - [test_onStartR : 604] 0 == 0
[task 2017-06-24T04:59:14.948699Z] 04:59:14     INFO -  PID 9842 | /altsvc1 not over h2 yet - retry
[task 2017-06-24T04:59:14.950777Z] 04:59:14     INFO -  PID 9842 | ExceptionHandler::GenerateDump cloned child 9876
[task 2017-06-24T04:59:14.952557Z] 04:59:14     INFO -  PID 9842 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2017-06-24T04:59:14.954301Z] 04:59:14     INFO -  PID 9842 | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2017-06-24T04:59:14.959945Z] 04:59:14     INFO -  <<<<<<<
[task 2017-06-24T04:59:14.961888Z] 04:59:14     INFO -  mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/DkoYECaORuWcmo4FT5Wqmg/artifacts/public/build/target.crashreporter-symbols.zip
[task 2017-06-24T04:59:29.032692Z] 04:59:29     INFO -  mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/xpc-other-sfzkH_/7f684efa-599c-73d6-7dd5-34062e619351.dmp /tmp/tmpi2LO8P
[task 2017-06-24T04:59:47.814036Z] 04:59:47     INFO -  mozcrash Saved minidump as /home/worker/workspace/build/blobber_upload_dir/7f684efa-599c-73d6-7dd5-34062e619351.dmp
[task 2017-06-24T04:59:47.816422Z] 04:59:47     INFO -  mozcrash Saved app info as /home/worker/workspace/build/blobber_upload_dir/7f684efa-599c-73d6-7dd5-34062e619351.extra
[task 2017-06-24T04:59:47.820273Z] 04:59:47  WARNING -  PROCESS-CRASH | netwerk/test/unit/test_http2.js | application crashed [@ mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::OnOutputStreamReady]
[task 2017-06-24T04:59:47.835693Z] 04:59:47     INFO -  Crash dump filename: /tmp/xpc-other-sfzkH_/7f684efa-599c-73d6-7dd5-34062e619351.dmp
[task 2017-06-24T04:59:47.835967Z] 04:59:47     INFO -  Operating system: Linux
[task 2017-06-24T04:59:47.844532Z] 04:59:47     INFO -                    0.0.0 Linux 3.13.0-100-generic #147-Ubuntu SMP Tue Oct 18 16:48:51 UTC 2016 x86_64
[task 2017-06-24T04:59:47.844808Z] 04:59:47     INFO -  CPU: amd64
[task 2017-06-24T04:59:47.845122Z] 04:59:47     INFO -       family 6 model 45 stepping 7
[task 2017-06-24T04:59:47.845344Z] 04:59:47     INFO -       1 CPU
[task 2017-06-24T04:59:47.845521Z] 04:59:47     INFO -  GPU: UNKNOWN
[task 2017-06-24T04:59:47.845762Z] 04:59:47     INFO -  Crash reason:  SIGSEGV
[task 2017-06-24T04:59:47.846148Z] 04:59:47     INFO -  Crash address: 0x60
[task 2017-06-24T04:59:47.846486Z] 04:59:47     INFO -  Process uptime: not available
[task 2017-06-24T04:59:47.846672Z] 04:59:47     INFO -  Thread 3 (crashed)
[task 2017-06-24T04:59:47.847230Z] 04:59:47     INFO -   0  libxul.so!mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::OnOutputStreamReady [nsHttpConnectionMgr.cpp:50673cd78a02 : 4027 + 0x4]
[task 2017-06-24T04:59:47.847642Z] 04:59:47     INFO -      rax = 0x0000000000000000   rdx = 0x0000000000000000
[task 2017-06-24T04:59:47.848082Z] 04:59:47     INFO -      rcx = 0x0000000000000003   rbx = 0x00007fed0b66a500
[task 2017-06-24T04:59:47.848519Z] 04:59:47     INFO -      rsi = 0x0000000000000000   rdi = 0x00007fed1d486330
[task 2017-06-24T04:59:47.849004Z] 04:59:47     INFO -      rbp = 0x00007fed1a1fd4e0   rsp = 0x00007fed1a1fd4a0
[task 2017-06-24T04:59:47.849404Z] 04:59:47     INFO -       r8 = 0x0000000000000000    r9 = 0x00007fed1d500248
[task 2017-06-24T04:59:47.849840Z] 04:59:47     INFO -      r10 = 0x00007fed0bb00ff8   r11 = 0x00007fed1d5001e8
[task 2017-06-24T04:59:47.860048Z] 04:59:47     INFO -      r12 = 0x00007fed0bb1c290   r13 = 0x0000000000000000
[task 2017-06-24T04:59:47.861045Z] 04:59:47     INFO -      r14 = 0x0000000000000001   r15 = 0x0000000000000003
[task 2017-06-24T04:59:47.861903Z] 04:59:47     INFO -      rip = 0x00007fed2a2a5118
[task 2017-06-24T04:59:47.863084Z] 04:59:47     INFO -      Found by: given as instruction pointer in context
[task 2017-06-24T04:59:47.876282Z] 04:59:47     INFO -   1  libxul.so!mozilla::net::nsSocketOutputStream::OnSocketReady [nsSocketTransport2.cpp:50673cd78a02 : 559 + 0x9]
[task 2017-06-24T04:59:47.877205Z] 04:59:47     INFO -      rbx = 0x00007fed0bb1c290   rbp = 0x00007fed1a1fd520
[task 2017-06-24T04:59:47.878077Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd4f0   r12 = 0x00007fed0bb1c1f0
[task 2017-06-24T04:59:47.878927Z] 04:59:47     INFO -      r13 = 0x0000000000000000   r14 = 0x0000000000000001
[task 2017-06-24T04:59:47.879815Z] 04:59:47     INFO -      r15 = 0x0000000000000003   rip = 0x00007fed29feaafa
[task 2017-06-24T04:59:47.880627Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.881555Z] 04:59:47     INFO -   2  libxul.so!mozilla::net::nsSocketTransport::OnSocketReady [nsSocketTransport2.cpp:50673cd78a02 : 2174 + 0x5]
[task 2017-06-24T04:59:47.888413Z] 04:59:47     INFO -      rbx = 0x00007fed0bb1c000   rbp = 0x00007fed1a1fd560
[task 2017-06-24T04:59:47.889381Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd530   r12 = 0x0000000000000003
[task 2017-06-24T04:59:47.892035Z] 04:59:47     INFO -      r13 = 0x00007fed0bba8730   r14 = 0x0000000000000001
[task 2017-06-24T04:59:47.892915Z] 04:59:47     INFO -      r15 = 0x0000000000000003   rip = 0x00007fed29fef965
[task 2017-06-24T04:59:47.893724Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.894681Z] 04:59:47     INFO -   3  libxul.so!mozilla::net::nsSocketTransportService::DoPollIteration [nsSocketTransportService2.cpp:50673cd78a02 : 1162 + 0xd]
[task 2017-06-24T04:59:47.898366Z] 04:59:47     INFO -      rbx = 0x00007fed1d491370   rbp = 0x00007fed1a1fd5d0
[task 2017-06-24T04:59:47.902095Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd570   r12 = 0x0000000000000048
[task 2017-06-24T04:59:47.903813Z] 04:59:47     INFO -      r13 = 0x0000000000000040   r14 = 0x0000000000000001
[task 2017-06-24T04:59:47.905505Z] 04:59:47     INFO -      r15 = 0x0000000000000003   rip = 0x00007fed29ff28d6
[task 2017-06-24T04:59:47.911951Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.913748Z] 04:59:47     INFO -   4  libxul.so!mozilla::net::nsSocketTransportService::Run [nsSocketTransportService2.cpp:50673cd78a02 : 936 + 0xf]
[task 2017-06-24T04:59:47.915432Z] 04:59:47     INFO -      rbx = 0x0000000000000000   rbp = 0x00007fed1a1fd6b0
[task 2017-06-24T04:59:47.917141Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd5e0   r12 = 0x00007fed1d491454
[task 2017-06-24T04:59:47.918790Z] 04:59:47     INFO -      r13 = 0x00007fed1a1fd63f   r14 = 0x00007fed1d49145c
[task 2017-06-24T04:59:47.920458Z] 04:59:47     INFO -      r15 = 0x00007fed1d491370   rip = 0x00007fed29ffc32f
[task 2017-06-24T04:59:47.922076Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.923817Z] 04:59:47     INFO -   5  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp:50673cd78a02 : 1421 + 0x6]
[task 2017-06-24T04:59:47.925503Z] 04:59:47     INFO -      rbx = 0x00007fed1d4e26a0   rbp = 0x00007fed1a1fd7c0
[task 2017-06-24T04:59:47.927169Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd6c0   r12 = 0x0000000000000001
[task 2017-06-24T04:59:47.928850Z] 04:59:47     INFO -      r13 = 0x0000000000000000   r14 = 0x0000000000000000
[task 2017-06-24T04:59:47.930589Z] 04:59:47     INFO -      r15 = 0x00007fed1a1fd7df   rip = 0x00007fed29f4bc93
[task 2017-06-24T04:59:47.934406Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.936169Z] 04:59:47     INFO -   6  libxul.so!NS_ProcessNextEvent [nsThreadUtils.cpp:50673cd78a02 : 471 + 0xd]
[task 2017-06-24T04:59:47.937849Z] 04:59:47     INFO -      rbx = 0x0000000000000000   rbp = 0x00007fed1a1fd7f0
[task 2017-06-24T04:59:47.939562Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd7d0   r12 = 0x00007fed1aabbb40
[task 2017-06-24T04:59:47.941241Z] 04:59:47     INFO -      r13 = 0x00007fed1d4e26c0   r14 = 0x00007fed1d4e26a0
[task 2017-06-24T04:59:47.948016Z] 04:59:47     INFO -      r15 = 0x00007fed1aabbb60   rip = 0x00007fed29f4992d
[task 2017-06-24T04:59:47.949700Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.951520Z] 04:59:47     INFO -   7  libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run [MessagePump.cpp:50673cd78a02 : 338 + 0xa]
[task 2017-06-24T04:59:47.953190Z] 04:59:47     INFO -      rbx = 0x00007fed1d4e29e0   rbp = 0x00007fed1a1fd850
[task 2017-06-24T04:59:47.954854Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd800   r12 = 0x00007fed1aabbb40
[task 2017-06-24T04:59:47.956544Z] 04:59:47     INFO -      r13 = 0x00007fed1d4e26c0   r14 = 0x00007fed1d4e26a0
[task 2017-06-24T04:59:47.958222Z] 04:59:47     INFO -      r15 = 0x00007fed1aabbb60   rip = 0x00007fed2a36ff2a
[task 2017-06-24T04:59:47.959877Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.961565Z] 04:59:47     INFO -   8  libxul.so!MessageLoop::Run [message_loop.cc:50673cd78a02 : 320 + 0x5]
[task 2017-06-24T04:59:47.963236Z] 04:59:47     INFO -      rbx = 0x00007fed1d4e26a0   rbp = 0x00007fed1a1fd880
[task 2017-06-24T04:59:47.964939Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd860   r12 = 0x00007fed1a1fd8c0
[task 2017-06-24T04:59:47.966595Z] 04:59:47     INFO -      r13 = 0x00007fed1d4e26c0   r14 = 0x00007fed1d4e29e0
[task 2017-06-24T04:59:47.970211Z] 04:59:47     INFO -      r15 = 0x00007fed1a1fe3d0   rip = 0x00007fed2a3426d2
[task 2017-06-24T04:59:47.971876Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.973593Z] 04:59:47     INFO -   9  libxul.so!nsThread::ThreadFunc [nsThread.cpp:50673cd78a02 : 503 + 0x8]
[task 2017-06-24T04:59:47.975291Z] 04:59:47     INFO -      rbx = 0x00007fed1d4e26a0   rbp = 0x00007fed1a1fd900
[task 2017-06-24T04:59:47.983962Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd890   r12 = 0x00007fed1a1fd8c0
[task 2017-06-24T04:59:47.985669Z] 04:59:47     INFO -      r13 = 0x00007fed1d4e26c0   r14 = 0x00007fed1d4e29e0
[task 2017-06-24T04:59:47.987362Z] 04:59:47     INFO -      r15 = 0x00007fed1a1fe3d0   rip = 0x00007fed29f4a260
[task 2017-06-24T04:59:47.989017Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:47.990685Z] 04:59:47     INFO -  10  libnspr4.so!_pt_root [ptthread.c:50673cd78a02 : 216 + 0x7]
[task 2017-06-24T04:59:47.992409Z] 04:59:47     INFO -      rbx = 0x00007fed1d48c5c0   rbp = 0x00007fed1a1fd950
[task 2017-06-24T04:59:47.994101Z] 04:59:47     INFO -      rsp = 0x00007fed1a1fd910   r12 = 0x0000000000000000
[task 2017-06-24T04:59:47.995774Z] 04:59:47     INFO -      r13 = 0x0000000000002677   r14 = 0x00007fed1a1fe700
[task 2017-06-24T04:59:47.997457Z] 04:59:47     INFO -      r15 = 0x00007fed1a1fe3d0   rip = 0x00007fed2fd07a21
[task 2017-06-24T04:59:47.999071Z] 04:59:47     INFO -      Found by: call frame info
[task 2017-06-24T04:59:48.000745Z] 04:59:48     INFO -  11  libpthread-2.23.so + 0x76ba
[task 2017-06-24T04:59:48.002466Z] 04:59:48     INFO -      rbx = 0x0000000000000000   rbp = 0x0000000000000000
[task 2017-06-24T04:59:48.004123Z] 04:59:48     INFO -      rsp = 0x00007fed1a1fd960   r12 = 0x0000000000000000
[task 2017-06-24T04:59:48.005803Z] 04:59:48     INFO -      r13 = 0x00007ffcdf393b5f   r14 = 0x00007fed1a1fe9c0
[task 2017-06-24T04:59:48.007476Z] 04:59:48     INFO -      r15 = 0x00007fed1d48c5c0   rip = 0x00007fed2ff226ba
[task 2017-06-24T04:59:48.009108Z] 04:59:48     INFO -      Found by: call frame info
[task 2017-06-24T04:59:48.010788Z] 04:59:48     INFO -  12  libc-2.23.so + 0x10682d
[task 2017-06-24T04:59:48.016024Z] 04:59:48     INFO -      rsp = 0x00007fed1a1fda00   rip = 0x00007fed2628382d
[task 2017-06-24T04:59:48.017669Z] 04:59:48     INFO -      Found by: stack scanning
Flags: needinfo?(dd.mozilla)
Attached patch bug_1369509_v1.patch (obsolete) — Splinter Review
SetupConn can abandon a HalfOpen (e.g. when we find already connected h2 connection).
Attachment #8880358 - Attachment is obsolete: true
Flags: needinfo?(dd.mozilla)
Attachment #8881433 - Flags: review?(mcmanus)
Comment on attachment 8881433 [details] [diff] [review]
bug_1369509_v1.patch

Review of attachment 8881433 [details] [diff] [review]:
-----------------------------------------------------------------

as discussed, MOZ_DIAGNOSTIC_ASSERT will give us some non debug coverage without doing the asserts on release channels... after the problems are resolved we can revert it to MOZ_ASSERT
Attachment #8881433 - Flags: review?(mcmanus) → review+
Attachment #8881433 - Attachment is obsolete: true
Attachment #8881752 - Flags: review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/750bd7a2b8f5422760426e2b1c9baf00c807ba13
Bug 1369509 - Do not destroy nsHttpConnectionEntry if HalfOpenSock needs it. r=mcmanus
https://hg.mozilla.org/mozilla-central/rev/750bd7a2b8f5
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in before you can comment on or make changes to this bug.