Closed Bug 1433557 Opened 2 years ago Closed 2 years ago

Crash in mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::SetFastOpenStatus

Categories

(Core :: Networking: HTTP, defect, P1, critical)

59 Branch
x86_64
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- disabled
firefox60 --- fixed

People

(Reporter: philipp, Assigned: mayhemer)

References

Details

(Keywords: crash, regression, Whiteboard: [necko-triaged])

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is
report bp-4562b2f1-e736-4bb8-bbc2-deafc0180126.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll mozilla::net::nsHttpConnectionMgr::nsHalfOpenSocket::SetFastOpenStatus netwerk/protocol/http/nsHttpConnectionMgr.cpp:4708
1 xul.dll mozilla::net::nsSocketTransport::InitiateSocket netwerk/base/nsSocketTransport2.cpp:1570
2 xul.dll mozilla::net::nsSocketTransport::OnSocketEvent netwerk/base/nsSocketTransport2.cpp:2100
3 xul.dll mozilla::net::nsSocketEvent::Run netwerk/base/nsSocketTransport2.cpp:90
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
5 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:517
6 xul.dll mozilla::net::nsSocketTransportService::Run netwerk/base/nsSocketTransportService2.cpp:961
7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
8 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:517
9 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:364

=============================================================

hi, this crash signature that got fixed once in the past in bug 1362961 is reappearing now that 59.0b is rolling out to the beta population, so far only on win64 builds on windows 10.

when looking into a number of crash report manually, they all seemed to have modules relating to trend micro hooking into the firefox process (TmUmEvt64.dll, tmmon64.dll).
Flags: needinfo?(dd.mozilla)
Flags: needinfo?(dd.mozilla)
Flags: needinfo?(honzab.moz)
Assignee: nobody → honzab.moz
Flags: needinfo?(honzab.moz)
Priority: -- → P1
Whiteboard: [necko-triaged]
I think what's happening here is that the transaction or the connection is closed before we trigger this point what nullifies the conn->trans pointer.  This sounds to me as a valid possibility with a very low probability.

A null check would probably do.  My local debug build is currently broken, so I can't investigate a bit more.
Status: NEW → ASSIGNED
Attached patch v1 (obsolete) — Splinter Review
Valentin, I'm not sure how much you are/were involved in TFO, but this should be a simple thing to review.  

Let you know that the place we crash at is called asynchronously (host resolution) potentially long ago after we have assigned the TFO connection on the half-open at https://searchfox.org/mozilla-central/rev/9d47b5fb14152865ad1375eb3ee2e571d81ecdb9/netwerk/protocol/http/nsHttpConnectionMgr.cpp#4948

feel free to forward to nick, he may be more familiar.

thanks.
Attachment #8951006 - Flags: review?(valentin.gosu)
Comment on attachment 8951006 [details] [diff] [review]
v1

Review of attachment 8951006 [details] [diff] [review]:
-----------------------------------------------------------------

Seems fair.

::: netwerk/protocol/http/nsHttpConnectionMgr.cpp
@@ +4705,5 @@
>  {
>      MOZ_ASSERT(mFastOpenInProgress);
>      mFastOpenStatus = tfoStatus;
>      mConnectionNegotiatingFastOpen->SetFastOpenStatus(tfoStatus);
> +    if (mConnectionNegotiatingFastOpen->Transaction()) {

Can the transaction be nullified on another thread between these two calls?
I haven't checked, but we should make sure.
Attachment #8951006 - Flags: review?(valentin.gosu) → review+
(In reply to Valentin Gosu [:valentin] from comment #4)
> Can the transaction be nullified on another thread between these two calls?
> I haven't checked, but we should make sure.

good point!  I went through the code and there are numerous code paths (at least socket::close is one hard to track).  but I presume the manipulation can only happen on the socket thread.  not sure if adding an assertion here is of any help...  but adding on all places we assign/drop mTransaction would probably work.  I can find time and do it.
Attached patch v1.1Splinter Review
I've added few thread assertions on places I found the transaction could be RELEASED (not assigned).

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5cd5ef3dcf6d2766d07e95a2bcb05a89db78fcc5
Attachment #8951006 - Attachment is obsolete: true
Attachment #8951644 - Flags: review+
Try looks good.
Keywords: checkin-needed
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/51b090021867
Add a null-check for transaction when setting TCP Fast Open state on an associated connection, r=valentin
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/51b090021867
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
TFO is disabled on Beta59, right?
Flags: needinfo?(honzab.moz)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #10)
> TFO is disabled on Beta59, right?

yes:

https://hg.mozilla.org/releases/mozilla-beta/file/6da57d64acd8/modules/libpref/init/all.js#l4820
Flags: needinfo?(honzab.moz)
Thanks, setting the Fx59 status to disabled then.
You need to log in before you can comment on or make changes to this bug.