Closed Bug 1451293 Opened 2 years ago Closed 9 months ago

Crash in mozilla::net::nsHttpConnection::DontReuse

Categories

(Core :: Networking, defect, P2, critical)

56 Branch
Unspecified
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox-esr60 --- wontfix
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- fixed

People

(Reporter: philipp, Assigned: kershaw)

References

(Depends on 1 open bug)

Details

(Keywords: crash, Whiteboard: [necko-triaged])

Crash Data

Attachments

(2 files, 1 obsolete file)

This bug was filed from the Socorro interface and is
report bp-bed8e1f5-9467-4205-942b-1dbfb0180404.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll mozilla::net::nsHttpConnection::DontReuse netwerk/protocol/http/nsHttpConnection.cpp:958
1 xul.dll mozilla::net::nsHttpChannel::OnStopRequest netwerk/protocol/http/nsHttpChannel.cpp:7131
2 xul.dll nsInputStreamPump::OnStateStop netwerk/base/nsInputStreamPump.cpp:700
3 xul.dll nsInputStreamPump::OnInputStreamReady netwerk/base/nsInputStreamPump.cpp:428
4 xul.dll nsInputStreamReadyEvent::Run xpcom/io/nsStreamUtils.cpp:97
5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
6 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:97
7 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:319
8 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:299
9 xul.dll nsBaseAppShell::Run widget/nsBaseAppShell.cpp:157

=============================================================

crash reports with a signature and a stack similar to this are showing up on windows since firefox 56 - tentatively related to bug 1351462.
overall it's happening with a fairly low volume on release though.
Someone is still manipulating (means: refs+-) nsHttpTransaction::mConnection outside nsHttpTransaction::mLock.

That needs to be found and fixed.

We have improved this in bug 1011354, but I think we need to make more finer audit.
See Also: → 1011354
This is very cyclic (incredibly low crash rate during the weekend), and looks to be most commonly happening on ESR builds, which says to me there's some sort of strange enterprise environment making this more likely. Also an indicator that this could be quite tricky to track down.
Priority: -- → P2
Whiteboard: [necko-triaged]
Depends on: 1505861
Kershaw -- can you have a look at this please? See comment #1.
Assignee: nobody → kershaw
Flags: needinfo?(kershaw)
(In reply to Selena Deckelmann :selenamarie :selena use ni? pronoun: she from comment #3)
> Kershaw -- can you have a look at this please? See comment #1.

Sure.
Flags: needinfo?(kershaw)
Status: NEW → ASSIGNED
A possible reason of crash could be ConnectionHandle::TakeHttpConnection() is called in sts thread and someone wants to call nsHttpConnection::DontReuse from main thread.
To avoid the crash, this patch simply adds some null checks in ConnectionHandle.
Attachment #9034749 - Attachment is obsolete: true

I think there could be a race between ConnectionHandle::TakeHTTPConnection [1] which is called in socket thread and ConnectionHandle::DontReuse which is called in main thread.

Honza, do you think it's worth to add a lock here?

[1] https://searchfox.org/mozilla-central/rev/f8de61826903996f6bdf41b11a2844dd59ac144f/netwerk/protocol/http/nsHttpConnectionMgr.cpp#1874

Flags: needinfo?(honzab.moz)

Good one! We MUST enforce either of those:

  • single thread access to ConnectionHandle::mConn (with an assert)
  • lock access to it, everywhere

I would rather try to go the way of single thread access, i.e. by posting DontReuse to the socket thread. this has been done for h2 session at [1], but this really has to be in the handle. Just move it?

[1] https://searchfox.org/mozilla-central/rev/c3ebaf6de2d481c262c04bb9657eaf76bf47e2ac/netwerk/protocol/http/Http2Session.cpp#692-706

Flags: needinfo?(honzab.moz)
Pushed by kjang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/83f419699bf1
single thread access to ConnectionHandle::mConn r=mayhemer
https://hg.mozilla.org/integration/autoland/rev/51df10abf1da
P2: Call SetConnRefTaken r=michal
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Duplicate of this bug: 1505834
You need to log in before you can comment on or make changes to this bug.