Closed Bug 1522076 Opened 7 years ago Closed 7 years ago

Crash in mozilla::dom::AlternativeDataStreamListener::OnStopRequest

Categories

(Core :: Networking, defect, P1)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox-esr60 --- unaffected
firefox64 --- unaffected
firefox65 --- unaffected
firefox66 + fixed

People

(Reporter: calixte, Assigned: valentin)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression, Whiteboard: [necko-triaged])

Crash Data

This bug is for crash report bp-4be59300-454b-404f-b586-9cea60190123.

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::AlternativeDataStreamListener::OnStopRequest dom/fetch/FetchDriver.cpp:299
1 xul.dll void mozilla::net::HttpChannelChild::DoOnStopRequest netwerk/protocol/http/HttpChannelChild.cpp:1226
2 xul.dll mozilla::net::HttpChannelChild::ActorDestroy netwerk/protocol/http/HttpChannelChild.cpp:3803
3 xul.dll mozilla::net::PNeckoChild::DestroySubtree ipc/ipdl/PNeckoChild.cpp:2531
4 xul.dll void mozilla::dom::PContentChild::DestroySubtree ipc/ipdl/PContentChild.cpp:10190
5 xul.dll mozilla::dom::PContentChild::OnChannelClose ipc/ipdl/PContentChild.cpp:10020
6 xul.dll nsresult mozilla::detail::RunnableMethodImpl<mozilla::dom::ServiceWorkerRegistration*, void  xpcom/threads/nsThreadUtils.h:1158
7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1167
8 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:468
9 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:110

There are 9 crashes (from 4 installations) in nightly 66 with buildid 20190122215349. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1520062.

[1] https://hg.mozilla.org/mozilla-central/rev?node=1b6a90a3d820

Flags: needinfo?(valentin.gosu)

I think we call OnStartRequest and OnStopRequest twice! mStatus of the FetchDriver has to be LOADING (we would crash on the diag-assert otherwise) and it's set so in OnStartRequest. There is probably a call site (in http base channel?) where we don't set mOnStopRequestCalled after calling onstop on the listener.

Or it can be that simple as a sequence of:

  • FetchDriver has not yet received onstart/stop notification
  • someone calls FetchDriver->Cancel(), that nullifies mPipeAlternativeOutputStream
  • we call OnStart (does mStatus = AlternativeDataStreamListener::LOADING)
  • we call OnStop -> STATUS is "valid", stream is null -> CRASH

s/FetchDriver/AlternativeDataStreamListener/

And frankly, AlternativeDataStreamListener impl may use some revision, there is also a case for which ondata will not read from the given input stream, breaking the contract.

See Also: → 1522109

(In reply to Honza Bambas (:mayhemer) from comment #1)

I think we call OnStartRequest and OnStopRequest twice! mStatus of the FetchDriver has to be LOADING (we would crash on the diag-assert otherwise) and it's set so in OnStartRequest. There is probably a call site (in http base channel?) where we don't set mOnStopRequestCalled after calling onstop on the listener.

It's either that, or we actually get the ActorDestroy before OnStartRequest - I haven't seen it happening, but seems to be possible. In which case we'd end up only calling OnStopRequst, and mPipeAlternativeOutputStream would be null.
I think the best course is to backout bug 1520062, and try again there. A small leak before the process is destroyed is a less of a problem than crashing when the process is destroyed :)

Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu)
Priority: -- → P1
Whiteboard: [necko-triaged]
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
You need to log in before you can comment on or make changes to this bug.