Crash in [@ mozilla::MozPromise<T>::ThenInternal | mozilla::MozPromise<T>::ThenCommand<T>::~ThenCommand]
Categories
(Core :: DOM: Navigation, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | unaffected |
firefox80 | --- | unaffected |
firefox81 | --- | unaffected |
firefox82 | --- | fixed |
People
(Reporter: gsvelto, Assigned: Gijs)
References
(Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
Crash report: https://crash-stats.mozilla.org/report/index/bd541ee5-2352-4b94-8af4-4c0980200905
Top 10 frames of crashing thread:
0 xul.dll mozilla::MozPromise<nsresult, nsresult, 1>::ThenInternal xpcom/threads/MozPromise.h:862
1 xul.dll mozilla::MozPromise<nsresult, nsresult, 1>::ThenCommand<mozilla::MozPromise<nsresult, nsresult, 1>::ThenValue<`lambda at /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:358:11', `lambda at /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:361:11'> >::~ThenCommand xpcom/threads/MozPromise.h:906
2 xul.dll static mozilla::MozPromise<nsresult, nsresult, 1>::All xpcom/threads/MozPromise.h:356
3 xul.dll nsWebBrowserPersist::EndDownload dom/webbrowserpersist/nsWebBrowserPersist.cpp:2304
4 xul.dll mozilla::detail::RunnableMethodImpl< xpcom/threads/nsThreadUtils.h:1240
5 xul.dll mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal xpcom/threads/TaskController.cpp:512
6 xul.dll mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:83:7'>::Run xpcom/threads/nsThreadUtils.h:577
7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1234
8 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:87
9 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:327
This crash appears to be a regression from bug 1658202. Gijs can you have a look?
The raw crash reason is MOZ_DIAGNOSTIC_ASSERT(!IsExclusive || !mHaveRequest) (Using an exclusive promise in a non-exclusive fashion)
. Note that the only relevant crashes are those in the nightly channel, the other ones are unrelated.
Comment 1•4 years ago
|
||
Interestingly the non-nightly crashes (both on Windows 7) seem to indicate a kind of UAF (EXCEPTION_IN_PAGE_ERROR_EXEC and EXCEPTION_ACCESS_VIOLATION_WRITE). So maybe the MOZ_DIAGNOSTIC_ASSERT
is just triggered by random memory content pointed to through some dangling ThenValue
pointer ?
Anyway, the MOZ_DIAGNOSTIC_ASSERT
has been re-enabled here: https://phabricator.services.mozilla.com/D14138. Any thoughts, Jean-Yves?
Reporter | ||
Comment 2•4 years ago
|
||
The non-nightly crashes have all different stacks so they appear to be unrelated. They also have very different crash reasons (EXCEPTION_IN_PAGE_ERROR_EXEC / STATUS_IO_TIMEOUT
, EXCEPTION_IN_PAGE_ERROR_READ / STATUS_DEVICE_DATA_ERROR
and EXCEPTION_PRIV_INSTRUCTION
) and addresses so they're probably coming from machines with flaky hardware.
Comment 3•4 years ago
|
||
That assert happens if you happen to run multiple Then() on an exclusive promise.
From the previous comment, it seems to be a race on mFileClosePromises member
I'm not familiar with this code, nor is there any documentations on which thread thread all of this is running.
However, we have nsWebBrowserPersist::OnStopRequest that I'm guessing run on the main thread and https://searchfox.org/mozilla-central/source/dom/webbrowserpersist/nsWebBrowserPersist.cpp#2304-2309
Which can run on the background taskqueue added by bug 1658202.
Worse, the various dispatches are made using NS_DispatchToCurrentThread(; the background taskqueue is running on top of a thread pool with multiple threads.
So NS_DispatchToCurrentThread isn't the method you want to use here as there's no guarantee this is the thread you want. All of this code should be using GetCurrentSerialEventTarget()->Dispatch(...) instead
Comment 4•4 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] (PTO until September 5th) from comment #2)
The non-nightly crashes have all different stacks so they appear to be unrelated. They also have very different crash reasons (
EXCEPTION_IN_PAGE_ERROR_EXEC / STATUS_IO_TIMEOUT
,EXCEPTION_IN_PAGE_ERROR_READ / STATUS_DEVICE_DATA_ERROR
andEXCEPTION_PRIV_INSTRUCTION
) and addresses so they're probably coming from machines with flaky hardware.
Sorry for the noise then. I cannot remove the security flag, it seems.
Assignee | ||
Comment 5•4 years ago
|
||
(In reply to Jean-Yves Avenard [:jya] from comment #3)
That assert happens if you happen to run multiple Then() on an exclusive promise.
From the previous comment, it seems to be a race on mFileClosePromises member
I'm not familiar with this code, nor is there any documentations on which thread thread all of this is running.
I'm confused; I did try to add documentation (next to the methods and data structures that are used away from the main thread) when I moved some of it away from the main thread. Where did you look and not find the documentation?
Specifically:
https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/dom/webbrowserpersist/nsWebBrowserPersist.cpp#102-112
https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/dom/webbrowserpersist/nsWebBrowserPersist.cpp#948-955
https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/dom/webbrowserpersist/nsWebBrowserPersist.h#168-169,176-177
However, we have nsWebBrowserPersist::OnStopRequest that I'm guessing run on the main thread
Correct.
and https://searchfox.org/mozilla-central/source/dom/webbrowserpersist/nsWebBrowserPersist.cpp#2304-2309
Which can run on the background taskqueue added by bug 1658202.
Which part of the highlighted lines, and why/how? All of the callers of EndDownload
are on the main thread, so I'd expect all of the highlighted lines to run on the main thread, too, as all the event targets they refer to will be the main thread, too?
Worse, the various dispatches are made using NS_DispatchToCurrentThread(; the background taskqueue is running on top of a thread pool with multiple threads.
I didn't add any of these, and all the callers of NS_DispatchToCurrentThread
that I see in that file are supposed to run on the main thread. The dispatches I added either use the queue or NS_DispatchToMainThread
...
So NS_DispatchToCurrentThread isn't the method you want to use here as there's no guarantee this is the thread you want. All of this code should be using GetCurrentSerialEventTarget()->Dispatch(...) instead
But this should be fine if the current thread is the main thread, right?
Assignee | ||
Comment 7•4 years ago
|
||
If I had to guess about what happens, here, it'd be that there's some existing confusion in the persist code that leads to us calling EndDownload
multiple times, and that violates the assert because we don't clear the promise array from the Then
call, so we try to Then
the promises again, which violates the assert in comment #0.
Comment 8•4 years ago
|
||
discussed off-line on slack.
IPDL actor runs on the main thread, so there's no issue than=t EndDownload is called from multiple threads.
Adding MOZ_ASSERT to clarify the code would help.
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 10•4 years ago
|
||
Set release status flags based on info from the regressing bug 1658202
Assignee | ||
Comment 11•4 years ago
|
||
AFAICT at least one case where we can call EndDownload
multiple times is when someone calls Cancel()
on the webbrowserpersist object. This sets mCancel
to true and calls EndDownload
directly.
But inside OnDataAvailable
, if mCancel
is true, we also call EndDownload
. After bug 1658202 this is dispatched to the main thread but it was a problem already.
I can't tell from the stacks in the crash reports whether this is the case triggering these crashes, but we may as well fix it, I guess...
Assignee | ||
Comment 12•4 years ago
|
||
OK, running tests with asserts found me at least one more issue: SerializeNextFile
, called when we complete any single file of many which we're persisting, dispatches FinishDownload
to main thread, which calls EndDownload
with NS_OK
. But we may have failed by then (e.g. because something called Cancel()
, or because one of the subresources was blocked), and already notified consumers that the download finished...
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 13•4 years ago
|
||
This fixes a few causes of firing EndDownload twice:
- firing it from OnDataAvailable if the download was already canceled when the
method was initially invoked, rather than if we are ourselves wanting to
cancel the download because we encountered issues inside OnDataAvailable. - firing it from FinishDownload, dispatched from SerializeNextFile, after the
download has already been ended elsewhere. - calling Cancel() multiple times.
It also adds some code to avoid the re-entrancy on the Promise code that the
bug was originally filed for, and a diagnostic assert to check if there are any
other cases of repeated EndDownload calling that we've missed.
As a driveby, it adds a few thread assertions to help with code clarity.
Comment 14•4 years ago
|
||
Pushed by gijskruitbosch@gmail.com: https://hg.mozilla.org/integration/autoland/rev/ef3f6fba28f9 fix ending webbrowserpersist downloading so it only happens once, r=valentin
Comment 15•4 years ago
|
||
bugherder |
Description
•