Assertion failure: aTerminated || mDocument->GetReadyStateEnum() == Document::READYSTATE_LOADING (Bad readyState), at src/dom/base/nsContentSink.cpp:789
Categories
(Core :: Audio/Video, defect, P2)
Tracking
()
People
(Reporter: tsmith, Assigned: kinetik)
References
(Blocks 1 open bug)
Details
(Keywords: assertion, sec-moderate, Whiteboard: [post-critsmash-triage][adv-main100+r][adv-esr91.9+r])
Attachments
(3 files)
Found while fuzzing m-c 20220103-1cb2015e6fbc (--enable-debug --enable-fuzzing)
A reliable rescued testcase is unavailable. I was able to get an rr trace and I will attach a Pernosco session shortly.
Assertion failure: aTerminated || mDocument->GetReadyStateEnum() == Document::READYSTATE_LOADING (Bad readyState), at src/dom/base/nsContentSink.cpp:789
#0 0x7fca8f612cb7 in nsContentSink::DidBuildModelImpl(bool) src/dom/base/nsContentSink.cpp:787:3
#1 0x7fca8eb94bd9 in nsHtml5TreeOpExecutor::DidBuildModel(bool) src/parser/html/nsHtml5TreeOpExecutor.cpp:178:3
#2 0x7fca8eb965af in nsHtml5TreeOpExecutor::RunFlushLoop() src/parser/html/nsHtml5TreeOpExecutor.cpp:680:7
#3 0x7fca8eb9dfca in nsHtml5ExecutorReflusher::Run() src/parser/html/nsHtml5TreeOpExecutor.cpp:78:16
#4 0x7fca8d5e1ed2 in mozilla::SchedulerGroup::Runnable::Run() src/xpcom/threads/SchedulerGroup.cpp:140:20
#5 0x7fca8d6121de in mozilla::RunnableTask::Run() src/xpcom/threads/TaskController.cpp:467:16
#6 0x7fca8d5ec066 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:770:26
#7 0x7fca8d5ead28 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:606:15
#8 0x7fca8d5eafa3 in mozilla::TaskController::ProcessPendingMTTask(bool) src/xpcom/threads/TaskController.cpp:390:36
#9 0x7fca8d615216 in operator() src/xpcom/threads/TaskController.cpp:124:37
#10 0x7fca8d615216 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531:5
#11 0x7fca8d600933 in nsThread::ProcessNextEvent(bool, bool*) src/xpcom/threads/nsThread.cpp:1195:16
#12 0x7fca8d607a1a in NS_ProcessNextEvent(nsIThread*, bool) src/xpcom/threads/nsThreadUtils.cpp:467:10
#13 0x7fca8e0a7ce6 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:85:21
#14 0x7fca8dfcca47 in MessageLoop::RunInternal() src/ipc/chromium/src/base/message_loop.cc:331:10
#15 0x7fca8dfcc952 in RunHandler src/ipc/chromium/src/base/message_loop.cc:324:3
#16 0x7fca8dfcc952 in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:306:3
#17 0x7fca92291ed8 in nsBaseAppShell::Run() src/widget/nsBaseAppShell.cpp:137:27
#18 0x7fca942e9633 in XRE_RunAppShell() src/toolkit/xre/nsEmbedFunctions.cpp:870:20
#19 0x7fca8e0a8bda in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:235:9
#20 0x7fca8dfcca47 in MessageLoop::RunInternal() src/ipc/chromium/src/base/message_loop.cc:331:10
#21 0x7fca8dfcc952 in RunHandler src/ipc/chromium/src/base/message_loop.cc:324:3
#22 0x7fca8dfcc952 in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:306:3
#23 0x7fca942e8c6c in XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/nsEmbedFunctions.cpp:707:34
#24 0x558975a8f009 in content_process_main src/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#25 0x558975a8f009 in main src/browser/app/nsBrowserApp.cpp:327:18
#26 0x7fcaa240c0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
#27 0x558975a6a79c in _start (/home/worker/builds/m-c-20220130093554-fuzzing-debug/firefox-bin+0x1579c)
Reporter | ||
Comment 1•2 years ago
|
||
A Pernosco session is available here: https://pernos.co/debug/eKh9ZmlVAbHsViiD5V-TTg/index.html
Updated•2 years ago
|
Comment 2•2 years ago
|
||
The mForegroundCount
of the load group seems to underflow before the parse starts. Still looking further.
As an oddity that may be a distraction or may be important, it seems that Necko reports the content of the HTML document via OnDataAvailable
twice (to the same parser instance, so it doesn't look like a reload).
Comment 3•2 years ago
|
||
The underflow happens from here:
https://searchfox.org/mozilla-central/rev/38652b98c6dd3bf42403eeb8c5305902b9a6e938/netwerk/ipc/DocumentChannelChild.cpp#442
Comment 4•2 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #3)
The underflow happens from here:
https://searchfox.org/mozilla-central/rev/38652b98c6dd3bf42403eeb8c5305902b9a6e938/netwerk/ipc/DocumentChannelChild.cpp#442
Nevermind. The root cause underflow happens much earlier.
Comment 5•2 years ago
•
|
||
Some observations:
- The key lines for tracking the problem are
nsLoadGroup.cpp
lines 461 and 602 withthis==0x5650e4b23750
. (See the Pernosco Notebook for more.) - When things go wrong, we have a pending WebM request whose removal underflows
mForegroundCount
. - Necko appears to use the same
nsIRequest
address for a bunch of different subresources. Am I reading things wrong? Is this normal? - When a media element encounters a decode error, it takes the request out of the load group, changes flags, and puts it back.
- It seems bad that we have a single hashtable for background and foreground loads and a counter for tracking how many are foreground loads. Would it be bad to have two hashtables: one for background requests and another for foreground requests, and instead looking at
mForegroundCount
we'd look at the size of the foreground hashtable? It would be harder to confuse that kind of setup.
Needinfoing Valentin for point 3, Bryce for point 4, and smaug for point 5 for ideas.
Anyway, AFAICT, this isn't a parser bug but a bug earlier in subresource bookkeeping that manifests upon the parse finishing.
As for the reproducibility: The test case seems to repeatedly blow away the document by document.write
implying document.open
, followed by window.location.reload()
. Eventually the timing of the subresources differs enough to trigger the bug.
Comment 6•2 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #5)
- Necko appears to use the same
nsIRequest
address for a bunch of different subresources. Am I reading things wrong? Is this normal?
I haven't looked at the pernosco trace, but I assume the channel gets freed and another one gets allocated at the same address.
Updated•2 years ago
|
Comment 7•2 years ago
|
||
Sounds scary, but not very reliable. I'll mark it sec-moderate for now.
Comment 8•2 years ago
|
||
Where is the point 4 from comment 5 happening? Media code certainly is doing unusual stuff with flags and loadgroup.
https://searchfox.org/mozilla-central/rev/ad7ecfa618ec3a65db8405d9f1125059fe4a6a15/dom/media/BaseMediaResource.cpp#143
https://searchfox.org/mozilla-central/rev/ad7ecfa618ec3a65db8405d9f1125059fe4a6a15/dom/media/ChannelMediaResource.cpp#327,342
Point 5 is really a question to Necko peers.
Comment 9•2 years ago
|
||
This is suspicious:
Media code modifies the load flags
https://searchfox.org/mozilla-central/rev/ad7ecfa618ec3a65db8405d9f1125059fe4a6a15/dom/media/ChannelMediaResource.cpp#327,342 which calls
https://searchfox.org/mozilla-central/rev/ad7ecfa618ec3a65db8405d9f1125059fe4a6a15/dom/media/BaseMediaResource.cpp#143
That is done, I think, after the foreground counter has been decreased already in
https://searchfox.org/mozilla-central/source/netwerk/base/nsLoadGroup.cpp#602,614,624
but before the request is removed from the hashtable. So when the code in BaseMediaResource calls RemoveRequest, it decreases the counter again. (the pernosco session isn't available atm, so can't verify this from there.)
Anyhow, this feels like something very unusual happening in the media loading code.
Reporter | ||
Updated•2 years ago
|
Updated•2 years ago
|
Reporter | ||
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 10•2 years ago
•
|
||
I wonder if one issue is that BaseMediaResource::ModifyLoadFlags
handles the load group removal/addition incorrectly here - if loadGroup
is non-null but RemoveRequest
fails, ModifyLoadFlags
will go ahead and modify the channel's load flags while the channel is still in the load group. See also bug 910329 for a related instance of this.
It also appears to be unsafe to modify load group membership from OnStopRequest
, which is violated by ChannelMediaResource
here. In the Pernosco trace, I see a ChannelMediaResource::OnStopRequest
call leading to the BaseMediaResource::ModifyLoadFlags
failure case I mentioned above, where the channel's status and the result of RemoveRequest
are both NS_BINDING_ABORTED
. Solving that may be as simple as dispatching a new event to the main thread to call BaseMediaResource::ModifyLoadFlags
.
Assignee | ||
Comment 11•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 12•2 years ago
|
||
The patch in comment 11 addresses the ModifyLoadFlags
error handling mentioned in the first paragraph. I'm not sure if the issue mentioned in the second paragraph also needs to be addressed to solve this.
Comment 13•2 years ago
|
||
The severity field is not set for this bug.
:jimm, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•2 years ago
|
Comment 14•2 years ago
|
||
(In reply to Matthew Gregan [:kinetik] from comment #12)
The patch in comment 11 addresses the
ModifyLoadFlags
error handling mentioned in the first paragraph. I'm not sure if the issue mentioned in the second paragraph also needs to be addressed to solve this.
Probably worth a shot, but I'm not sure what to do to exercise this.
Comment 15•2 years ago
|
||
Landed: https://hg.mozilla.org/integration/autoland/rev/ef513b126e22fc25252063c2f5bb2aea09fe0020
Backed out for causing failures at test_autoplay_contentEditable.html:
https://hg.mozilla.org/integration/autoland/rev/fbbefd952570815fa946a35143d8548715f71aaf
Push with failures: https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry&revision=ef513b126e22fc25252063c2f5bb2aea09fe0020&selectedTaskRun=XCNS-kEDTnmkXm61QaKzCw.0
Failure log: https://treeherder.mozilla.org/logviewer?job_id=372628997&repo=autoland
ASSERTION: ModifyLoadFlags() failed!: 'NS_SUCCEEDED(rv)', file /builds/worker/checkouts/gecko/dom/media/ChannelMediaResource.cpp:343
Assignee | ||
Comment 16•2 years ago
|
||
I shouldn't have used NS_ASSERTION for the case in ChannelMediaResource::OnStopRequest, since we expect that to fail in this situation. I'll push a small fix.
At least this proves this code path is triggered in CI, so we have some test coverage for the fix.
Assignee | ||
Comment 17•2 years ago
|
||
Comment 18•2 years ago
|
||
Improve error handling in BaseMediaResource::ModifyLoadFlags. r=media-playback-reviewers,padenot
https://hg.mozilla.org/integration/autoland/rev/7ec903d74f9feb4dbcdfb4fd4489a840adc5b855
https://hg.mozilla.org/mozilla-central/rev/7ec903d74f9f
Quick build fix. CLOSED TREE
https://hg.mozilla.org/integration/autoland/rev/e6ef52fdc91bb3022a475244d1efa506c73a9501
https://hg.mozilla.org/mozilla-central/rev/e6ef52fdc91b
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 19•2 years ago
|
||
(In reply to Matthew Gregan [:kinetik] from comment #16)
I shouldn't have used NS_ASSERTION for the case in ChannelMediaResource::OnStopRequest, since we expect that to fail in this situation. I'll push a small fix.
This was also true for the assert added in BaseMediaResource::SetLoadInBackground, resulting in bug 1761865. I've changed that into a warning also.
Updated•2 years ago
|
Assignee | ||
Comment 20•2 years ago
|
||
Assignee | ||
Comment 21•2 years ago
|
||
Comment on attachment 9272573 [details]
Bug 1753298 - Improve error handling in BaseMediaResource::ModifyLoadFlags. r=padenot
Beta/Release Uplift Approval Request
- User impact if declined: Browser crash when a media load fails, possibly with confusing/unrelated stacks due to the load group being corrupted - possible UAF.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Low risk - small change adding error handling to channel flag handling.
- String changes made/needed: no
Updated•2 years ago
|
Comment 22•2 years ago
|
||
:kinetik The nominated patch failed to import. can you take a look?
Assignee | ||
Comment 23•2 years ago
|
||
Comment on attachment 9272573 [details]
Bug 1753298 - Improve error handling in BaseMediaResource::ModifyLoadFlags. r=padenot
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Simple/safe fix to address a crash and potential security issue.
- User impact if declined: Browser crash when a media load fails, possibly with confusing/unrelated stacks due to the load group being corrupted - possible UAF.
- Fix Landed on Version: 100
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Low risk - small change adding error handling to channel flag handling.
Assignee | ||
Comment 24•2 years ago
|
||
(In reply to Dianna Smith [:diannaS] from comment #22)
:kinetik The nominated patch failed to import. can you take a look?
Apologies, I requested the wrong approval - this patch is for esr91, not beta.
Comment 25•2 years ago
|
||
Comment on attachment 9272573 [details]
Bug 1753298 - Improve error handling in BaseMediaResource::ModifyLoadFlags. r=padenot
Approved for 91.9esr
Comment 26•2 years ago
|
||
uplift |
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Description
•