Closed Bug 1690169 Opened 3 years ago Closed 3 years ago

Assertion failure: mDocument->GetReadyStateEnum() == Document::READYSTATE_LOADING, at src/dom/script/ScriptLoader.cpp:4259

Categories

(Core :: DOM: Core & HTML, defect)

defect

Tracking

()

VERIFIED FIXED
87 Branch
Tracking Status
firefox-esr78 87+ verified
firefox85 --- wontfix
firefox86 + wontfix
firefox87 + verified

People

(Reporter: tsmith, Assigned: hsivonen)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, sec-moderate, testcase, Whiteboard: [bugmon:bisected,confirmed][adv-main87+r][post-critsmash-triage][adv-esr78.9+r])

Attachments

(3 files)

Attached file testcase.html —

Assertion failure: mDocument->GetReadyStateEnum() == Document::READYSTATE_LOADING, at src/dom/script/ScriptLoader.cpp:4259

#0 0x7f773c737adf in mozilla::dom::ScriptLoader::AddDeferRequest(mozilla::dom::ScriptLoadRequest*) src/dom/script/ScriptLoader.cpp:4259:5
#1 0x7f773c7358e0 in mozilla::dom::ScriptLoader::ProcessExternalScript(nsIScriptElement*, mozilla::dom::ScriptKind, nsTAutoStringN<char16_t, 64ul>, nsIContent*) src/dom/script/ScriptLoader.cpp:1887:5
#2 0x7f773c724170 in mozilla::dom::ScriptLoader::ProcessScriptElement(nsIScriptElement*) src/dom/script/ScriptLoader.cpp:1748:12
#3 0x7f773c7237dd in mozilla::dom::ScriptElement::MaybeProcessScript() src/dom/script/ScriptElement.cpp:118:18
#4 0x7f773c5dd7d0 in AttemptToExecute /builds/worker/workspace/obj-build/dist/include/nsIScriptElement.h:211:18
#5 0x7f773c5dd7d0 in nsXMLContentSink::CloseElement(nsIContent*) src/dom/xml/nsXMLContentSink.cpp:565:24
#6 0x7f773c5dfa43 in nsXMLContentSink::HandleEndElement(char16_t const*, bool) src/dom/xml/nsXMLContentSink.cpp:1049:12
#7 0x7f77395cfd18 in nsExpatDriver::HandleEndElement(void*, char16_t const*) src/parser/htmlparser/nsExpatDriver.cpp:349:32
#8 0x7f773d6a10b5 in doContent src/parser/expat/lib/xmlparse.c:2978:11
#9 0x7f773d69f033 in contentProcessor src/parser/expat/lib/xmlparse.c:2528:27
#10 0x7f773d69aff9 in MOZ_XML_ResumeParser src/parser/expat/lib/xmlparse.c:2162:15
#11 0x7f77395d3a0f in nsExpatDriver::ParseBuffer(char16_t const*, unsigned int, bool, unsigned int*) src/parser/htmlparser/nsExpatDriver.cpp:901:16
#12 0x7f77395d3fee in nsExpatDriver::ConsumeToken(nsScanner&, bool&) src/parser/htmlparser/nsExpatDriver.cpp:998:5
#13 0x7f77395d9952 in nsParser::Tokenize(bool) src/parser/htmlparser/nsParser.cpp:1410:25
#14 0x7f77395d864a in nsParser::ResumeParse(bool, bool, bool) src/parser/htmlparser/nsParser.cpp:961:45
#15 0x7f77395d8466 in nsParser::ContinueInterruptedParsing() src/parser/htmlparser/nsParser.cpp:597:12
#16 0x7f773c5e5066 in applyImpl<nsXMLContentSink, void (nsXMLContentSink::*)()> /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:1148:12
#17 0x7f773c5e5066 in apply<nsXMLContentSink, void (nsXMLContentSink::*)()> /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:1154:12
#18 0x7f773c5e5066 in mozilla::detail::RunnableMethodImpl<nsXMLContentSink*, void (nsXMLContentSink::*)(), true, (mozilla::RunnableKind)0>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:1201:13
#19 0x7f7738303132 in mozilla::SchedulerGroup::Runnable::Run() src/xpcom/threads/SchedulerGroup.cpp:146:20
#20 0x7f77383091ef in mozilla::RunnableTask::Run() src/xpcom/threads/TaskController.cpp:459:16
#21 0x7f77383077ca in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:739:26
#22 0x7f7738306874 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) src/xpcom/threads/TaskController.cpp:598:15
#23 0x7f7738306a27 in mozilla::TaskController::ProcessPendingMTTask(bool) src/xpcom/threads/TaskController.cpp:382:36
#24 0x7f773830cab6 in operator() src/xpcom/threads/TaskController.cpp:123:37
#25 0x7f773830cab6 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_3>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:534:5
#26 0x7f773831e195 in nsThread::ProcessNextEvent(bool, bool*) src/xpcom/threads/nsThread.cpp:1200:14
#27 0x7f773832425a in NS_ProcessNextEvent(nsIThread*, bool) src/xpcom/threads/nsThreadUtils.cpp:548:10
#28 0x7f7738c316e6 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:87:21
#29 0x7f7738b9d3d3 in MessageLoop::RunInternal() src/ipc/chromium/src/base/message_loop.cc:334:10
#30 0x7f7738b9d2ed in RunHandler src/ipc/chromium/src/base/message_loop.cc:327:3
#31 0x7f7738b9d2ed in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:309:3
#32 0x7f773c959a38 in nsBaseAppShell::Run() src/widget/nsBaseAppShell.cpp:137:27
#33 0x7f773e1748b3 in XRE_RunAppShell() src/toolkit/xre/nsEmbedFunctions.cpp:902:20
#34 0x7f7738c325cc in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) src/ipc/glue/MessagePump.cpp:237:9
#35 0x7f7738b9d3d3 in MessageLoop::RunInternal() src/ipc/chromium/src/base/message_loop.cc:334:10
#36 0x7f7738b9d2ed in RunHandler src/ipc/chromium/src/base/message_loop.cc:327:3
#37 0x7f7738b9d2ed in MessageLoop::Run() src/ipc/chromium/src/base/message_loop.cc:309:3
#38 0x7f773e174488 in XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/nsEmbedFunctions.cpp:733:34
#39 0x5585f2df3ef6 in content_process_main src/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#40 0x5585f2df3ef6 in main src/browser/app/nsBrowserApp.cpp:306:18
#41 0x7f774d05f0b2 in __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:308:16
#42 0x5585f2dd1c9c in _start (/home/worker/builds/m-c-20210117095130-fuzzing-debug/firefox-bin+0x14c9c)
Flags: in-testsuite?

A Pernosco session is available here: https://pernos.co/debug/08ta-GcIdivE-NfsJZLXZQ/index.html

Bugmon Analysis:
Verified bug as reproducible on mozilla-central 20210202033500-babdc3b3a300.
The bug appears to have been introduced in the following build range:

Start: 77dc70a7ac251056a60abbcdb89c8db32a79a253 (20210113213439)
End: 0f5e4a3c6f0a6ab0fcd77c46cedc819e2bd0b3f8 (20210115035053)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=77dc70a7ac251056a60abbcdb89c8db32a79a253&tochange=0f5e4a3c6f0a6ab0fcd77c46cedc819e2bd0b3f8

Whiteboard: [bugmon:bisected,confirmed]

Hi Henri,
Could you help to take a look? Thank you.

Severity: -- → S3
Flags: needinfo?(hsivonen)

It seems to me that DocumentChannel removes its request from the nsLoadGroup causing the end-of-load machinery to act even though there's still pending input in the XML parser's buffer, because the parsing was blocked on a script even though the stream ended (and got buffered up).

Redirecting needinfo based on the DocumentChannel aspect.

(In reply to bugmon from comment #2)

The bug appears to have been introduced in the following build range:

Start: 77dc70a7ac251056a60abbcdb89c8db32a79a253 (20210113213439)
End: 0f5e4a3c6f0a6ab0fcd77c46cedc819e2bd0b3f8 (20210115035053)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=77dc70a7ac251056a60abbcdb89c8db32a79a253&tochange=0f5e4a3c6f0a6ab0fcd77c46cedc819e2bd0b3f8

I didn't notice anything immediate obvious here.

Flags: needinfo?(hsivonen) → needinfo?(matt.woodrow)

Do you know what the expected behaviour is here?

It looks like DocumentChannel shut itself down with NS_BINDING_RETARGETED, which means that we handled the response in the parent (for a download).

Flags: needinfo?(matt.woodrow) → needinfo?(hsivonen)

Hmm. Why does this involve the XML parser at all? I'll take another look at the Pernosco trace.

(In reply to Henri Sivonen (:hsivonen) from comment #6)

Hmm. Why does this involve the XML parser at all?

We're parsing aboutNetError.xhtml.

Flags: needinfo?(hsivonen)

So first the iframe starts loading aboutNetError.xhtml. Then, while it's still loading, the src attribute is set to a different value. This ends up running stopping steps with STOP_NETWORK instead of STOP_ALL (which would include STOP_CONTENT). This means that we don't stop the parser, but we do remove the load blocker pseudo-request from the load group. Bad stuff happens if the load blocker is removed by anything but actions directed by the parser itself.

STOP_NETWORK alone being enough to remove the load blocker from the load group seems very, very bad. I'm surprised everything hasn't blown up much more visibly much earlier.

smaug, do you have ideas of what the most appropriate fix would be?

Group: dom-core-security

Should the parser/contentsink perhaps use its internal nsIRequest implementation and handle Cancel() call in some reasonable way?
Then replace the current mDocument->BlockOnload() call with something like
mDocument->GetDocumentLoadGroup()->AddRequest(that_new_request);
That should be relatively easy to implement.

For crashtest, we should make the WOFF base64 part shorter.

I'm having trouble getting the test case crash as a crashtest without the patch.

Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Attached file Bug 1690169 test. —

Comment on attachment 9202340 [details]
Bug 1690169 - StopDocumentLoad even on STOP_NETWORK.

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Not immediate, but should be doable with some thinking by a person determined to figure out what the patch fixes.

It's unclear what exploiting this bug would accomplish. This is a major violation of the invariants of the parsers about what they believe the state of the rest of Gecko to be, but it's unclear to me what nefarious things can be done with that invariant violation.

  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which older supported branches are affected by this flaw?: all
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?: The same patch should apply to old branches.

The bisection range in the bug suggests that the being able to reach the bug in the reported way is a recent thing. However, the part of code being patched is ancient, so it's quite possible that there are other ways to reach the problem that apply to Gecko from the last 20 years.

  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely. If we had test cases unintentionally depending on this, they should have asserted already. However, the patch hasn't been on try.
Attachment #9202340 - Flags: sec-approval?

What is the consequence of this bug? We're adding a deferred script load when we should have just loaded it? The document stays in a loading state? Trying to judge whether this is something we should backport or if it's not all that bad (that is, what sec- rating should it have?)

(In reply to Daniel Veditz [:dveditz] from comment #16)

What is the consequence of this bug?

The XML parser keeps parsing when the DOM has transitioned to a state where it assumes the parser is done. It seems plausible that the bug affects HTML parsing, too. That is, this is a violation of invariants related to the interaction of the parser and the DOM, but I don't have anything concrete to explain how it would be exploited for nefarious purposes. I'm not even sure that this is a security bug. However, it looks concerning enough to me that I think we should backport.

Comment on attachment 9202340 [details]
Bug 1690169 - StopDocumentLoad even on STOP_NETWORK.

sec-approval=dveditz

Attachment #9202340 - Flags: sec-approval? → sec-approval+

This is a sec-moderate and we are building 86RC today, could this fix wait a week and target the 87 release? Thanks

(In reply to Pascal Chevrel:pascalc from comment #19)

This is a sec-moderate and we are building 86RC today, could this fix wait a week and target the 87 release? Thanks

This has been on autoland for several hours, although the bots don't say so for bugs marked as security bugs. Is it OK that this stays there, or should this be backed out?

Flags: needinfo?(pascalc)

(In reply to Henri Sivonen (:hsivonen) from comment #20)

(In reply to Pascal Chevrel:pascalc from comment #19)

This is a sec-moderate and we are building 86RC today, could this fix wait a week and target the 87 release? Thanks

This has been on autoland for several hours, although the bots don't say so for bugs marked as security bugs. Is it OK that this stays there, or should this be backed out?

No it's fine, nevermind, it's not a sec-high and you said yourself in comment #17 that it's maybe not even a security issue so it can land, sorry for the noise :)

Flags: needinfo?(pascalc)
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch
Group: dom-core-security → core-security-release
Whiteboard: [bugmon:bisected,confirmed] → [bugmon:bisected,confirmed][adv-main87+r]

Comment on attachment 9202340 [details]
Bug 1690169 - StopDocumentLoad even on STOP_NETWORK.

Beta/Release Uplift Approval Request

  • User impact if declined: Parser can keep accessing the DOM the way it does when parsing even though the DOM tree is in a "parsing finished" state. This is a blatant violation of the assumptions, but it's unclear if there are actual security implications.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch stops the parser when we stop the network, and it's very clear that not doing that is bad.
  • String changes made/needed: None.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Violation of parsing-related assumptions.
  • User impact if declined: Parser can keep accessing the DOM the way it does when parsing even though the DOM tree is in a "parsing finished" state. This is a blatant violation of the assumptions, but it's unclear if there are actual security implications.
  • Fix Landed on Version: 87
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch stops the parser when we stop the network, and it's very clear that not doing that is bad.
  • String or UUID changes made by this patch: None.
Attachment #9202340 - Flags: approval-mozilla-esr78?
Attachment #9202340 - Flags: approval-mozilla-beta?

Comment on attachment 9202340 [details]
Bug 1690169 - StopDocumentLoad even on STOP_NETWORK.

87 merges to beta on monday, and we already built the 86 release candidates

Attachment #9202340 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Flags: qe-verify+
Whiteboard: [bugmon:bisected,confirmed][adv-main87+r] → [bugmon:bisected,confirmed][adv-main87+r][post-critsmash-triage]
QA Whiteboard: [qa-triaged]

Hello Tyson, I tried to reproduce this issue on an Ubuntu 18.04 using older Nightly builds (87.0a1 (2021-02-02)) as well as newer Nightly builds by dragging the attached test case into the browser.
At the pop up panel I have the option to save the file or open it, dismissing, saving or ignoring the file makes no difference.
I cannot see any errors in terminal or Browser console, do we need a special build or are there any steps that might help us reproduce this issue on our side ?

Flags: needinfo?(twsmith)

You need to allow popup windows and then reload for the assertion to be triggered in old builds.

Flags: needinfo?(twsmith)

I tried these steps and I only got the following errors in Browser console :

Beta 86.0b3

TypeError: this.contentWindow is null PopupBlockingChild.jsm:28:5
didDestroy resource://gre/actors/PopupBlockingChild.jsm:28

TypeError: browser is nullBrowserElementParent.jsm:24:21
receiveMessage resource://gre/actors/BrowserElementParent.jsm:24

Release 85.0.2
TypeError: browser is nullBrowserElementParent.jsm:24:21
receiveMessage resource://gre/actors/BrowserElementParent.jsm:24

Nightly 87.0a1 (2021-02-02)
TypeError: can't access property "removeEventListener", this.contentWindow is null
PopupBlockingChild.jsm:28:5
didDestroy resource://gre/actors/PopupBlockingChild.jsm:28

TypeError: can't access property "ownerGlobal", browser is null
BrowserElementParent.jsm:24:21
receiveMessage resource://gre/actors/BrowserElementParent.jsm:24

TypeError: can't access dead objectbrowsing-context.js:1889:15
onLoad resource://devtools/server/actors/targets/browsing-context.js:1889

Latest Nightly 87.0a1 (2021-02-22)
TypeError: can't access property "ownerGlobal", browser is nullBrowserElementParent.jsm:24:21
receiveMessage resource://gre/actors/BrowserElementParent.jsm:24
(Async: JSActor query)
handleEvent resource://gre/actors/BrowserElementChild.jsm:15

These are the only errors that appear in Browser Console, in Beta and Release I would only get the Browser is Null error.

In Old Nightly I would get the "can't access property" and "can't access dead object" while in the latest Nightly build it would only show "can't access property ownerGlobal, browser is null"

Was the problem the "Browser is Null" error or the "can't access dead object" ? or there shouldnt be any errors at all on reload ?
Please note that nothing shows up in Terminal, no errors at all on either build.

Flags: needinfo?(twsmith)

Were these debug builds? This needs a build with assertion enabled for a clear repro.

Flags: needinfo?(rares.doghi)

Rares,

As mentioned by Henri you will need a debug build, if you are not already using it fuzzfetch can help[1]. You can also use Grizzly replay[2] to help with verification. Both tools can be pip installed.

[1] https://github.com/MozillaSecurity/fuzzfetch
[2] https://github.com/MozillaSecurity/grizzly/wiki/grizzly-replay

Flags: needinfo?(twsmith)

Hi Tyson, sorry for the late reply, I was caught up with meetings, I was able to reproduce this issue on a build from February 12th using a debug build where I would get the Assertion error but also the Tab would crash after reloading the page.

This issue is Verfied as fixed in our latest nightly build 88.0a1 (2021-02-22) I got from :
https://treeherder.mozilla.org/jobs?repo=mozilla-central&collapsedPushes=866758&fromchange=32690d048b75cc54ead9118c98333d5442d2c6be&selectedTaskRun=JZ0CPjmtSiejMOcnbIhE2g.0

On Windows 10, Ubuntu 20.04 as well as Mac OSX 10.15.

Flags: needinfo?(rares.doghi)

Comment on attachment 9202340 [details]
Bug 1690169 - StopDocumentLoad even on STOP_NETWORK.

Approved for 78.9esr.

Attachment #9202340 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+

This issue is verified as fixed in our latest Firefox 78.9.0esr on Windows 10, Ubuntu 20.04 and Mac OSX 10.15.

Status: RESOLVED → VERIFIED
QA Whiteboard: [qa-triaged]
Flags: qe-verify+
Whiteboard: [bugmon:bisected,confirmed][adv-main87+r][post-critsmash-triage] → [bugmon:bisected,confirmed][adv-main87+r][post-critsmash-triage][adv-esr78.9+r]

:hsivonen, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

Flags: needinfo?(hsivonen)

(In reply to Release mgmt bot [:sylvestre / :calixte / :marco for bugbug] from comment #34)

:hsivonen, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.

AFAICT, the bug itself is ancient. The regression range is wide enough that I don't know what in it unmasked the bug.

Flags: needinfo?(hsivonen)

dveditz, is it OK to land the test now?

Flags: needinfo?(dveditz)

Yes, thanks for asking. Generally unless the tests are a working exploit it's OK to land them 2 or 3 weeks after the fix is shipped in Release.

Please set the in-testsuite flag to + after you land.

Group: core-security-release
Flags: needinfo?(dveditz)
Flags: in-testsuite? → in-testsuite+
Keywords: bugmon
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: