Closed Bug 1641211 Opened 4 years ago Closed 4 years ago

Crash in [@ nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess]

Categories

(Core :: DOM: Content Processes, defect, P1)

Unspecified
Windows 10
defect

Tracking

()

VERIFIED FIXED
mozilla79
Fission Milestone M6a
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 78+ verified
firefox76 --- unaffected
firefox77 --- unaffected
firefox78 blocking verified
firefox79 + verified

People

(Reporter: sg, Assigned: kmag)

References

Details

(5 keywords, Whiteboard: [sec-survey])

Crash Data

Attachments

(4 files)

This bug is for crash report bp-64118ec8-115a-4698-bfc7-ff2ac0200527.

Top 10 frames of crashing thread:

0 xul.dll nsTArray_Impl<RefPtr<imgRequestProxy>, nsTArrayInfallibleAllocator>::AppendElementInternal<nsTArrayInfallibleAllocator, imgRequestProxy*&> xpcom/ds/nsTArray.h:2512
1 xul.dll static mozilla::dom::ContentParent::GetUsedBrowserProcess dom/ipc/ContentParent.cpp:823
2 xul.dll static mozilla::dom::ContentParent::GetNewOrUsedBrowserProcessInternal dom/ipc/ContentParent.cpp:892
3 xul.dll static mozilla::dom::ContentParent::GetNewOrUsedBrowserProcess dom/ipc/ContentParent.cpp:979
4 xul.dll static mozilla::dom::ContentParent::CreateBrowser dom/ipc/ContentParent.cpp:1221
5 xul.dll nsFrameLoader::TryRemoteBrowserInternal dom/base/nsFrameLoader.cpp:2579
6 xul.dll nsFrameLoader::TryRemoteBrowser dom/base/nsFrameLoader.cpp:2641
7 xul.dll nsFrameLoader::ShowRemoteFrame dom/base/nsFrameLoader.cpp:1018
8 xul.dll nsFrameLoader::Show dom/base/nsFrameLoader.cpp:889
9 xul.dll nsSubDocumentFrame::ShowViewer layout/generic/nsSubDocumentFrame.cpp:192

Reports started with Nightly build id 20200514094044 and then somewhat more consistently with build id 20200520213050. Most reports are from Windows, but one is from OS X as well (https://crash-stats.mozilla.org/report/index/800307b4-50af-4fea-b7e5-b99c30200515).

Component: DOM: Core & HTML → DOM: Content Processes

This looks like a UAF, so marking as a possible sec issue.

Assigning to kmag.

Assignee: nobody → kmaglione+bmo
Group: core-security
Severity: -- → S1
Fission Milestone: --- → M6a
Keywords: csectype-uaf
Priority: -- → P1
Group: core-security → dom-core-security
Keywords: sec-high
Keywords: topcrash

Comment on attachment 9152845 [details]
Bug 1641211: Part 1 - Add stronger checks for ContentParent pool membership consistency. r=nika

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Not easily. Assuming that this is a UAF, it should only ever happen at some unspecified point after a process has shut down and a new one is needed. This doesn't happen very often in practice outside of Fission, and the apparent UAF is even less common. Even in the rare instances that it does happen, crafting a targeted exploit would not be easy, if at all possible.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: It appears to exist in 77 onwards
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: Not very difficult, or very risky, but it would probably be better to wait until we have a proper fix, rather than just stricter assertions to narrow down the source of the problem.
  • How likely is this patch to cause regressions; how much testing does it need?: Not likely. It simply adds stricter assertions which, if they fail, imply that a UAF would otherwise happen at some point down the road.
Attachment #9152845 - Flags: sec-approval?
Attachment #9152846 - Flags: sec-approval?

Based on the crash volume in beta marking this as a blocker for 78.

Crash Signature: [@ nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess] → [@ nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ nsTArray_Impl<T>::AppendElementInternal<T> | nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess]

Comment on attachment 9152845 [details]
Bug 1641211: Part 1 - Add stronger checks for ContentParent pool membership consistency. r=nika

sec-approval+

Attachment #9152845 - Flags: sec-approval? → sec-approval+
Attachment #9152846 - Flags: sec-approval? → sec-approval+
Keywords: leave-open
See Also: → 1644231

Kris, as we're entering the last week of beta what are our options for this bug? Is there something we can back out of beta to get rid of these crashes?

Flags: needinfo?(kmaglione+bmo)

(In reply to Julien Cristau [:jcristau] from comment #7)

Kris, as we're entering the last week of beta what are our options for this bug? Is there something we can back out of beta to get rid of these crashes?

Not that I can think of. We still don't know what's causing this. The best candidate I can think of would be disabling async content process launch, but I don't really know if that would make a difference.

Flags: needinfo?(kmaglione+bmo)

There are 3 crashes in nightly 20200617215206 with signature mozilla::dom::ContentParent::AssertNotInPool and the moz_crash_reason is MOZ_RELEASE_ASSERT(!mIsInPool).

Crash Signature: [@ nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ nsTArray_Impl<T>::AppendElementInternal<T> | nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess] → [@ nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ nsTArray_Impl<T>::AppendElementInternal<T> | nsTArray_Impl<T>::AppendElementInternal<T> | mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ mozil…

I just hit this MOZ_RELEASE_ASSERT(!mIsInPool) twice in 79 Nightly build 2020-06-18 on Windows 10:

bp-a6be687b-3ded-4d39-9928-97d0e0200618
bp-63003951-c67a-44e1-945b-ba2e80200618

STR:

  1. Open about:crashcontent with Fission enabled.
  2. Quickly switch to a different tab before the "Gah. Your tab just crashed." crash reporter page is displayed.
Flags: needinfo?(kmaglione+bmo)

Comment on attachment 9157787 [details]
Bug 1641211: Clear crashed processes from the e10s preallocator. r=jesup

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Not easily. This can only be triggered when a web content process crashes at a time when we have fewer than the maximum number of content processes and don't have an existing preallocated process. Even then, the exploit would need to happen in a content process and still have enough control over the parent process to manipulate the freed memory.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Unknown
  • Which older supported branches are affected by this flaw?: 78
  • If not all supported branches, which bug introduced the flaw?: Unknown
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: Not difficult and not risky.
  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely. It adds some diagnostic assertions, but they won't affect release builds, and even if they did, they would only trip in cases where they would otherwise lead to a UAF.
Flags: needinfo?(kmaglione+bmo)
Attachment #9157787 - Flags: sec-approval?

Comment on attachment 9157787 [details]
Bug 1641211: Clear crashed processes from the e10s preallocator. r=jesup

sec-approval+

Attachment #9157787 - Flags: sec-approval? → sec-approval+
Crash Signature: mozilla::dom::ContentParent::AssertNotInPool] → mozilla::dom::ContentParent::AssertNotInPool] [@ mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ mozilla::dom::ContentParent::LaunchSubprocessAsync]
Crash Signature: mozilla::dom::ContentParent::AssertNotInPool] [@ mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ mozilla::dom::ContentParent::LaunchSubprocessAsync] → mozilla::dom::ContentParent::AssertNotInPool] [@ mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ mozilla::dom::ContentParent::LaunchSubprocessAsync] [@ mozilla::dom::ContentParent::MinTabSelect]
Crash Signature: mozilla::dom::ContentParent::AssertNotInPool] [@ mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ mozilla::dom::ContentParent::LaunchSubprocessAsync] [@ mozilla::dom::ContentParent::MinTabSelect] → mozilla::dom::ContentParent::AssertNotInPool] [@ mozilla::dom::ContentParent::GetUsedBrowserProcess] [@ mozilla::dom::ContentParent::LaunchSubprocessAsync] [@ mozilla::dom::ContentParent::MinTabSelect]

Comment on attachment 9158306 [details]
Bug 1641211: Clear crashed processes from the e10s preallocator. r=jesup

Beta/Release Uplift Approval Request

  • User impact if declined: This bug sometimes causes the parent process to crash shortly after a content process crashes. There may also be other UAF-related effects other than crashes.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It's a relatively simple change to remove a cached reference to a content process after it crashes so that we don't try to reuse it.
  • String changes made/needed:
Attachment #9158306 - Flags: approval-mozilla-beta?
Status: NEW → RESOLVED
Closed: 4 years ago
Keywords: leave-open
Resolution: --- → FIXED

Comment on attachment 9158306 [details]
Bug 1641211: Clear crashed processes from the e10s preallocator. r=jesup

Approved for 78.0rc1 (and esr78 by extension). Let's wait to resolve the bug until it's merged to m-c, though :-). Thanks for the patch!

Attachment #9158306 - Flags: approval-mozilla-beta? → approval-mozilla-release+
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79
Group: dom-core-security → core-security-release

As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.

Please visit this google form to reply.

Flags: needinfo?(kmaglione+bmo)
Whiteboard: [sec-survey]
Flags: needinfo?(kmaglione+bmo)
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

As per comment 12, reproduced instantly on Windows 10(1903) with 79.0a1(20200618212828).
Crash no longer encountered with 79.0a1 (2020-06-25), 78.0, 78.0esr(20200625154640).
Based on this, marking the bug as verified.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
Group: core-security-release
See Also: → 1677074
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: