Closed Bug 1575257 Opened 6 years ago Closed 1 year ago

Crash in [@ mozilla::ipc::IProtocol::DestroySubtree]

Tracking

()

Status:

RESOLVED INCOMPLETE

Tracking Flags:

Tracking

Status

firefox-esr60

---

unaffected

firefox-esr68

---

unaffected

firefox68

---

unaffected

firefox69

---

wontfix

firefox70

---

wontfix

firefox71

---

wontfix

People

(Reporter: philipp, Assigned: jld)

References

(Regression)

Details

(4 keywords)

Crash Data

[:philipp]

Reporter

Description

•

6 years ago

This bug is for crash report bp-6109a158-bce5-44bd-94df-3160f0190820.

Top 10 frames of crashing thread:

0  @0x771e5600 
1 xul.dll mozilla::ipc::IProtocol::DestroySubtree ipc/glue/ProtocolUtils.cpp:659
2 xul.dll mozilla::dom::PContentParent::OnChannelError ipc/ipdl/PContentParent.cpp:12291
3 xul.dll void mozilla::dom::ContentParent::OnChannelError dom/ipc/ContentParent.cpp:1511
4 xul.dll nsresult mozilla::detail::RunnableMethodImpl<mozilla::dom::Performance*, void  xpcom/threads/nsThreadUtils.h:1176
5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1225
6 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
7 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:88
8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
9 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290

this crash signature is starting to show up sporadically during the 69.0b cycle, it might have regressed with bug 1540731.

Andrew McCreight [:mccr8]

Updated

•

6 years ago

Group: core-security → dom-core-security

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 1

•

6 years ago

It's hard to tell what's going on. Some of them are reading from a null pointer here, some of them are jumping to a bad address after ActorDestroy called from here has set up the frame pointer, and one of them is reading unmapped memory for IProtocol::mLinkStatus here (note unaligned address).

I wonder if there's a race condition with one of ContentParent's managees being destroyed on the wrong thread, or something like that.

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 2

•

6 years ago

It's worth noting that bug 1540731 fixed at least one and probably several use-after-free bugs — see also bug 1471154 — and this crash appears to be relatively low-volume.

(Making this a separate comment just in case it needs to be hidden after the rest of the bug is public.)

Daniel Veditz [:dveditz]

Updated

•

6 years ago

Keywords: sec-high

Daniel Veditz [:dveditz]

Updated

•

6 years ago

Assignee: nobody → jld

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 3

•

6 years ago

Nika, do you know if there are any DOM actors with sketchy lifetime stuff that we should be suspecting as the cause of this? There doesn't seem to be much actionable info in the crash reports themselves.

Flags: needinfo?(nika)

Nika Layzell [:nika] (ni? for response)

Comment 4

•

6 years ago

Given the time frame which it started occuring in, this is probably an existing bug which simply had it's signature changed. The IProtocol::DestroySubtree method was previously known as, based on the call stack, PContentParent::DestroySubtree. The signature may have also looked different though, so I'm not 100% confident.

Based on the causes you mention earlier, I'm guessing this could be caused by some odd lifecycle behaviour. Looking at some of the crash sites jld linked:

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #1)

It's hard to tell what's going on. Some of them are reading from a null pointer here,

Only way that this should be nullptr would be if a managee was in the mFooManagee list but has had its ActorLifecycleProxy destroyed already. Theoretically that shouldn't happen(tm).

I suppose we could null-check each of the proxies as we read them? No clue how they'd be null here though, even with a fairly poorly behaving actor.

some of them are jumping to a bad address after ActorDestroy called from here has set up the frame pointer,

Hmm, might be a UAF which leaves a weird VTable pointer laying around?

and one of them is reading unmapped memory for IProtocol::mLinkStatus here (note unaligned address).

Wild. I suppose the value inside of the proxy has been clobbered after it was freed or something.

I wonder if there's a race condition with one of ContentParent's managees being destroyed on the wrong thread, or something like that.

That would be pretty odd. We do a RELEASE_ASSERT to check the worker thread matches what we expect when sending a message like __delete__. Looking a bit though I suppose we don't currently release assert the "channel is shut down" status when ~IProtocol is run, so perhaps that's what's happening?

All of these smell strongly of a UAF, which is especially interesting because we have a kungFuDeathGrip on ContentParent throughout the call into OnChannelError here: https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/dom/ipc/ContentParent.cpp#1552. Perhaps we have a double-Release happening somewhere?

Flags: needinfo?(nika)

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 5

•

5 years ago

At one point in the IPC meeting we were wondering if this might just be a few people with bad RAM, but there are some more crashes:

bp-fbb58f48-b9ef-4e55-b7c2-bf1290190911, where stack unwinding failed despite this being one of our builds that crashed in our code.
bp-f68160c5-16d9-4d00-9fcb-d5e170190912, in PHttpBackgroundChannelParent::Send__delete__
bp-1fe014be-c125-4e99-9fae-397350190913, in PBackgroundParent::OnChannelError
bp-1c38047a-7ec8-48db-8dc6-faf490190913, also in PBackgroundParent::OnChannelError
bp-cf45dedd-3585-4172-baaa-817990190914, in PBackgroundIDBRequestChild::OnMessageReceived

Extending the search to the past two weeks, there are a couple more related to PBackground, as well as bp-fe080a28-c838-45e4-999b-488230190908 and bp-1f27fbc5-4b3f-4d18-9e6d-2c09a0190908 in PBrowserChild::Send__delete__ from BrowserChild::DelayedDeleteRunnable.

Julien Cristau [:jcristau]

Updated

•

5 years ago

status-firefox69: affected → wontfix

status-firefox70: ? → affected

status-firefox71: --- → ?

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 6

•

5 years ago

I hate to mark a likely UAF as P3, but there just isn't actionable information here.

Priority: -- → P3

Andrew McCreight [:mccr8]

Updated

•

5 years ago

Keywords: stalled

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

5 years ago

status-firefox70: affected → wontfix

status-firefox71: ? → fix-optional

BugBot [:suhaib / :marco/ :calixte]

Updated

•

3 years ago

Has Regression Range: --- → yes

Ryan VanderMeulen [:RyanVM]

Updated

•

3 years ago

Severity: critical → S3

status-firefox71: fix-optional → wontfix

BugBot [:suhaib / :marco/ :calixte]

Comment 7

•

3 years ago

The severity field for this bug is set to S3. However, the bug is flagged with the sec-high keyword.
:jld, could you consider increasing the severity of this security bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jld)

Andrew McCreight [:mccr8]

Comment 8

•

3 years ago

stalled.

Flags: needinfo?(jld)

Tom Ritter [:tjr]

Comment 9

•

1 year ago

We are reviewing and closing unactionable stalled bugs.

Status: NEW → RESOLVED

Closed: 1 year ago

Resolution: --- → INCOMPLETE

BugBot (nomail) [:suhaib / :marco/ :calixte]

Comment 10

•

1 year ago

Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit BugBot documentation.

Keywords: stalled

Daniel Veditz [:dveditz]

Updated

•

1 year ago

Group: dom-core-security

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Crash in [@ mozilla::ipc::IProtocol::DestroySubtree]

Categories

(Core :: IPC, defect, P3)

Tracking

()

People

(Reporter: philipp, Assigned: jld)

References

(Regression)

Details

(4 keywords)

Crash Data

Security

(public)

User Story

Description

Updated

Comment 1

Comment 2

Updated

Updated

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Updated

Updated

Updated

Updated

Comment 7

Comment 8

Comment 9

Comment 10

Updated