Closed Bug 1518524 Opened 2 years ago Closed 1 year ago

Crash in SkBitmapCache::PrivateDeleteRec

Categories

(Core :: Web Painting, defect)

x86
Windows 10
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
firefox-esr60 --- unaffected
firefox64 --- wontfix
firefox65 --- wontfix
firefox66 + fixed
firefox67 + fixed

People

(Reporter: marcia, Assigned: mattwoodrow)

Details

(4 keywords, Whiteboard: [adv-main66+])

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is
report bp-63627a70-820e-4d66-a4dd-277350190106.


Seen while looking at release crash stats. Filing as sec sensitive due to a few possible UAF signatures: https://bit.ly/2VAqnOZ. Could not find an existing bug that covers this.

A few URLs:

Top 10 frames of crashing thread:

0 xul.dll SkBitmapCache::PrivateDeleteRec intl/icu/source/common/brkeng.cpp:122
1 xul.dll mozilla::gfx::UserData::Add gfx/2d/UserData.h:34
2 xul.dll nsDisplayList::PaintRoot layout/painting/nsDisplayList.cpp:2879
3 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:3887
4 xul.dll mozilla::PresShell::Paint layout/base/PresShell.cpp:6368
5 xul.dll nsViewManager::ProcessPendingUpdatesForView view/nsViewManager.cpp:412
6 xul.dll void nsRefreshDriver::Tick layout/base/nsRefreshDriver.cpp:2044
7 xul.dll void mozilla::RefreshDriverTimer::TickRefreshDrivers layout/base/nsRefreshDriver.cpp:300
8 xul.dll void mozilla::RefreshDriverTimer::Tick layout/base/nsRefreshDriver.cpp:318
9 xul.dll void mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver layout/base/nsRefreshDriver.cpp:672

=============================================================

(er, sorry for the gibberish comment 1; I was using "preview" to test out bugzilla markdown behavior w/ soccoro boilerplate text, in light of bug 1518576, & then later forgot that I'd entered some text when I adjusted the component & hit 'submit' :) )

Group: layout-core-security → gfx-core-security

Looks like a regression in 63.x (oddity, the stacks say the crash is in Skia--believable in a painting stack--but the source link is to intl/icu).

There's at least one UAF signature in the recent crashes, but the majority of them appear to be null derefs.

Flags: needinfo?(matt.woodrow)
Keywords: csectype-uaf
Keywords: sec-high

Matt, can you take a look at this or let me know who could?

Assignee: nobody → matt.woodrow

SkBitmapCache::PrivateDeleteRec, _deleteEngine (the intl/icu function from the source link) and LayerManager::LayerUserDataDestroy (the actual crashing function) all contain exactly the same code (just a single delete call), so I think this is just the compiler sharing implementations of identical functions.

Can we get all 3 of those to be ignored by Socorro, such that we classify mozilla::gfx::UserData::Add as the crashing function?

It'd be interesting to see if that changes the regression date or not.

As for the actual crash, it appears that LayerManager::mUserData contains corrupt data, I'm not sure how that would happen.

Flags: needinfo?(matt.woodrow) → needinfo?(dveditz)
Crash Signature: [@ SkBitmapCache::PrivateDeleteRec] → [@ SkBitmapCache::PrivateDeleteRec] [@ LayerManager::LayerUserDataDestroy] [@ _deleteEngine]

There don't seem to be any crashes for _deleteEngine or LayerManager::LayerUserDataDestroy. If we were to lump those into mozilla::gfx::UserData::Add there's a long tail of pre-existing crashes (particularly Thunderbird 45!) but we still see an increase due to something new starting with Firefox 63:
https://bugzilla.mozilla.org/metricsgraphics/socorro-lens.html?s=mozilla::gfx::UserData::Add

Flags: needinfo?(dveditz)

Matt - is this bug actionable?

Flags: needinfo?(matt.woodrow)

I debugged the actual minidump, and it looks like the UserData itself is fine, but we have a deleted pointer to a FrameLayerBuilder, and attempting to call the virtual destructor is crashing.

nsDisplayList::PaintRoot grabs a raw pointer to FrameLayerBuilder using oldBuilder = layerManager->GetLayerBuilder().

We then call BuildLayers, which unconditionally creates a new FrameLayerBuilder for layerManager, and FrameLayerBuilder::Init writes to the UserData. We can only store a single UserData of each type, so that will clear any existing one, and deletes it.

At this point, 'oldBuilder' within PaintRoot must be a dangling pointer.

Then at the end of PaintRoot, we try to set the UserData back to oldBuilder, which will write the dangling pointer, and set us up to crash next time.

The only reason that this doesn't explode 100% of the time, is that we initially start with no FrameLayerBuilder, so the 'dangling' pointer is actually nullptr, and we're just clearing on exit.

Somehow we're getting in to the state where oldBuilder is actually non-null, and then the code is entirely bogus. I haven't figured out how that could actually happen though, some sort of weird recursion into PaintRoot?

Flags: needinfo?(matt.woodrow)

Comment on attachment 9048081 [details]
Bug 1518524 - Don't try to restore the previous FrameLayerBuilder after a transaction since it should always be nullptr anyway. r?miko

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Not easily, it requires an extra step to reproduce that I haven't figured out yet.

Current commit message somewhat explains the first bit, but will remove that for landing.

  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which older supported branches are affected by this flaw?: All
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?:
  • How likely is this patch to cause regressions; how much testing does it need?: It should be fairly simple and low risk.
Attachment #9048081 - Flags: sec-approval?

Comment on attachment 9048081 [details]
Bug 1518524 - Don't try to restore the previous FrameLayerBuilder after a transaction since it should always be nullptr anyway. r?miko

Sec-approval+ for trunk. We'll want Beta and ESR60 patches nominated ASAP as well.

Attachment #9048081 - Flags: sec-approval? → sec-approval+

As, I see ESR60 is unaffected. Never mind about that then. :-)

Matt, can you request uplift please? I'd like this to make it into the RC build.

Flags: needinfo?(matt.woodrow)

Comment on attachment 9048081 [details]
Bug 1518524 - Don't try to restore the previous FrameLayerBuilder after a transaction since it should always be nullptr anyway. r?miko

Beta/Release Uplift Approval Request

  • Feature/Bug causing the regression: Unknown
  • User impact if declined: Fairly rare crash, but looks to be UAF.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce: No known way to reproduce.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Very simple patch, should be a no-op for all normal cases, except the broken one.
  • String changes made/needed: None
Flags: needinfo?(matt.woodrow)
Attachment #9048081 - Flags: approval-mozilla-beta?

Comment on attachment 9048081 [details]
Bug 1518524 - Don't try to restore the previous FrameLayerBuilder after a transaction since it should always be nullptr anyway. r?miko

Fix for UAF, ok for uplift for the 66 RC.

Attachment #9048081 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9048081 [details]
Bug 1518524 - Don't try to restore the previous FrameLayerBuilder after a transaction since it should always be nullptr anyway. r?miko

Moving Liz' approval to m-r since that's where 66 is now.

Attachment #9048081 - Flags: approval-mozilla-beta+ → approval-mozilla-release+
Whiteboard: [adv-main66+]
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Group: gfx-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.