Closed Bug 1776653 Opened 1 year ago Closed 11 months ago

[macOS 13] Startup Crash in [@ TDataReference::DucRefCount]

Categories

(Core :: Graphics: Text, defect, P2)

Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
106 Branch
Tracking Status
firefox-esr102 --- wontfix
firefox105 --- fixed
firefox106 --- fixed

People

(Reporter: aryx, Assigned: bradwerth)

References

(Blocks 1 open bug)

Details

(Keywords: crash, csectype-uaf, sec-moderate, Whiteboard: [post-critsmash-triage])

Crash Data

Attachments

(2 files, 1 obsolete file)

For the upcoming macOS 13. 7 crashes with Firefox 102 betas and RCs in the last week.

Crash report: https://crash-stats.mozilla.org/report/index/11154809-690d-444f-83b7-c91eb0220626

Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS

Top 10 frames of crashing thread:

0 libFontParser.dylib TDataReference::DucRefCount 
1 None @0xbe628001b32e6288 
2 libFontParser.dylib TDataReference::~TDataReference 
3 None @0x38270001a9dcb3c8 
4 CoreFoundation __CFDataDeallocate 
5 None @0xc6468001b32e71f0 
6 libFontParser.dylib TTableCacheNode::operator= 
7 None @0x95250001b337c798 
8 libFontParser.dylib TTableCache::AddTable 
9 None @0x8a7580011589df18 
Summary: [macOS 13] Crash in [@ TDataReference::DucRefCount] → [macOS 13] Startup Crash in [@ TDataReference::DucRefCount]
Group: core-security
Group: core-security → gfx-core-security

The stacks all seem to be font loading related. Bug 1680458 is a possible additional sighting of this issue.

I suspect this is probably an Apple bug of some kind. The fonts are not special in anyway. e.g. "Bitstream Vera Sans"

Also all the crashes have MacMemoryPressureNormalTimes and have a crash time that's around 10 seconds later

It seems like it might be able to reproduce this by triggering a memory pressure event during Firefox startup

Assignee: nobody → bwerth
Priority: -- → P2
No longer blocks: gfx-triage

I've been launching on my M1 Max macOS 12 with forced memory pressure and I haven't had a crash yet.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

Also all the crashes have MacMemoryPressureNormalTimes and have a crash time that's around 10 seconds later

I'm not sure this is a signifier. MacMemoryPressureNormalTime is written when the last "normal" memory condition was noted, but for all the crash reports I can see, this value is equal to the StartupTime, indicating that we're just seeing the initial normal value. So yes this is a startup crash, but I don't see evidence that it is related to memory pressure.

Since all the crashes are within libFontParser.dylib, I think Bug 1583585 may be a related crash. That bug has some commentary about how we may be able to force this crash by using an intentionally-corrupted font loaded by Firefox.

See Also: → 1583585

(In reply to Brad Werth [:bradwerth] from comment #6)

I'm not sure this is a signifier. MacMemoryPressureNormalTime is written when the last "normal" memory condition was noted, but for all the crash reports I can see, this value is equal to the StartupTime, indicating that we're just seeing the initial normal value. So yes this is a startup crash, but I don't see evidence that it is related to memory pressure.

Yeah, you're right MacMemoryPressureNormal means no memory pressure.

New theory: this is a re-entrancy problem within CoreGraphics font handling. In a typical crash report the Font Loader thread is processing an async font load through MacFontInfo::LoadFontFamilyData which calls CTFontCreateWithFontDescriptor while the main thread is loading character maps through LoadCmapsRunnable::Run which calls CGFontFinderCreateFontWithName. I don't think the macOS font cache can handle it. I suspect this is manifesting as a startup crash because at startup we trigger some async font loads and we also load cmaps for fonts that have finished loading(?) Not sure about the details, but these threads are doing very similar things at similar times.

(In reply to Brad Werth [:bradwerth] from comment #10)

New theory: this is a re-entrancy problem within CoreGraphics font handling.

Though in these crashes, one of the threads is waiting on a lock. This crash report is one where the main thread crashes while the Font Loader thread is waiting.

Seems like we might be able to lock the Font Loader activities with respect to the main thread if the FontInfoData carried the lock from the gfxPlatformFontList and held that lock during LoadFontFamilyData. Not sure that's a good idea, but I can probably build a patch that does it.

Attachment #9290534 - Attachment description: WIP: Bug 1776653 Part 1: Add more crash annotations when font creation happens. → Bug 1776653 Part 1: Add more crash annotations when font creation happens.

This is a very broad lock on all the activity of LoadFontFamilyData. If we
wanted to narrow it for performanc reasons, we may only need to protect
the calls to CTFontCreateWithFontDescriptor, but those happen within a
loop so the overhead of holding and releasing the lock repeatedly could
also be a performance hit.

Depends on D155022

This feels to me like it's similar to bug 1777889 (although this signature existed at a low rate before that one showed up), and might have a common underlying cause in some kind of thread-safety bug in Apple's font libraries. According to bug 1777889 comment 20, that issue was recently fixed; maybe the same (or a related) fix has helped here as well.

Attachment #9290697 - Attachment is obsolete: true
Keywords: leave-open

Comment on attachment 9290534 [details]
Bug 1776653 Part 1: Add more crash annotations when font creation happens.

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Extremely unlikely. This is a crash report hygiene patch. This patch notes that Apple's CGFontCreate methods can crash under certain circumstances, but there's no clear way to get Apple code to call into freed memory.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: all
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: Easy to create, but not critical to do so. We'll know what to look for with crashes in other branches.
  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely to cause regressions. It merely adds existing crash report annotations in some new circumstances.
  • Is Android affected?: No
Attachment #9290534 - Flags: sec-approval?

Will re-discuss in triage as we consider whether or not to land the partial fix D155107.

Blocks: gfx-triage

We'll land D155107, pending confirmation that the lock doesn't stall startup too much.

Comment on attachment 9290534 [details]
Bug 1776653 Part 1: Add more crash annotations when font creation happens.

Approved to land and uplift if desired, but I think this needs review from a data peer...

Attachment #9290534 - Flags: sec-approval? → sec-approval+
Attachment #9290697 - Attachment is obsolete: false

Comment on attachment 9290697 [details]
Bug 1776653 Part 2: Prevent concurrent font creation between Font Loader and main thread on macOS.

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Very difficult since this is a startup crash before any specific web content is loaded. The failure is also within OS-provided library, so it's hard to know how attackers could take advantage of the UAF within those OS-controlled structures.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which older supported branches are affected by this flaw?: all
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: Not hard to create, nor risky, but probably not necessary since this is a very low-volume startup crash. This patch is a workaround for an Apple issue that appears to be fixed in macOS 13.
  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely to cause regression. Adds a lock to a method that previously didn't use it. The lock is already broadly used, so this is just one more case. No possibility that it would lead to a deadlock.
  • Is Android affected?: No
Attachment #9290697 - Flags: sec-approval?

The following patch is waiting for review from inactive reviewers:

ID Title Author Reviewer Status
D155107 Bug 1776653 Part 2: Prevent concurrent font creation between Font Loader and main thread on macOS. bradwerth jrmuizel: Inactive
jfkthame: Inactive

:bradwerth, could you please find another reviewer?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bwerth)
Flags: needinfo?(bwerth)

Comment on attachment 9290697 [details]
Bug 1776653 Part 2: Prevent concurrent font creation between Font Loader and main thread on macOS.

approved to land

Attachment #9290697 - Flags: sec-approval? → sec-approval+

It's very likely that macOS 13 is going to be released while Fx105 is on the release channel. Should we uplift these patches to Beta? They graft cleanly.

Flags: needinfo?(bwerth)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #25)

It's very likely that macOS 13 is going to be released while Fx105 is on the release channel. Should we uplift these patches to Beta? They graft cleanly.

I will nominate them for uplift.

Flags: needinfo?(bwerth)

Comment on attachment 9290534 [details]
Bug 1776653 Part 1: Add more crash annotations when font creation happens.

Beta/Release Uplift Approval Request

  • User impact if declined: macOS browser may crash at startup in rare circumstances.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This adds another lock on an existing synchronization primitive. There is really no possibility of introducing deadlock. There is no measurable performance impact to this additional lock.
  • String changes made/needed:
  • Is Android affected?: No
Attachment #9290534 - Flags: approval-mozilla-beta?
Attachment #9290697 - Flags: approval-mozilla-beta?

Comment on attachment 9290534 [details]
Bug 1776653 Part 1: Add more crash annotations when font creation happens.

Approved for 105.0b9.

Attachment #9290534 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9290697 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9290534 [details]
Bug 1776653 Part 1: Add more crash annotations when font creation happens.

Landed for 105.0b9. Removing the approval flag to get it off the needs-uplift radar.
https://hg.mozilla.org/releases/mozilla-beta/rev/4faa28565f45
https://hg.mozilla.org/releases/mozilla-beta/rev/82def17897c0

Attachment #9290534 - Flags: approval-mozilla-beta+
Attachment #9290697 - Flags: approval-mozilla-beta+
Severity: S2 → S3

The severity field for this bug is set to S3. However, the bug is flagged with the sec-high keyword.
:bradwerth, could you consider increasing the severity of this security bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bwerth)

The best fix would be for this to change from sec-high to something lower, because this isn't practically exploitable. I'm not sure the protocol to do that.

Flags: needinfo?(bwerth)
Keywords: leave-open
Keywords: sec-highsec-moderate

This is fixed.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Group: gfx-core-security → core-security-release
Target Milestone: --- → 106 Branch
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]

(In reply to Brad Werth [:bradwerth] from comment #32)

This is fixed.

Nope, not fixed. We're still vulnerable to concurrent thread access in gfxPlatformFontList::InitializeFamily. I'll add a new patch.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

One of the crash reports shows a crash in the RegisterFonts thread when the Main Thread is running JS code. There is no crash annotation of the FontName, which means that none of the code is directly calling one of the CGFontCreate methods. Really troubling! I can't think of a way to protect the macOS internal font cache from arbitrary JS code.

Any thread that creates a CGFont can potentially collide in the internal
font cache. This closes off another case where the Main thread might be
creating a CGFont while the Font Loader is also doing so.

No longer blocks: gfx-triage
Status: REOPENED → NEW
Target Milestone: 106 Branch → ---
Attachment #9298037 - Attachment is obsolete: true

Part 3 doesn't appear to be necessary to solve this Bug. I think this is fixed (primarily due to fixes within macOS, but also due to the stopgap measures that landed in Parts 1 and 2).

Status: NEW → RESOLVED
Closed: 1 year ago11 months ago
Resolution: --- → FIXED
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.