Closed Bug 1760846 Opened 4 years ago Closed 2 years ago

Crash in [@ hb_lazy_loader_t<T>::get_stored]

Product/Component: Core :: Layout: Text and Fonts
Type: defect
Hardware: x86
OS: All
Status: RESOLVED INCOMPLETE
Reporter: jesup
Assignee: Unassigned
Keywords: (4 keywords)

Crasher with random real addresses; at least one installation crashed with a UAF signature.
Not a new bug. hb seems to do a bunch of multithreaded work; I see cmpxchg calls.

Crash report: https://crash-stats.mozilla.org/report/index/8dcc0eae-54e6-4c4f-bc11-0432f0220316

Reason: EXCEPTION_ILLEGAL_INSTRUCTION

Top 10 frames of crashing thread:

0 xul.dll hb_lazy_loader_t<OT::GPOS_accelerator_t, hb_face_lazy_loader_t<OT::GPOS_accelerator_t, 23>, hb_face_t, 23, OT::GPOS_accelerator_t>::get_stored const gfx/harfbuzz/src/hb-machinery.hh:216
1 xul.dll gfxFont::CheckForFeaturesInvolvingSpace gfx/thebes/gfxFont.cpp:1245
2 xul.dll gfxFont::SplitAndInitTextRun<unsigned char> gfx/thebes/gfxFont.cpp:3143
3 xul.dll gfxFontGroup::MakeTextRun gfx/thebes/gfxTextRun.cpp:2466
4 xul.dll BuildTextRunsScanner::FlushFrames layout/generic/nsTextFrame.cpp:1750
5 xul.dll nsTextFrame::EnsureTextRun layout/generic/nsTextFrame.cpp:3096
6 xul.dll nsTextFrame::AddInlinePrefISize layout/generic/nsTextFrame.cpp:9004
7 xul.dll nsBlockFrame::GetPrefISize layout/generic/nsBlockFrame.cpp:926
8 xul.dll nsIFrame::RefreshSizeCache layout/generic/nsIFrame.cpp:10504
9 xul.dll nsIFrame::GetXULPrefSize layout/generic/nsIFrame.cpp:10576

In addition to Win7, I see crashes with this signature on Win10, Linux and Android, so I think we can assume it's platform-independent.

Shouldn't be threading-related AFAIK, because although harfbuzz is intended to be usable on multiple threads, we (currently -- this is expected to change shortly!) only use it from the main thread.
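For context on why compare-and-exchange instructions show up in this code path even when everything runs on one thread: a thread-safe lazy loader typically publishes its instance with an atomic compare-exchange on every first use. The following is a minimal sketch of that general pattern, not HarfBuzz's actual implementation; `LazyLoader`, `Accelerator` and the body of `get_stored` are illustrative names and code written for this example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an expensive per-face table accelerator.
struct Accelerator {
  int value = 42;
};

// Illustrative lazy loader (an assumption for this sketch, not HarfBuzz code).
template <typename T>
class LazyLoader {
 public:
  LazyLoader() = default;
  LazyLoader(const LazyLoader&) = delete;
  LazyLoader& operator=(const LazyLoader&) = delete;
  ~LazyLoader() { delete instance_.load(std::memory_order_acquire); }

  // Returns the shared instance, creating it on first use.
  T* get_stored() {
    T* p = instance_.load(std::memory_order_acquire);
    if (p) return p;  // Fast path: already created.

    T* fresh = new T();
    T* expected = nullptr;
    // Publish our instance only if nobody else has published one yet.
    // This is the step that compiles to a cmpxchg-style instruction on x86.
    if (instance_.compare_exchange_strong(expected, fresh,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire)) {
      return fresh;
    }
    // Another thread won the race: discard ours and use theirs.
    delete fresh;
    return expected;
  }

 private:
  std::atomic<T*> instance_{nullptr};
};
```

In a sketch like this the compare-exchange executes on first use regardless of how many threads exist, so seeing it in the disassembly does not by itself indicate concurrent access.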

OS: Windows 7 → All
Group: core-security → gfx-core-security
Blocks: gfx-triage

I think this might be caused by I/O errors on an mmap'd file. Jonathan, can you look more closely and see if that makes sense?

Flags: needinfo?(jfkthame)

That does look quite plausible, given the kind of errors I'm seeing in the crash reports. If so, I'm not sure there's anything we can realistically do to mitigate it.
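To illustrate why mmap'd data is awkward here: when a page of a mapped file cannot be read, the failure surfaces as a fault at whichever instruction first touches that page (SIGBUS on Linux, an in-page error exception on Windows), not as an error return the caller can check. One hypothetical alternative, not something proposed in this bug, is to read the file into heap memory up front so that I/O failures are reported at a single, checkable point. A minimal sketch, where `ReadWholeFile` is a made-up helper and the extra memory cost of holding every font in RAM is ignored:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file into `out`.
// Returns false if the file cannot be opened or any part of the read fails,
// so the caller sees the I/O error here instead of as a later page fault.
bool ReadWholeFile(const std::string& path, std::vector<char>* out) {
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) return false;

  const std::streamoff size = in.tellg();  // Opened at end, so this is the length.
  if (size < 0) return false;

  std::vector<char> data(static_cast<std::size_t>(size));
  in.seekg(0, std::ios::beg);
  if (size > 0 && !in.read(data.data(), size)) return false;

  out->swap(data);
  return true;
}
```

Whether the trade-off is worthwhile is a separate question: this moves the failure to load time but does nothing about failing RAM, which the crash data also seems to point at.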

Flags: needinfo?(jfkthame)

Assigning to get sec bugs owned. Feel free to hand off to someone else if needed.

Assignee: nobody → lsalzman
Assignee: lsalzman → jfkthame
Component: Graphics: Text → Layout: Text and Fonts
No longer blocks: gfx-triage

I'm inclined to think Jeff is probably right in comment 2: this could be a result of i/o (or similar) failures when harfbuzz tries to access mmap'd font files.

The crash reports for the past 6 months show many different types of error, but it's noticeable that quite a few of them look suspiciously like they could reflect hardware failures: STATUS_DEVICE_DATA_ERROR, STATUS_IO_DEVICE_ERROR, KERN_MEMORY_ERROR, STATUS_DEVICE_HARDWARE_ERROR, STATUS_INVALID_DEVICE_REQUEST, ...

In addition, it looks like the reports are heavily biased towards older/lower-end machines, many of them running older OS versions; e.g. only half the reports are from 64-bit machines. It seems plausible that older machines would be more likely to start showing sporadic access failures (whether memory or disk).
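The triage reasoning above amounts to checking each report's crash reason against a short list of hardware-suspect codes. A small sketch of that check, using only the codes named in this comment; `LooksLikeHardwareFailure` is a hypothetical helper, and matching by substring is an assumption about how the reason field is formatted:

```cpp
#include <cassert>
#include <string>

// Hypothetical triage helper: true if the crash reason mentions one of the
// error codes listed in this comment as suggestive of hardware failure.
bool LooksLikeHardwareFailure(const std::string& reason) {
  static const char* const kSuspects[] = {
      "STATUS_DEVICE_DATA_ERROR",
      "STATUS_IO_DEVICE_ERROR",
      "KERN_MEMORY_ERROR",
      "STATUS_DEVICE_HARDWARE_ERROR",
      "STATUS_INVALID_DEVICE_REQUEST",
  };
  for (const char* suspect : kSuspects) {
    // Substring match (an assumption), since a reason may carry extra text.
    if (reason.find(suspect) != std::string::npos) return true;
  }
  return false;
}
```

Note that the report linked in the description (EXCEPTION_ILLEGAL_INSTRUCTION) would not match this list, which is consistent with the observation that the reports show many different types of error.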

As such, it's not clear to me whether there's anything actionable here. Marking as stalled for now. If we get any new information that suggests a specific cause, or that there's a reproducible issue here, we can reconsider.

Assignee: jfkthame → nobody
Keywords: stalled

(In reply to Jonathan Kew [:jfkthame] from comment #5)

it's noticeable that quite a few of them look suspiciously like they could reflect hardware failures: STATUS_DEVICE_DATA_ERROR, STATUS_IO_DEVICE_ERROR, KERN_MEMORY_ERROR, STATUS_DEVICE_HARDWARE_ERROR, STATUS_INVALID_DEVICE_REQUEST, ...

[...] It seems plausible that older machines would be more likely to start showing sporadic access failures (whether memory or disk).

As such, it's not clear to me whether there's anything actionable here. Marking as stalled for now. If we get any new information that suggests a specific cause, or that there's a reproducible issue here, we can reconsider.

--> Downgrading to S3, per above (and given that crash volume remains extremely low)

Severity: S2 → S3

The severity field for this bug is set to S3. However, the bug is flagged with the sec-high keyword.
:jfkthame, could you consider increasing the severity of this security bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jfkthame)

No, let's stick with Daniel's downgrading to S3 from comment 6.

Flags: needinfo?(jfkthame)
Group: gfx-core-security → layout-core-security

Relatively low crash rate, biased towards Win7 and Android, without actionable steps.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE

Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit BugBot documentation.

Keywords: stalled
Group: layout-core-security