Closed Bug 1495306 Opened Last year Closed Last year

Crash in nsThread::IdleDispatch

Categories

(Core :: Graphics: Text, defect, critical)

Unspecified
Android
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox-esr60 --- unaffected
firefox62 --- unaffected
firefox63 --- unaffected
firefox64 --- fixed

People

(Reporter: jseward, Assigned: jfkthame)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is
report bp-3ce0e3b9-5076-4a9f-9569-2b8ea0180927.
=============================================================

This is the #2 topcrash for the Android nightly 20180927100037, with
15 crashes from 6 different installations.  I looked at 3 of the
crashes and all 3 involve gfxFontGroup::BuildFontList.

Top 10 frames of crashing thread:

0 libxul.so nsThread::IdleDispatch xpcom/threads/nsThread.cpp:960
1 libxul.so NS_IdleDispatchToThread xpcom/threads/nsThreadUtils.cpp:310
2 libxul.so gfxPlatformFontList::FindAndAddFamilies gfx/thebes/gfxPlatformFontList.cpp:399
3 libxul.so gfxFontGroup::BuildFontList gfx/thebes/gfxTextRun.cpp:1878
4 libxul.so gfxFontGroup::gfxFontGroup gfx/thebes/gfxTextRun.cpp:1814
5 libxul.so nsFontMetrics::nsFontMetrics gfx/src/nsFontMetrics.cpp:143
6 libxul.so nsFontCache::GetMetricsFor gfx/src/nsDeviceContext.cpp:144
7 libxul.so nsLayoutUtils::GetMetricsFor layout/base/nsLayoutUtils.cpp:10014
8 libxul.so Gecko_GetFontMetrics layout/style/ServoBindings.cpp:2551
9 libxul.so <style::gecko::wrapper::GeckoFontMetricsProvider as style::font_metrics::FontMetricsProvider>::query servo/components/style/gecko/wrapper.rs:1047

=============================================================
Flags: needinfo?(jfkthame)
Ugh, this looks like code that wasn't written with a view to running off-main-thread. The main code of FindAndAddFamilies should be OK provided the main thread is paused, I believe, which should be the case during style traversal; but what must be happening here is that it's calling InitOtherFamilyNames, which potentially posts an InitOtherFamilyNamesRunnable to the current thread, and that looks like a recipe for trouble.

Perhaps the simplest fix would be to force InitOtherFamilyNames to post its runnable to the main (not current) thread. From a brief look at the code I think that ought to avoid the issue.
Flags: needinfo?(jfkthame)
Having said that, I am curious why this has suddenly started happening. I see a couple of macOS reports, too, with similar stacks.

Maybe the thread-related changes in bug 1479035 have affected this -- Kris, does that look possible?
Flags: needinfo?(kmaglione+bmo)
(In reply to Jonathan Kew (:jfkthame) from comment #2)
> Having said that, I am curious why this has suddenly started happening.

Because while previously it would have failed by enqueuing the runnable in an event queue which never processed any events, and therefore likely caused leaks and other undefined behavior, now threads which never process events from their event queues simply have no event queues.

> Maybe the thread-related changes in bug 1479035 have affected this -- Kris,
> does that look possible?

Yes.
Flags: needinfo?(kmaglione+bmo)
So does it make sense to do something like this? AFAICS this should avoid the issue.
Attachment #9013253 - Flags: review?(kmaglione+bmo)
Attachment #9013253 - Flags: review?(kmaglione+bmo) → review+
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/868e0b4673af
When InitOtherFamilyNames is called during stylo traversal, ensure it posts its runnable back to the main thread. r=kmag
https://hg.mozilla.org/mozilla-central/rev/868e0b4673af
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Assignee: nobody → jfkthame
You need to log in before you can comment on or make changes to this bug.