[Mac] Crash in <name omitted> | nsPresContext::ThemeChangedInternal

VERIFIED FIXED in Firefox 61

Status

()

defect
P2
critical
VERIFIED FIXED
Last year
Last year

People

(Reporter: marcia, Assigned: bkelly)

Tracking

({crash, regression, topcrash})

Trunk
mozilla61
Unspecified
macOS
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox59 unaffected, firefox60 unaffected, firefox61blocking verified)

Details

(crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is
report bp-a84c89a8-7c78-43da-8f3b-a46920180404.
=============================================================

Seen while looking at nightly crash stats - crashes started using 20180404100127: https://bit.ly/2uMkqFp. 28 crashes/19 installs

Possible regression range based on Build ID: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4a3275936ddf871103b53e00608e2b8d5aee7e69&tochange=ff0efa4132f0efd78af0910762aec7dcc1a8de66

Perhaps Bug 1450189? ni on :mattwoodrow

Top 10 frames of crashing thread:

0 XUL <name omitted> dom/base/nsGlobalWindowOuter.h:438
1 XUL nsPresContext::ThemeChangedInternal layout/base/nsPresContext.cpp:1790
2 XUL mozilla::detail::RunnableMethodImpl<nsPresContext*, void  xpcom/threads/nsThreadUtils.h:1164
3 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1096
4 XUL NS_ProcessPendingEvents xpcom/threads/nsThreadUtils.cpp:461
5 XUL nsBaseAppShell::NativeEventCallback widget/nsBaseAppShell.cpp:98
6 XUL nsAppShell::ProcessGeckoEvents widget/cocoa/nsAppShell.mm:436
7 CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ 
8 CoreFoundation __CFRunLoopDoSource0 
9 CoreFoundation __CFRunLoopDoSources0 

=============================================================
Flags: needinfo?(matt.woodrow)
I've had this twice in two days now. In both cases, I'd had treeherder as the selected tab, my display had locked and then gone to sleep. Coming back to it, and attempting to select a job on treeherder, crashed the tab.

In both cases, trying to switch to another tab after restarting the crashed one, also crashed the Firefox main process.
I can get this pretty consistently when attaching or detaching from my external monitor.
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #2)
> I can get this pretty consistently when attaching or detaching from my
> external monitor.

Now you mention it, at least one of those cases was definitely after attaching my external monitor.
(In reply to Mark Banner (:standard8) from comment #3)
> (In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and
> needinfos) from comment #2)
> > I can get this pretty consistently when attaching or detaching from my
> > external monitor.
> 
> Now you mention it, at least one of those cases was definitely after
> attaching my external monitor.

I can reproduce this reliably too after a session has been running for a while, but frustratingly I can't seem to reproduce with a freshly run Firefox (either my own or a copy started by mozregression)
This is poking at a window, so it could be a regression from bug 1450266. That feels more likely to me than a display list change. Ben, any thoughts?
Flags: needinfo?(bkelly)
Maybe nsPresContext::ThemeChangedInternal() needs a null check on mDocument? Though it looks like it is a method on Window being called so maybe the window is null...
This is the top crash for the April 4 OSX Nightly, with 47% (!!!) of all crashes.
Keywords: topcrash
I'll take a look first thing in the morning.  Unfortunately I don't think bug 1450266 can be safely backed out without also backing out more bugs.
(In reply to Andrew McCreight [:mccr8] from comment #6)
> Maybe nsPresContext::ThemeChangedInternal() needs a null check on mDocument?
> Though it looks like it is a method on Window being called so maybe the
> window is null...

Agreed, looks to me like mDocument->GetWindow() is returning nullptr.
Flags: needinfo?(matt.woodrow)
I'll see if I can add the nullptr checks.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(bkelly)
Normally I would wait for try to be green, but since this is critical and the patch is small I'm going to ask for review now.  If try is green and if it gets a clean r+, please feel free to land for me before I'm online in the morning.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=792a811e0f4fd1c7052161ea22cf7b9ba209772d
Attachment #8965533 - Flags: review?(bugs)
I forgot to mention the nullptr check has a precedent in nsPresContext here:

https://searchfox.org/mozilla-central/rev/d0413b711da4dac72b237d0895daba2537f1514c/layout/base/nsPresContext.cpp#1928

The comment 12 patch adds a nullptr check in the other two places nsPresContext.cpp does mDocument->GetWindow().
Can you see if the try build in comment 12 (once it builds) fixes your STR with your external monitor?
Flags: needinfo?(mconley)
Why has this only become a problem recently?
Try build mac package is here if anyone who can reproduce wants to test:

https://queue.taskcluster.net/v1/task/XBFx2PPnTdSHaLftoXL9Og/runs/0/artifacts/public/build/target.dmg
Comment on attachment 8965533 [details] [diff] [review]
Check for nullptr mDocument->GetWindow() in a few more places in nsPresContext. r=smaug

This is fine, but I wonder why we are dealing with a prescontext which has document without window.
Attachment #8965533 - Flags: review?(bugs) → review+
> This is fine, but I wonder why we are dealing with a prescontext which has
> document without window.

The crashing method is called from a runnables that doesn't do anything that would prevent a window from being cleaned up.  It seems like this always could have happened based on timing.
Pushed by btara@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f9d13ad74972
Check for nullptr mDocument->GetWindow() in a few more places in nsPresContext. r=smaug
Keywords: checkin-needed
(In reply to Ben Kelly [Part-time, not reviewing][:bkelly] from comment #17)
> Try build mac package is here if anyone who can reproduce wants to test:
> 
> https://queue.taskcluster.net/v1/task/XBFx2PPnTdSHaLftoXL9Og/runs/0/
> artifacts/public/build/target.dmg

I'm trying it now, will report back if I see the crash again in the next few hours (if I don't report back, assume it's fixed). :)
(In reply to William Lachance (:wlach) (use needinfo!) from comment #21)
> (In reply to Ben Kelly [Part-time, not reviewing][:bkelly] from comment #17)
> > Try build mac package is here if anyone who can reproduce wants to test:
> > 
> > https://queue.taskcluster.net/v1/task/XBFx2PPnTdSHaLftoXL9Og/runs/0/
> > artifacts/public/build/target.dmg
> 
> I'm trying it now, will report back if I see the crash again in the next few
> hours (if I don't report back, assume it's fixed). :)

Ok, been using it for the past 3 and some hours with many disconnections/reconnections to an external monitor. Pretty happy saying the problem is fixed at this point.
Flags: needinfo?(mconley)
https://hg.mozilla.org/mozilla-central/rev/f9d13ad74972
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
No more crashes since the patch landed.
Status: RESOLVED → VERIFIED
Priority: -- → P2
You need to log in before you can comment on or make changes to this bug.