Closed Bug 1516423 Opened 6 years ago Closed 4 years ago

Crash in mozilla::widget::TaskbarPreview::MainWindowHook. Tray related?

Categories

(Thunderbird :: General, defect)

All
Windows
defect
Not set
critical

Tracking

(thunderbird_esr78+ wontfix)

VERIFIED FIXED
89 Branch
Tracking Status
thunderbird_esr78 + wontfix

People

(Reporter: wsmwk, Assigned: toshi)

References

Details

(4 keywords, Whiteboard: [tbird topcrash][regression:TB78.3.2])

Crash Data

Attachments

(2 files, 1 obsolete file)

I crashed on resume from sleep. Looking at crash-stats, this only seems to happen in the Windows world. bp-03c4a4f2-c5ad-4922-a546-408c30181226. 0 xul.dll mozilla::widget::TaskbarPreview::MainWindowHook widget/windows/TaskbarPreview.cpp:417 1 xul.dll mozilla::widget::WindowHook::CallbackData::Invoke widget/windows/WindowHook.cpp:124 2 xul.dll mozilla::widget::WindowHook::Notify widget/windows/WindowHook.cpp:111 3 xul.dll nsWindow::ExternalHandlerProcessMessage widget/windows/nsWindow.cpp:5106 4 xul.dll nsWindow::ProcessMessage widget/windows/nsWindow.cpp:5159 5 xul.dll nsWindow::WindowProcInternal widget/windows/nsWindow.cpp:5010 6 xul.dll CallWindowProcCrashProtected xpcom/base/nsCrashOnException.cpp:32 7 xul.dll nsWindow::WindowProc widget/windows/nsWindow.cpp:4962 8 user32.dll user32.dll@0x162f9 9 user32.dll user32.dll@0x16d39 bp-6f02f042-e24e-45bb-915a-c16940181024 is a Firefox crash related to bug 1277167?
Depends on: 1277167

Kikuchi-san has fixed related issue (bug 1661485). Wayne, does this still occurs after landing bug 1661485?

Interesting. Most of the recent instances of bug 1661485 were from German locale, but this case is not. This is from Thunderbird, and user comments say this happened when double-clicking a taskbar icon of a minimized Thunderbird. We should check that scenario.

Whiteboard: [tbird topcrash]

(In reply to Makoto Kato [:m_kato] from comment #2)

Kikuchi-san has fixed related issue (bug 1661485). Wayne, does this still occurs after landing bug 1661485?

Short answer is yes, still occurs. bp-e547a495-bbca-4040-9716-52a5c0201024 is 83 beta.

Magnus, is this related to another issue we have with tray?

Flags: needinfo?(mkmelin+mozilla)

I suppose it's possible (maybe even likely?) it could be related to reports we have of new mail tray icons sometimes not disappearing as they should.

Flags: needinfo?(mkmelin+mozilla)
Summary: Crash in mozilla::widget::TaskbarPreview::MainWindowHook → Crash in mozilla::widget::TaskbarPreview::MainWindowHook. Tray related?

Any further ideas about cause and regression?

To recap:

#29 crash for 78.8.1

Beta crashes were steady for 84 and 85. Then, strange, not a single 86 or 87 beta crash. And, it's now a top crash for 86 and 88 nightly.
https://crash-stats.mozilla.org/signature/?product=Thunderbird&signature=mozilla%3A%3Awidget%3A%3ATaskbarPreview%3A%3AMainWindowHook&date=%3E%3D2020-12-22T12%3A05%3A00.000Z&date=%3C2021-03-22T12%3A05%3A00.000Z#summary

Flags: needinfo?(tkikuchi)
Flags: needinfo?(benc)
Whiteboard: [tbird topcrash] → [tbird topcrash][regression:TB78.3.2]

My notes from looking at the crash stack and poking about (and knowing nothing about the taskbar or windows integration!):

The crash occurs in the static fn TaskbarPreview::MainWindowHook(), and I think it's being passed a borked aContext pointer - dereferencing it causes a EXCEPTION_ACCESS_VIOLATION_READ. It's not null.

The context is actually the TaskbarPreview object - it's set up previously in TaskBarPreview::Enable(), like so:

    hook->AddMonitor(nsAppShell::GetTaskbarButtonCreatedMessage(),
                     MainWindowHook, this);

(there's another AddMonitor() call in TaskbarPreview::Init() for the WM_DESTROY message, but that's not the one that's causing this crash).

So. The window receives a "TaskbarButtonCreated" message, invokes TaskbarPreview::MainWindowHook(), but by then the TaskbarPreview is borked (or at least what the hook thinks is the TaskbarPreview).

Maybe the TaskbarPreview is being deleted before the message is processed? (there is a message queue, after all)
Or maybe something is corrupting the hook's context pointer?

Looking at nsITaskbarPreview, it seems that it's for both the taskbar preview (hovering over the app in the task bar at the bottom of the screen?) and window preview (eg cycling through windows with the alt-tab or whatever?)...

Flags: needinfo?(benc)

I think I got the nearly stable repro. Setting mail.minimizeToTray = true, keep minimizing a window and reverting it by clicking a tray icon caused this crash. I'll spend more time on figuring out the root cause.

https://crash-stats.mozilla.org/report/index/7836651a-519e-4aa0-b4b0-bdbbb0210323
https://crash-stats.mozilla.org/report/index/60fc7124-aa51-4720-85a5-da77e0210323

Flags: needinfo?(tkikuchi)
Assignee: nobody → tkikuchi

I think the problem was TaskbarWindowPreview was freed a few seconds after TB started. It happens because nsWindow holds mTaskbarPreview as a weak reference. I changed it to a strong reference like this. The crash was gone, but not sure what's the impact on Firefox side.

Another approach I can think of is to keep a reference of TaskbarWindowPreview inside WinUnreadBadge (see attached). I confirmed this also fixed the crash and the risk is lower because there is no impact on Firefox. Magnus, let me add needinfo on you because you reviewed the latest WinUnreadBadge (bug 715799). What do you think?

Flags: needinfo?(mkmelin+mozilla)

(In reply to Magnus Melin [:mkmelin] from comment #5)

I suppose it's possible (maybe even likely?) it could be related to reports we have of new mail tray icons sometimes not disappearing as they should.

While investigating, I think I hit an opposite issue. When mail.minimizeToTray = true is set, if I minimize a TB window without any unread email, the tray icon should be always shown to bring back the window. However, the tray icon disappeared first time I minimized a window, and after that there was no way to bring back the window (so I needed to kill the process). I think it's a different bug.

Regardless of whether we change comm-central or convert a weak to a strong ref, I think there is a bug in TaskbarWindowPreview. The reason why we have a dangling pointer in the queue is TaskbarWindowPreview's dtor does not clean up a hook. On the other hand, a sibling class TaskbarTabPreview does it by calling Disable(). I checked all other pairs of AddHook/RemoveHook and AddMonitor/RemoveMonitor, but that's the only one left.

When TaskbarWindowPreview is constructed, it registers its adress in WindowHook
as a listener. However, when an instance of TaskbarWindowPreview is destroyed,
it does not remove all listeners. This emerges as Thunderbird's crash because TB
creates TaskbarWindowPreview to show a tray icon but the instance is accidentally
freed when the icon is still active due to another bug. As a result, TB crashes
when TaskbarPreview::MainWindowHook tries to access a freed TaskbarWindowPreview
to handle a click on the icon.

This patch adds a call to SetVisible(false) in TaskbarWindowPreview's dtor to
clean up the things. This will stop the crash in TB, but TB's taskbar icon still
does not behave correctly because it accidentally disappears without crash.
We need an additional patch to fix it.

Keywords: leave-open
Hardware: x86 → All
Blocks: 1701432

(In reply to Toshihito Kikuchi [:toshi] from comment #10)

Another approach I can think of is to keep a reference of TaskbarWindowPreview inside WinUnreadBadge (see attached). I confirmed this also fixed the crash and the risk is lower because there is no impact on Firefox. Magnus, let me add needinfo on you because you reviewed the latest WinUnreadBadge (bug 715799). What do you think?

I'd trust your judgement on which approach to prefer (or do both, I'd imagine the Thunderbird part could be good to take in addition to be sure?)

Flags: needinfo?(mkmelin+mozilla)

In Thunderbird, an instance of TaskbarWindowPreview is created when
WinUnreadBadge.updateUnreadCount calls WinTaskbar::GetOverlayIconController.
However, that instance is freed immdiately after updateUnreadCount exits
because nobody owns it. As a result, the taskbar icon disappears even when
mail.minimizeToTray = false.

This patch makes WinUnreadBadge own a reference to TaskbarWindowPreview
so that its instance is kept alive after the method exits.
This approach is the same as Firefox stores TaskbarWindowPreview as
DownloadsTaskbar._taskbarProgress.

Attachment #9211178 - Attachment is obsolete: true

(In reply to Magnus Melin [:mkmelin] from comment #15)

(In reply to Toshihito Kikuchi [:toshi] from comment #10)
I'd trust your judgement on which approach to prefer (or do both, I'd imagine the Thunderbird part could be good to take in addition to be sure?)

Thanks! Yes, I'll land both patches for mozilla-central and comm-central. I've just uploaded a new patch for comm-central with comments.

I'll also file a new bug for the taskbar icon issue mentioned in comment 11.

See Also: → 1701891
Status: NEW → ASSIGNED
Target Milestone: --- → 89 Branch

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/d783ea4ee651
Have WinUnreadBadge own TaskbarWindowPreview. r=mkmelin

What are the chances that the fix will be integrated in 88 Beta 2? I really can't work with this version. Would not like to switch to the stable version.

(In reply to Ali Savas from comment #20)

What are the chances that the fix will be integrated in 88 Beta 2?

zero.

If you are able to start thunderbird, disable minimize to tray in tools > options.
If you are not able to start thunderbird, you'll need to edit prefs.js and create a preference line if it doesn't already exist for mail.minimizeToTray with a value of false.

Pushed by tkikuchi@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e818608c2b9d Clean up MessageData holding TaskbarWindowPreview when it's destroyed. r=cmartin
Keywords: leave-open
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED

Enough code changes to the effected files, that not worth uplifting to 78.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: