Password related crashes in shutdownhang | ntdll.dll@0x21f5d

NEW
Unassigned

Status

--
critical
4 months ago
3 months ago

People

(Reporter: wsmwk, Unassigned)

Tracking

(4 keywords)

x86
All
crash, regression, regressionwindow-wanted, topcrash-thunderbird

Thunderbird Tracking Flags

(thunderbird_esr60? affected)

Details

(crash signature)

Attachments

(1 attachment)

(Reporter)

Description

4 months ago
In 60.3.1 we seem to have a new crop of users reporting password-related crashes.  Perhaps not all are regressions, but it seems at least some are new reports.  So this is pretty severe.

win7 shutdownhang | ntdll.dll@0x21f5d  bp-21ff389b-2cdb-42fb-9b02-74c3d0181118 

Top 10 frames of crashing thread:
0 ntdll.dll ntdll.dll@0x21f5d 
1 kernel32.dll kernel32.dll@0x9530f 
2 mozglue.dll mozilla::detail::ConditionVariableImpl::wait mozglue/misc/ConditionVariable_windows.cpp:58
3 xul.dll mozilla::CondVar::Wait xpcom/threads/CondVar.h:68
4 xul.dll mozilla::ThreadEventQueue<mozilla::PrioritizedEventQueue<mozilla::EventQueue> >::GetEvent xpcom/threads/ThreadEventQueue.cpp:155
5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:967
6 xul.dll mozilla::Queue<nsCOMPtr<nsIRunnable>, 256>::Push xpcom/threads/Queue.h:54
7 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:517
8 xul.dll nsThread::Shutdown xpcom/threads/nsThread.cpp:796
9 xul.dll nsThreadManager::Shutdown xpcom/threads/nsThreadManager.cpp:332
=============================================================

win10 shutdownhang | kernelbase.dll@0x10e7f2 bp-ca395276-71cc-454e-b883-ab18d0181122

Equivalent Mac crash signature not yet identified.
 
(There is also still Bug 1505042 - Crash in shutdownhang | BaseAllocator::malloc related to passwords)
(Reporter)

Updated

4 months ago
status-thunderbird_esr60: --- → affected
tracking-thunderbird_esr60: --- → ?
Flags: needinfo?(jorgk)

Comment 1

4 months ago
Aren't the two crash signatures listed here the same?
https://crash-stats.mozilla.com/signature/?signature=shutdownhang%20%7C%20ntdll.dll%400x21f5d&date=%3E%3D2018-11-16T18%3A21%3A19.000Z&date=%3C2018-11-23T18%3A21%3A19.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#reports
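
For anyone building these links by hand: the signature parameter is just the percent-encoded signature string. A minimal Python sketch, assuming only the base URL visible in the link above; `signature_url` is a hypothetical helper name, not part of any crash-stats tooling.

```python
from urllib.parse import quote

# Base URL copied from the link above.
BASE = "https://crash-stats.mozilla.com/signature/"

def signature_url(signature):
    """Return the signature report URL for a crash signature (hypothetical helper)."""
    # safe="" makes sure "|" and "@" are percent-encoded, as in the link above.
    return f"{BASE}?signature={quote(signature, safe='')}"

print(signature_url("shutdownhang | ntdll.dll@0x21f5d"))
```

The date and column filters in the full link are ordinary extra query parameters and are left out here.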

Looking at the reports, they are not limited to 60.3.1 at all, there are 60.3.0 and 60.2.1 in there.

Maybe shipping the fix for the empty IMAP password has shifted something, but it doesn't look like it's the cause. I said the same in a PM re. kernelbase.dll@0x10e7f2 which doesn't appear to exist as a bug yet (unless there's a cut&paste error and you wanted to add that signature here instead of doubling up), and bug 1508649 doesn't look much different either.

Something still seems to be going on when the user shuts down leading to a MOZ_CRASH(). BTW, all those don't cause any harm or data loss.

What we need is a reproducible case. I can imagine that you will get a shutdown hang if you manage to close TB while a password prompt is in progress. But the password prompt is modal and you can't actually close TB. I tried hitting "OK", hence entering an empty password, and then shutting down. At times the main window closes and then the password prompt comes up, but answering that in any form just dismisses the panel and doesn't lead to a hang/crash.
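
The scenario imagined above (shutdown waiting on a thread that is itself waiting for a password answer) can be modelled roughly as follows. This is a toy Python model, not Thunderbird code: the names, the timeout value and the "shutdownhang" label are assumptions for illustration, and it says nothing about where the real hang occurs.

```python
import threading

def wait_for_password(answered):
    # Stand-in for a thread blocked until the user answers a password prompt.
    answered.wait()

def shutdown(worker, timeout_s):
    # Wait for the worker to finish, but give up after timeout_s seconds.
    worker.join(timeout_s)
    return "shutdownhang" if worker.is_alive() else "clean"

answered = threading.Event()
worker = threading.Thread(target=wait_for_password, args=(answered,), daemon=True)
worker.start()

print(shutdown(worker, 0.2))  # prompt not answered yet
answered.set()                # answering the prompt unblocks the worker
print(shutdown(worker, 0.2))
```

While the prompt is unanswered the first call should report the hang; once the event is set the worker exits and the second call should come back clean.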
Flags: needinfo?(jorgk)

Comment 2

4 months ago
Wayne, are you going to file a bug for kernelbase.dll@0x10e7f2 or will you include it here?
Flags: needinfo?(vseerror)
(Reporter)

Comment 3

4 months ago
> Looking at the reports, they are not limited to 60.3.1 at all, there are 60.3.0 and 60.2.1 in there.

Sure. What I meant, not so artfully, is that based on user comments "not all are regressions" and "at least some are new reports".  In other words some of these users say they never crashed before.  One might conclude that something changed.

>  BTW, all those don't cause any harm or data loss.

If you mean none of these cause dataloss you would be wrong.  Some users have in the past reported losing data, though perhaps it is a great minority - I cannot tell.  (I no longer remember the details)

> At times the main window closes and then the password prompt comes up, but answering that in any form just dismisses the panel and doesn't lead to a hang/crash.

Good thing not everyone can reproduce, then we'd be in bigger trouble. ;)


> Aren't the two crash signatures listed here the same?
> are you going to file a bug for kernelbase.dll@0x10e7f2 or will you include it here?

Let's include 0x10e7f2 here, because it begins at the same time that 0x21f5d greatly increases, namely in 60.3.1.  (But I do not know yet whether this coincides with reduced activity in another crash signature.)
Crash Signature: [@ shutdownhang | ntdll.dll@0x21f5d] [@ shutdownhang | ntdll.dll@0x21f5d] → [@ shutdownhang | ntdll.dll@0x21f5d] [@ shutdownhang | kernelbase.dll@0x10e7f2]
Flags: needinfo?(vseerror)
(Reporter)

Comment 4

4 months ago
(Reporter)

Comment 5

3 months ago
#1 crash for beta.  #1 crash for 60.3.2

But gone in 60.3.3.  So related to a backout in 60.3.3?  (or did we back the same things out of 64.0b4?)
Flags: needinfo?(jorgk)
Keywords: topcrash-thunderbird

Comment 6

3 months ago
Those were Mozilla platform changes. Backed out from our branch on m-esr60, but 64 beta is running without a special branch this time (since I got tired of maintaining beta branches as well). So therefore no beta backout.
Flags: needinfo?(jorgk)

Comment 7

3 months ago
Both crash signatures happen for Firefox. Looks like backing out bug 1475775 fixes the crash. Dana, you might consider backing out that bug, given that it and follow-up 1496736 weren't actually the right approach, see bug 1496736 comment #30.
Flags: needinfo?(dkeeler)
Hmmm - I'm not convinced. For instance, bp-21ff389b-2cdb-42fb-9b02-74c3d0181118 is stuck in nsImapProtocol::GetPassword, which seems like bug 1257058.
Flags: needinfo?(dkeeler)
Here's a 60.3.3 Thunderbird report: bp-ec509c4c-cfa1-46b2-898a-80dfb0181207
There's also a number of 64.0b0 reports.
How about backporting Bug 1463901 and retry?

Comment 12

3 months ago
Hmm, good idea, can you see whether the patch applies to our branch?
> can you see whether the patch applies to our branch?

Not before Saturday but according to the bug it needs to be rebased. It seems unrelated on second glance but you never know. 

See thread 27, Name: LoadRoots and also Thread 28
https://crash-stats.mozilla.com/report/index/21ff389b-2cdb-42fb-9b02-74c3d0181118#allthreads

Comment 14

3 months ago
Unless I'm not reading this correctly, [@ shutdownhang | kernelbase.dll@0x10e7f2 ]  is completely gone. bp-ec509c4c-cfa1-46b2-898a-80dfb0181207 quoted in comment #10 is the other one, [@ shutdownhang | ntdll.dll@0x21f5d ]:
https://crash-stats.mozilla.com/signature/?signature=shutdownhang%20%7C%20ntdll.dll%400x21f5d

Crashes are all on Windows 7/XP, none on other platforms. There are crashes in TB 64 and FF 63/64, so it's unclear whether bug 1463901 which landed on mozilla62 has helped at all.

As Dana pointed out, most crashes affect TB and may be related to some IMAP password issue.
(Reporter)

Comment 15

3 months ago
kernelbase.dll@0x10e7f2 is gone in 60.3.3 and 64.0b4 (does exist in 60.3.2 and 64.0b3)

OTOH, shutdownhang | ntdll.dll@0x21f5d and shutdownhang | ntdll.dll@0x6c55c (no bug report but a high rate crash) is likewise gone in 64.0b4, but not gone in 60.3.3.

And FWIW, 64.0b4 uptake "looks" pretty crappy against 64.0b3 last 7 days [2]
[1] https://crash-stats.mozilla.com/search/?build_id=20181129151909&product=Thunderbird&date=>%3D2018-11-16T14%3A36%3A57.000Z&date=<2018-12-16T14%3A36%3A57.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature 
[2] https://crash-stats.mozilla.com/search/?build_id=20181119132201&product=Thunderbird&version=64.0b0&date=>%3D2018-12-09T14%3A40%3A17.000Z&date=<2018-12-16T14%3A40%3A17.000Z&_sort=-build_id&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

I'm not sure yet what to make of that, because the background update rate is 100 - nearly everyone should be updated (except perhaps linux users)


> As Dana pointed out, most crashes affect TB and may be related to some IMAP password issue.

I don't distrust that.  Perhaps we take more time to analyze?

Comment 16

3 months ago
(In reply to Wayne Mery (:wsmwk) from comment #15)
> OTOH, shutdownhang | ntdll.dll@0x21f5d and shutdownhang | ntdll.dll@0x6c55c
> (no bug report but a high rate crash) is likewise gone in 64.0b4, but not
> gone in 60.3.3.
Sorry, I have trouble reading the reports:
shutdownhang | ntdll.dll@0x21f5d
https://crash-stats.mozilla.com/signature/?signature=shutdownhang%20%7C%20ntdll.dll%400x21f5d

I see a line:
Thunderbird 	64.0b0 	22 	0.9% 	23

What am I missing?
Flags: needinfo?(vseerror)
(Reporter)

Comment 17

3 months ago
(In reply to Jorg K (GMT+1) from comment #16)
> (In reply to Wayne Mery (:wsmwk) from comment #15)
> > OTOH, shutdownhang | ntdll.dll@0x21f5d and shutdownhang | ntdll.dll@0x6c55c
> > (no bug report but a high rate crash) is likewise gone in 64.0b4, but not
> > gone in 60.3.3.
> Sorry, I have trouble reading the reports:
> shutdownhang | ntdll.dll@0x21f5d
> https://crash-stats.mozilla.com/signature/
> ?signature=shutdownhang%20%7C%20ntdll.dll%400x21f5d
> 
> I see a line:
> Thunderbird 	64.0b0 	22 	0.9% 	23
> 
> What am I missing?

release, crash still active - https://crash-stats.mozilla.com/signature/?product=Thunderbird&release_channel=release&signature=shutdownhang%20%7C%20ntdll.dll%400x21f5d&date=%3E%3D2018-09-16T18%3A33%3A27.000Z&date=%3C2018-12-16T18%3A33%3A27.000Z

beta (remember crash-stats is showing all 64.0b* as 64.0b0), there are no crashes with buildid 20181129151909 which is 64.0b4 (the crashes end at build 20181119132201 which is 64.0b3)  https://crash-stats.mozilla.com/signature/?product=Thunderbird&release_channel=beta&version=64.0b0&version=64.0b1&signature=shutdownhang%20%7C%20ntdll.dll%400x21f5d&date=>%3D2018-09-16T18%3A33%3A27.000Z&date=<2018-12-16T18%3A33%3A27.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#reports
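
Since crash-stats reports every 64.0b* build as 64.0b0, the betas can only be told apart by build ID. A minimal Python sketch of that grouping, using the two build IDs quoted above; the record layout and the sample records are made up for illustration and are not real crash-stats data.

```python
from collections import Counter

# Build IDs quoted in the comment above.
BETA_BY_BUILD_ID = {
    "20181119132201": "64.0b3",
    "20181129151909": "64.0b4",
}

def count_by_beta(reports):
    """Count reports per beta, keyed by build ID rather than the reported version."""
    counts = Counter()
    for report in reports:
        counts[BETA_BY_BUILD_ID.get(report["build_id"], "unknown")] += 1
    return counts

# Made-up records for illustration only (hypothetical field names and values).
sample = [
    {"version": "64.0b0", "build_id": "20181119132201"},
    {"version": "64.0b0", "build_id": "20181119132201"},
    {"version": "64.0b0", "build_id": "20181101000000"},
]
print(count_by_beta(sample))
```

A zero count for a build only means no reports were seen for it; it says nothing about how many users actually run that build.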
Flags: needinfo?(vseerror)

Comment 18

3 months ago
OK, so you're saying that the crash disappeared between TB 64.0b3 and TB 64.0b4.

Those were built on two different M-B versions, as the pinning shows:
https://hg.mozilla.org/releases/comm-beta/rev/7ee2051d5d5d1b587a2c0ab2280a36feb5bfb277#l1.12
-              GECKO_HEAD_REF: '7473fdd1c21c' <== used for b3
+              GECKO_HEAD_REF: 'efca407c5be1' <== used for b4

I'll check what the bug could be that fixed the crashes, but we can say for sure that it wasn't bug 1463901 since that landed on mozilla62 already.

Thanks Wayne!!

Comment 19

3 months ago
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=7473fdd1c21c&tochange=efca407c5be1

Hard to tell what could be in there that will fix shutdown crashes, maybe:
76164912276b Andrew McCreight - Bug 1504365 - Clear weak pointers in shutdown observers. r=erahm, a=RyanVM

Let's NI the whole world here:
Wayne and FRG, would you like to play Oracle here?
Wayne do you have access to that security bug? 
Andrew and Eric, could you grant me access to that security bug and/or tell me whether that bug might explain the solution of those crashes.
Flags: needinfo?(vseerror)
Flags: needinfo?(frgrahl)
Flags: needinfo?(erahm)
Flags: needinfo?(continuation)
(Reporter)

Comment 20

3 months ago
> OK, so you're saying that the crash disappeared between TB 64.0b3 and TB 64.0b4.

Yes - my wording was not spot on, but you got it.
Flags: needinfo?(vseerror)

Comment 21

3 months ago
Oh, bug 1504365 already got uplifted to m-esr60 at 60.4. I guess we just build TB 60.4 and see what happens.
That patch does fix some shutdown crashes, so it is plausible that it is related. I could believe that Thunderbird uses the font stuff. I don't know how that would be related to shutdown hangs as in comment 0.

It looks like you are CCed now.
Flags: needinfo?(erahm)
Flags: needinfo?(continuation)
Clearing the NI after seeing comment 21.
Flags: needinfo?(frgrahl)

Comment 24

3 months ago
Turned out that TB 64 beta 4 had very little uptake since the automatic update didn't work. So the absence of this crash from that version doesn't mean that it's really gone :-(
(Reporter)

Comment 25

3 months ago
win10 shutdownhang | ntdll.dll@0x6c55c
win7  shutdownhang | ntdll.dll@0x21f5d
Crash Signature: [@ shutdownhang | ntdll.dll@0x21f5d] [@ shutdownhang | kernelbase.dll@0x10e7f2] → [@ shutdownhang | ntdll.dll@0x21f5d] [@ shutdownhang | kernelbase.dll@0x10e7f2] [@ shutdownhang | ntdll.dll@0x6c55c]
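
To see at a glance which signature a field change like the one above added, the `[@ ...]` entries can be pulled apart mechanically. A rough Python sketch, assuming the field always has the `old → new` shape shown in this bug; `added_signatures` is a hypothetical helper, not a Bugzilla API.

```python
import re

# Matches one "[@ signature ]" entry and captures the signature text.
SIG = re.compile(r"\[@\s*(.*?)\s*\]")

def added_signatures(change):
    """Return signatures present after the arrow but not before it."""
    before, after = change.split("→")
    old = SIG.findall(before)
    return [sig for sig in SIG.findall(after) if sig not in old]

change = (
    "[@ shutdownhang | ntdll.dll@0x21f5d] [@ shutdownhang | kernelbase.dll@0x10e7f2]"
    " → [@ shutdownhang | ntdll.dll@0x21f5d] [@ shutdownhang | kernelbase.dll@0x10e7f2]"
    " [@ shutdownhang | ntdll.dll@0x6c55c]"
)
print(added_signatures(change))
```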

Comment 26

3 months ago
(In reply to Jorg K (GMT+1) (urgent reviews and bustage fix only, Dec 22nd to Jan 1st) from comment #24)
> Turned out that TB 64 beta 4 had very little uptake since the automatic
> update didn't work. So the absence of this crash from that version doesn't
> mean that it's really gone :-(
Sadly so, it's still there and bug 1504365 didn't help :-(