Crash in shutdownhang | NtCreateUserProcess
Categories
(External Software Affecting Firefox :: Other, defect)
Tracking
(firefox-esr60 wontfix, firefox55 wontfix, firefox56 wontfix, firefox57 wontfix, firefox66 wontfix, firefox67 wontfix, firefox68 wontfix, firefox69 fix-optional, firefox70 fix-optional)
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | wontfix |
firefox55 | --- | wontfix |
firefox56 | --- | wontfix |
firefox57 | --- | wontfix |
firefox66 | --- | wontfix |
firefox67 | --- | wontfix |
firefox68 | --- | wontfix |
firefox69 | --- | fix-optional |
firefox70 | --- | fix-optional |
People
(Reporter: marco, Unassigned)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
Comment 1•7 years ago
|
||
Comment 2•7 years ago
|
||
Comment 3•7 years ago
|
||
Reporter | ||
Comment 4•7 years ago
|
||
Comment 5•7 years ago
|
||
Reporter | ||
Comment 6•7 years ago
|
||
Comment 7•7 years ago
|
||
Comment 8•6 years ago
|
||
Comment 9•6 years ago
|
||
Comment 10•6 years ago
|
||
Comment 11•6 years ago
|
||
Comment 12•6 years ago
|
||
Comment 13•6 years ago
|
||
bugherder |
Comment 14•6 years ago
|
||
We've been a bit unlucky with this one. Since the patch landed we haven't had many crashes on nightly and the few that were reported were for previous builds.
Comment 15•6 years ago
|
||
I still haven't figured out what's going on here. The new annotation isn't helping much: I found it only in a couple of reports and the contents was "1" which should not be possible! It's odd but maybe I did something wrong.
Comment 16•6 years ago
|
||
I went back and sifted through the crashes again. First of all not many have the annotation set, which means we're not catching this scenario very often with it. The second thing that came up is that I've got the annotation wrong too; I should have realized it immediately but I didn't. Instead of passing a reference to an nsACString
I'm passing a const char*
which the compiler is promptly and quietly converting to a bool
, thanks C++!
To get out of the impasse I've downloaded a bunch of raw minidumps from Socorro and I will be inspecting them manually. There's a good chance that the executable name is buried in the stack so I should be able to find it.
Comment 17•6 years ago
|
||
I've manually downloaded a lot of raw dumps at random and inspected their stacks: most of them had the pingsender executable path on the main thread stack, the rest had firefox instead. So, we're getting stuck on trying to spawn pingsender or restarting firefox after an update. My guess is that this is happening because the kernel call is hooked by whatever AV software the user has installed and the AV tries to scan the (new) executable it sees.
This is not going to be easy to solve. For pingsender I wonder if it would be possible to launch the program early during shutdown, freeze it, assemble the ping, hand it over to pingsender and then resume its execution. I'll give it a spin ASAP.
Comment 18•6 years ago
|
||
A reminder that new data collections are subject to Data Collection Review: https://wiki.mozilla.org/Firefox/Data_Collection
Comment 19•6 years ago
|
||
...and apparently by "new" I mean "four months ago when the annotation was added" :S
Updated•6 years ago
|
Comment 20•6 years ago
|
||
(In reply to Chris H-C :chutten from comment #18)
A reminder that new data collections are subject to Data Collection Review: https://wiki.mozilla.org/Firefox/Data_Collection
Does that apply to annotations that don't go in the crash ping too? Because by default crash annotations will only go in crash reports explicitly submitted by users but will not be included in the ping.
BTW I was about to remove this one since we don't need it anymore :P.
Comment 21•6 years ago
|
||
Comment 22•6 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #20)
(In reply to Chris H-C :chutten from comment #18)
A reminder that new data collections are subject to Data Collection Review: https://wiki.mozilla.org/Firefox/Data_Collection
Does that apply to annotations that don't go in the crash ping too? Because by default crash annotations will only go in crash reports explicitly submitted by users but will not be included in the ping.
BTW I was about to remove this one since we don't need it anymore :P.
Excitingly, it does! The Data Collection Review applies to all new or changed data collections, not just Telemetry ones.
This would've been an easy one, but since you've already filed a patch to remove it I guess I'll just put a vague assurance here that the collection appears to have conformed to the standards we hold for Data Collections in Firefox. It was documented, opt-in, limited data to answer a very specific question, and it was only collected for a short time.
Comment 23•6 years ago
|
||
Comment 24•6 years ago
|
||
(In reply to Chris H-C :chutten from comment #22)
Excitingly, it does! The Data Collection Review applies to all new or changed data collections, not just Telemetry ones.
Thanks, I'll keep it in mind.
This would've been an easy one, but since you've already filed a patch to remove it I guess I'll just put a vague assurance here that the collection appears to have conformed to the standards we hold for Data Collections in Firefox. It was documented, opt-in, limited data to answer a very specific question, and it was only collected for a short time.
Plus it was buggy so it always returned "1" instead of the executable name :)
Comment 25•6 years ago
|
||
bugherder |
Updated•6 years ago
|
Comment 26•6 years ago
|
||
Marking as FIXED since the patch landed on m-c in comment 25.
Updated•6 years ago
|
Comment 27•6 years ago
|
||
(In reply to Liz Henry (:lizzard) (use needinfo) from comment #26)
Marking as FIXED since the patch landed on m-c in comment 25.
That was a backout of instrumentation to look into the issue. Comment 17 has more details on what we think is going on.
Updated•6 years ago
|
Comment 28•5 years ago
|
||
Gabriel, is there an update on this one?
Comment 29•5 years ago
|
||
No, I haven't devised a way to solve this yet.
Updated•5 years ago
|
Comment 30•5 years ago
•
|
||
Quick update: I've been pulled left and right on other more pressing issues but I'm still planning a fix for this.
Comment 31•4 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:gsvelto, maybe it's time to close this bug?
Comment 33•4 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:gsvelto, maybe it's time to close this bug?
Comment 34•4 years ago
|
||
I opened a few more crashes to refresh my memory and the situation is the same as in comment 17. We're getting stuck while launching pingsender.exe or relaunching firefox.exe. In every case I can find an AV module loaded into memory so the kernel call has been hooked by the AV software and is being slowed down as it "scans" the new executable.
There is no real way to mitigate this as the time that might be spent by the AV scanner is effectively unbounded. In both cases we might tell the shutdown terminator to wait a little longer but there's no guarantee that it will be ever enough. I'm seriously tempted to close this as WONTFIX.
Comment 35•4 years ago
|
||
Unassigning myself because I don't think there's much more we can do here.
Comment 36•3 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:gcp, maybe it's time to close this bug?
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Description
•