Open Bug 1279293 (IPCError_ShutDownKill) Opened 8 years ago Updated 3 months ago

[meta] Crash in [@ IPCError-browser | ShutDownKill]

Categories

(Core :: DOM: Content Processes, defect, P2)

38 Branch
defect

Tracking Status
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- wontfix
firefox-esr45 --- wontfix
firefox50 + wontfix
firefox51 --- wontfix
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox53 --- wontfix
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix
firefox73 --- wontfix
firefox74 --- wontfix
firefox75 --- wontfix
firefox83 --- wontfix
firefox84 --- wontfix
firefox85 --- wontfix
firefox86 --- wontfix
firefox87 --- wontfix
firefox88 --- wontfix
firefox89 --- wontfix
firefox90 --- wontfix
firefox91 --- wontfix
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 --- wontfix

People

(Reporter: marvinhk, Unassigned)

References

(Depends on 11 open bugs)

Details

(5 keywords)

Attachments

(2 files)

This bug was filed from the Socorro interface and is 
report bp-3bbe367b-ff88-4040-90de-0567a2160609.
=============================================================
new signature in JSStructuredCloneWriter
I had a huge FF session in SafeMode and by exit got this crash, too: https://crash-stats.mozilla.com/report/index/d0758d13-d0c8-4ccf-985b-bf1522160615
This link seems to freeze the browser, and I just got a 'shut down kill' crash.

https://crash-stats.mozilla.com/report/index/cfd92e66-a5e1-44be-b18d-33b982160625

Running Win10 x64 and Win32 Nightly builds.
 	Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
e10s is 'enabled'. Same result with e10s 'off': the browser stops responding.

No idea when this started, just noticed today trying to visit:
http://www.mayoclinic.org/diseases-conditions/retrograde-ejaculation/basics/definition/con-20030795
Status: UNCONFIRMED → NEW
Ever confirmed: true
Just discovered that it does not hang with 'Tracking Protection' off.

Options -> Privacy: I had it set to 'Always'; flipping it to 'Never' stops the hang.
This signature covers a lot of possible issues. It looks like we don't have a bug on it, so this can be the one.

Jim, I filed bug 1282580 for the issue you're seeing. The shutdown hang is really a side effect of a different problem (that the site is hanging).
See Also: → 1282580
20160629030209 Mozilla/5.0 (Windows NT 6.1; rv:50.0) Gecko/20100101 Firefox/50.0

Nightly 50.0a1 crashed having the same signature as the one from this bug with the following steps:

1. Load http://html.spec.whatwg.org/
2. Press Ctrl+F to open the Find toolbar
3. Scroll the page up and down
4. Close FF

Expected results:
The page should be loaded without crashing


Actual results:
The page is loaded, the Find toolbar is not opened, the page is not scrolled, and after the page is closed Firefox crashes:
https://crash-stats.mozilla.com/report/index/e013a326-36b4-4cad-829c-8efc42160630
This is a generic signature, so different steps to cause a crash from this signature should have separate bugs, blocking this bug.
We'll need a multifaceted approach here. Comment 5 isn't really even a bug, exactly. If a web page is doing a lot of work and you quit, then the content process is going to be slow to respond (slower than 5 seconds). It's probably reasonable to increase the timeout to 30 seconds.
Priority: -- → P2
I thought the plan here was to get rid of this entirely by just killing the content process when we didn't need it any more. We cannot afford to let content block shutdown for significant periods of time (5 seconds is already too much in general).
Can I ask something?

I have lots of tabs, so most of the time (e.g. last year) closing or restarting Firefox took forever (sometimes literally, since the process would never disappear from the Task Manager process list no matter how long you waited). In recent months there was a change that finally put a time limit on how long FF can stay in memory after you tell it to exit. It still takes an unacceptably long time, though.

Correct me if I am wrong, but from what I understand there were no fundamental changes in how FF operates; what was done was literally the implementation of a time limit after which Firefox/PluginContainer gets killed no matter what. As a result, ever since that change, absolutely every exit/restart I do in FF ends with a crash like this one or another type of ShutDownKill.

Can someone please explain why it is impossible to redesign things in a way that would allow Firefox to exit/restart ***right away***? Or maybe such a development project is already ongoing within Mozilla, and it is supposed to "land" in FF version 55 or something?

Thanks in advance.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #8)
> I thought the plan here was to get rid of this entirely by just killing the
> content process when we didn't need it any more. We cannot afford to let
> content block shutdown for significant periods of time (5 seconds is already
> too much in general).

The problem I realized only recently is with "beforeunload", "unload", and "pagehide" events. Currently, Firefox fires them on shutdown. They can do sync XHRs, so they have observable side effects. Sync XHRs are "deprecated" (although no one seems very hopeful that we'll ever be able to remove them), but the beacon API is new and we would need to support that at shutdown as well (currently I'm not sure if that even works).

If we stop firing this stuff at shutdown, we're probably going to break a lot of websites. There's a github issue on the topic [1] that links to a bunch of Chrome usage counters, and it seems like a lot of sites are using sync XHR in unload (something like 0.3% of web pages as far as I understand the data).

I think we could probably make an effort to fire these event listeners but not do any of the other teardown activities associated with destroying a docshell. That might save us a good amount of time.

[1] https://github.com/whatwg/xhr/issues/20#issuecomment-185163375
Setting needinfo to Benjamin in case you have an opinion or ideas on comment 10.
Flags: needinfo?(benjamin)
I assumed that shutdown worked like this (and I'm totally terrible for assuming this):

* the Firefox UI code knows that we're quitting.
** It triggers beforeunload handlers before we've actually decided to quit, so that we can support the returnValue/confirmation UI.
** Then we trigger the unload (and pagehide?) events as part of closing the Firefox window
** This process also collects any final session restore information
* Only after we're finished shutting down the user-visible bits do we trigger the content process to quit
** At this point, the content process shouldn't contain any important user data and in non-leakchecking builds we can just kill it (using TerminateProcess/SIGTERM)
** In leakchecking builds we'd do the full/painful shutdown sequence

So you're saying that we don't trigger some of the unload events until we've actually told the content process to quit? That seems like it might be both a UI regression (in case the beforeunload event has quit confirmation prompts) and might cause weird teardown sequence errors in the Firefox UI.
Flags: needinfo?(benjamin) → needinfo?(wmccloskey)
Your description is correct as I understand things, except maybe for when we start the shutdown timer.

A typical sequence is:
1. We run "beforeunload" events before everything else.

2. Parent does session restore, which does spin the event loop waiting for the child. But at this point nothing has been closed and this is pretty fast.

3. Parent closes all the windows, which closes all tabs, which causes an async message to be sent to the child asking it to tear down the docshell for that tab. Destroying a docshell fires "unload"/"pagehide" and also frees memory for the tab (DOM, frame tree, etc.).

4. Parent ends up in ContentParent::Observe("xpcom-shutdown"), at which time it asks the child to shut down. If the child fails to shut down after 5 seconds, it kills it.

We could introduce more waiting to give the child a chance to finish running its "unload"/"pagehide"/docshell destruction code before we start the 5 second timer in step (4). However, that would defeat the purpose of the timer, since AFAIK step (3) is what takes all the time. When you have 20 tabs and we have to free the memory for all of them as well as handle any sync network requests they make, it can easily take > 5 seconds.

To put it another way, once all the docshells are gone, shutting down the content process is trivial (in opt builds). There's a little bit of message traffic with the parent, but basically the child just calls QuickExit.

If we want to save time here, I think the best we can do is avoid freeing the DOM/frame tree/whatever else. I'd be interested in how other browsers handle this. I Googled for "firefox shutdown slow" and "chrome shutdown slow". There are a lot more results for Firefox.
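
A minimal sketch of the step (4) watchdog described above, written in Python rather than Gecko's C++: the parent asks the child to exit, waits up to a timeout, and kills the child if it has not exited by then. The helper name `shut_down_child`, the use of closing stdin as a stand-in for the shutdown message, and the two child scripts are all hypothetical; this is not how ContentParent is actually implemented.

```python
import subprocess
import sys

# Hypothetical child scripts: one exits as soon as it sees the shutdown
# request (EOF on stdin), the other stays busy for 30 seconds first.
RESPONSIVE_CHILD = "import sys; sys.stdin.read()"
BUSY_CHILD = "import sys, time; time.sleep(30); sys.stdin.read()"


def shut_down_child(child_code: str, timeout_secs: float) -> str:
    """Ask a child process to exit; kill it if it misses the deadline."""
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code], stdin=subprocess.PIPE
    )
    # Closing stdin stands in for the "please shut down" IPC message.
    proc.stdin.close()
    try:
        proc.wait(timeout=timeout_secs)
        return "clean-exit"
    except subprocess.TimeoutExpired:
        # The child did not respond in time: the analogue of ShutDownKill.
        proc.kill()
        proc.wait()
        return "shutdown-kill"


if __name__ == "__main__":
    print(shut_down_child(RESPONSIVE_CHILD, timeout_secs=5.0))
    print(shut_down_child(BUSY_CHILD, timeout_secs=0.5))
```

The busy child models a content process that is still tearing down docshells when the timer starts, which is the case the comment argues accounts for most of the time.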
Flags: needinfo?(wmccloskey)
Tracking this for 50, seems like a high volume crash good to keep an eye on.
>  However, that would defeat the purpose of the timer, since AFAIK step (3) is what takes all the time.

How about tearing down tabs one by one and giving each one a separate timeout? That way a hang can still be detected, but the timeout scales with the number of tabs.
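
To illustrate the difference between one global timer and a per-tab timer, here is a small Python sketch. The function names and the example teardown durations are invented for illustration; only the 5-second figure comes from the discussion above.

```python
def global_timeout_fires(teardown_secs, budget_secs=5.0):
    """One budget for all tabs: fires if total teardown time exceeds it."""
    return sum(teardown_secs) > budget_secs


def per_tab_timeout_fires(teardown_secs, per_tab_budget_secs=5.0):
    """One budget per tab: fires only if a single tab exceeds it."""
    return any(t > per_tab_budget_secs for t in teardown_secs)


if __name__ == "__main__":
    # Hypothetical sessions: 20 slow-but-healthy tabs, and the same
    # session with one tab that is genuinely stuck.
    busy_session = [0.4] * 20
    hung_session = [0.4] * 19 + [60.0]
    for name, session in (("busy", busy_session), ("hung", hung_session)):
        print(name, global_timeout_fires(session), per_tab_timeout_fires(session))
```

Under these assumptions the global timer kills the healthy 20-tab session, while the per-tab timer only fires for the session with a real hang.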
Crash volume for signature 'IPCError-browser | ShutDownKill':
  - aurora (49): 69120
  - beta (48): 805
  - release (47): 702
  - esr (45): 14

Affected platforms: Windows, Mac OS X, Linux
(In reply to Bill McCloskey (:billm) from comment #13)

> If we want to save time here, I think the best we can do is avoid freeing
> the DOM/frame tree/whatever else. I'd be interested in how other browsers
> handle this. I Googled for "firefox shutdown slow" and "chrome shutdown
> slow". There are a lot more results for Firefox.

Andrew, do you know who might be able to chase this?
Flags: needinfo?(overholt)
Would the idea with this be to, instead of closing all of the windows, trigger the firing of these "beforeunload", doing SessionStore stuff, "unload", and "pagehide" etc. events, and then just kill the child process outright, without performing any of the usual cleanup? (I presume that the child process would QuickExit() itself)
Flags: needinfo?(wmccloskey)
(In reply to Michael Layzell [:mystor] from comment #18)
> Would the idea with this be to, instead of closing all of the windows,
> trigger the firing of these "beforeunload", doing SessionStore stuff,
> "unload", and "pagehide" etc. events, and then just kill the child process
> outright, without performing any of the usual cleanup? (I presume that the
> child process would QuickExit() itself)

Yes. We already avoid application-level cleanup (e.g., XPCOM shutdown) by calling QuickExit. We additionally would like to avoid any docshell-level cleanup we're doing now. Probably the first step, though, is to see how expensive that cleanup is.
Flags: needinfo?(wmccloskey)
Flags: needinfo?(overholt)
Hello!

Just crashed like this on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:51.0) Gecko/20100101 Firefox/51.0 ID:20160831030224 CSet: 506facea63169a29e04eb140663da1730052db64

Report ID 	Date Submitted
bp-5141c955-afeb-4668-ac9f-c4dcb2160831
	31/08/2016	09:31 a.m.

https://crash-stats.mozilla.com/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill says there were 35318 in the past week so this one seems to be popular.

Thanks!
Alex
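
As a side note on the signature link in the comment above: the signature contains spaces and a pipe character, so it has to be percent-encoded in the query string. A small Python sketch of building that URL follows; the helper name `signature_url` is hypothetical, and the base URL and query parameter names are copied from the link in the comment.

```python
from urllib.parse import quote

# Base URL taken from the link quoted in the comment above.
BASE = "https://crash-stats.mozilla.com/signature/"


def signature_url(signature: str, product: str = "Firefox") -> str:
    """Build a signature report URL with a percent-encoded signature."""
    # safe="" makes quote() encode every reserved character.
    return (
        f"{BASE}?product={quote(product, safe='')}"
        f"&signature={quote(signature, safe='')}"
    )


if __name__ == "__main__":
    print(signature_url("IPCError-browser | ShutDownKill"))
```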
After a normal Nightly update, I got a "you have unsubmitted crash reports" infobar. After submitting, I went to about:crashes and saw that it was a crash with this signature (https://crash-stats.mozilla.com/report/index/e9600c03-5f4c-47a8-b33f-39ee72160831). Since the "crash" wasn't actually user-visible in any way, it probably falls under the "not a real bug" scenario, and it was only made evident to me by the existence of the infobar. I've gotten that same infobar a number of times in the last month after updating Nightly, so it wouldn't surprise me if a lot of the crashes reported with this signature are also not otherwise user-visible.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #21)
> After a normal Nightly update, I got a "you have unsubmitted crash reports"
> infobar. After submitting, I went to about:crashes and saw that it was a
> crash with this signature
> (https://crash-stats.mozilla.com/report/index/e9600c03-5f4c-47a8-b33f-
> 39ee72160831). Since the "crash" wasn't actually user-visible in any way, it
> probably falls under the "not a real bug" scenario, and it was only made
> evident to me by the existence of the infobar. I've gotten that same infobar
> a number of times in the last month after updating Nightly, so it wouldn't
> surprise me if a lot of the crashes reported with this signature are also
> not otherwise user-visible.

Yes, I also get the crash report bar for this signature after updating Nightly.  I believe I see it for every Nightly update.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #21)
> Since the "crash" wasn't actually user-visible in any way, it
> probably falls under the "not a real bug" scenario, and it was only made
> evident to me by the existence of the infobar.

It might be a real bug because my YouTube tabs can never remember the time when the playback was stopped before Firefox was closed/restarted -- as FF can never finish the process properly, it always crashes. It is super annoying, and for more than half of a year nothing gets fixed. :(

https://crash-stats.mozilla.com/report/index/1ee465ba-fc81-417c-9bc2-8c4852160831

Could
Flags: needinfo?(bugmail)
I literally see it every time I restart my Nightly. 
As Kartikaya said, I never saw it, except for the constant reminders about having unsubmitted crash reports. 

I've set dom.ipc.tabs.shutdownTimeoutSecs to 30 seconds for now and I don't get those crash reports anymore. I've probably submitted around 200 reports at this point.
I just wanted to ask if it is possible to have an approximate timeline for when those FF crashes during exit will go away. Maybe solving this will take a quarter or two, and maybe someone will be assigned to this bug? Thank you in advance.
Flags: needinfo?(bugmail)
IMO, if the content process is blocked by an unload or pagehide event handler, we should not treat that as a shutdown hang. We should think of a way to let the script finish, or set a timeout if a sync API is called in such event handlers.
I think maybe 36% of the content shutdown hang kill bugs come from the nested-event-loop handling. I'm not sure, but I want to add some annotations in ContentChild to measure this.
Component: IPC → DOM: Content Processes
Depends on: 1301339
Depends on: 1301346
See Also: → 1301464
I'm thinking we should start here by increasing the timeout a little bit. I looked through the crash reports today and didn't see too many patterns. We may have a problem with sync messages to the parent that aren't being handled, but it's not certain that that's what's happening.
Based on the patch Kan-Ru added in bug 1301339, out of 5763 ShutDownKill crashes in the last week, we received the Shutdown message in 278 of them (and sent FinishShutdown in 1).

So in most cases we never even get a chance to receive the Shutdown message.
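
The proportions behind that conclusion can be worked out directly from the three counts quoted above. This is just arithmetic on those numbers; the variable and function names are illustrative.

```python
def share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


TOTAL_CRASHES = 5763      # ShutDownKill crashes in the last week
RECEIVED_SHUTDOWN = 278   # child received the Shutdown message
SENT_FINISH = 1           # child sent FinishShutdown

never_received = TOTAL_CRASHES - RECEIVED_SHUTDOWN

if __name__ == "__main__":
    print(f"received Shutdown: {share(RECEIVED_SHUTDOWN, TOTAL_CRASHES):.1f}%")
    print(f"never received it: {share(never_received, TOTAL_CRASHES):.1f}%")
    print(f"sent FinishShutdown: {share(SENT_FINISH, TOTAL_CRASHES):.2f}%")
```

Roughly 1 in 20 of the killed content processes had received the Shutdown message at all.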
(In reply to Bill McCloskey (:billm) from comment #34)
> Based on the patch Kan-Ru added in bug 1301339, out of 5763 ShutDownKill
> crashes in the last week, we received the Shutdown message in 278 of them
> (and sent FinishShutdown in 1).
> 
> So in most cases we never even get a chance to receive the Shutdown message.

That's terrible. Maybe shutdown message should not use PContent. It can use PBackground and preempt the current task on content main thread.
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #35)
> (In reply to Bill McCloskey (:billm) from comment #34)
> > Based on the patch Kan-Ru added in bug 1301339, out of 5763 ShutDownKill
> > crashes in the last week, we received the Shutdown message in 278 of them
> > (and sent FinishShutdown in 1).
> > 
> > So in most cases we never even get a chance to receive the Shutdown message.
> 
> That's terrible. Maybe shutdown message should not use PContent. It can use
> PBackground and preempt the current task on content main thread.

The problem is that we're really supposed to run onunload handlers before we shut down, and I suspect that might be what's taking so long.
Kan-Ru & :billm
The volume of these crashes is getting worse in 51 aurora. I think the priority should be adjusted to P1. Can you help investigate this more?
Flags: needinfo?(wmccloskey)
Flags: needinfo?(kchen)
Priority: P2 → P1
bug 1301346 should get us more insight about onunload handler behavior.
Flags: needinfo?(kchen)
Flags: needinfo?(wmccloskey)
This signature got a lot worse (from 7 crashes a day to 5000 crashes a day) on October 20th. 
Maybe we can find a regression range. Can the uptime team help?
Flags: needinfo?(n.nethercote)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #39)
> This signature got a lot worse (from 7 crashes a day to 5000 crashes a day)
> on October 20th. 
> Maybe we can find a regression range. Can the uptime team help?

Looking at the Windows builds at https://dbaron.org/mozilla/crashes-by-build...

- Oct 18: 119
- Oct 19: 172
- Oct 20: 1317
- Oct 21: 35 (*)
- Oct 22: 889

(*) Ignore this one because the internet was mostly broken that day due to the Dyn DDoS issues and very few users got an updated Nightly. This probably also means that the Oct 20 number is inflated.

Judging from this the number of daily crashes went from ~100 to ~1000. Still not good, but not quite as bad as 7 to 5000.
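
As a rough check of that estimate, the per-build counts listed above can be averaged before and after the jump, leaving out Oct 21 as suggested. This is only a sketch on the five numbers in this comment, and the grouping into "before" and "after" is an assumption made here.

```python
from statistics import mean

# Windows crash counts per Nightly build, copied from the list above.
crashes_by_build = {
    "Oct 18": 119,
    "Oct 19": 172,
    "Oct 20": 1317,
    "Oct 21": 35,   # excluded below: few users updated that day
    "Oct 22": 889,
}

before = mean(crashes_by_build[d] for d in ("Oct 18", "Oct 19"))
after = mean(crashes_by_build[d] for d in ("Oct 20", "Oct 22"))
ratio = after / before

if __name__ == "__main__":
    print(f"before: {before:.1f}, after: {after:.1f}, ratio: {ratio:.1f}x")
```

On these few data points the increase is somewhat under 10x, which is in line with the rounded "~100 to ~1000" reading and well short of "7 to 5000".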

I then used the regression window tool at https://dbaron.org/mozilla/crashes-by-build (click on the "Choose regression window" button, then select the Oct 19 and Oct 20 Windows builds) and got this regression range:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=90d8afaddf9150853b0b68b35b30c1e54a8683e7&tochange=99a239e1866a57f987b08dad796528e4ea30e622

A couple of wild guesses about changes that might have caused this:

* Valentin Gosu — Bug 1294719 - Make sure to check mIPCClosed before calling SendRedirect1Begin r=honzab 
* Aaron Klotz — Bug 1241921: Disable async plugin init regardless of pref; r=jimm 

Valentin, Aaron, could either of these changes have caused the number of crashes with this signature to have increased by ~10x?
If anyone else could look through the regression list, that would be helpful too.
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(n.nethercote)
Flags: needinfo?(aklotz)
(In reply to Nicholas Nethercote [:njn] from comment #40)
> * Aaron Klotz — Bug 1241921: Disable async plugin init regardless of pref;
> r=jimm 

My change could not have done that, as that feature was already disabled by pref. That patch simply hardcoded it.

That merge also included the enabling of a11y+e10s on Windows Vista and newer, but I took the liberty of checking the aggregations for that. Only 5% of Windows crashes since October 20 had a11y turned on.
Flags: needinfo?(aklotz)
Also if this change
68956648f506	Aaron Klotz — Bug 1241921: Remove CreateWindow* hooks from IPC glue; r=jimm
were to cause any problems, they would typically be a spike in stack overflow crashes, which is not what we're seeing here. I'm ruling that one out too.
Drop in crashes between 12th and 21st is possibly from Socorro updates.
From the crash stats report https://goo.gl/VRclPs, there is a big spike in content crashes on Oct 20. From the pushlog between Oct 19 and Oct 20 on the aurora branch - https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?startdate=2016-10-19&enddate=2016-10-20, a couple of fixes landed in aurora.

Just a guess, the fix of bug 1308397 & bug 1304449 might have something to do with this spike.

:aklotz, can you help take a look at this?
Flags: needinfo?(aklotz)
(In reply to Nicholas Nethercote [:njn] from comment #40)
> * Valentin Gosu — Bug 1294719 - Make sure to check mIPCClosed before calling
> SendRedirect1Begin r=honzab 
> could these changes have caused the number of
> crashes with this signature to have increased by ~10x?

It is possible that some of the crashes in bug 1294719 could have transformed to this one. It would be easy to find out by backing out that patch since it didn't fix the issue. However, based on comment 44 I understand the crash spike also affected aurora. That patch didn't get uplifted.
Flags: needinfo?(valentin.gosu)
This doesn't seem to be worse on beta50 than previous releases, and it's getting late in the cycle, wontfix.
(In reply to Gerry Chang [:gchang] from comment #44)
> From the crash stat report https://goo.gl/VRclPs, there is a big spike in
> content crashes on Oct 20. From the pushlog between Oct 19 and Oct 20 on
> aurora branch -
> https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?startdate=2016-10-
> 19&enddate=2016-10-20, there are coupe of fixes landed in aurora.
> 
> Just a guess, the fix of bug 1308397 & bug 1304449 might have something to
> do with this spike.
> 
> :aklotz, can you help take a look at this?

Once again, most of those crashes happen with a11y off. Also, if you isolate this signature to Mac OS X, you see a similar spike. I maintain that something else is causing this.
Flags: needinfo?(aklotz)
Also those two patches landed on Nightly on Oct 6 and 18, respectively.
There was a spike in both Nightly and Aurora between the Oct 19 and the Oct 20 builds.

The only bugs that landed on Oct 19 in both Nightly and Aurora are bug 1305993 and bug 1309198.
Flags: needinfo?(jonathan)
The period I listed is from:
https://crash-analysis.mozilla.com/rkaiser/crash-report-tools/longtermgraph/?fxaurora
which also shows a loss of plugin hangs for the same period.

You or someone with knowledge of Socorro would have to pinpoint the possible change. Just reading the commit messages, nothing sticks out to me (I don't know the code) as ending the loss; bug 1306449 possibly started it.
Flags: needinfo?(jonathan)
Do you know if something happened that would cause this crash to almost disappear between Oct 12 and Oct 21?
Flags: needinfo?(peterbe)
Flags: needinfo?(adrian)
(In reply to Marco Castelluccio [:marco] from comment #52)
> Do you know if something happened that would cause this crash to almost
> disappear between Oct 12 and Oct 21?

Most likely it's because of a bug in the processor. 
It was introduced in https://github.com/mozilla/socorro/commit/1e180913c464bdfc77602fc26a9cb8bea0c47712 (deployed on Oct 12)
And later fixed in https://github.com/mozilla/socorro/commit/c049e63f6bdd0e29acadbe326a68d9890fcd67f0 (deployed on Oct 21)

The original bug that found this was: https://bugzilla.mozilla.org/show_bug.cgi?id=1311697
Ludovic noticed it by simply trying to load a crash.
Here was a crash that just never succeeded in the processor. We debugged it and found the problem, and after deploying the fix that particular crash could be processed again.

We thought this was a unique snowflake because it didn't appear to impact the general numbers. (Or we just didn't look at the right range when comparing to previous days)

By the way, this was my first processor bug that I had to work on and to avoid this happening in the future we just landed Sentry error handling in the processor. Prior to that, the only way we could find out that certain crashes fail and drop on the floor was if people ssh into the processor nodes and look at the logs. 

Marco, do you want to start a reprocessing of this date range? If so, talk to me about doing it "gently" and I'll talk to JP about scaling up the infra.
Flags: needinfo?(peterbe)
Flags: needinfo?(adrian)
Are you able to reproduce this crash consistently? Do you have steps to reproduce?
Flags: needinfo?(condacum)
After installing FF50 (ok with previous builds), I have this silent crash every day, i.e. when I close firefox at the end of the day.
Flags: needinfo?(condacum)
It is a silent crash because I only notice the previous day's crash the day after, and so on. I never noticed it before FF50.
(In reply to Marco Castelluccio [:marco] from comment #58)
> Are you able to reproduce this crash consistently? Do you have steps to
> reproduce?

:marco You didn't ask me, but I continue to have several of these crashes each day. For example, Nightly updates itself about mid-morning for me. about:crashes shows me that every update brings a silent crash, e.g. https://crash-stats.mozilla.com/report/index/cf2a5f92-14da-4e1d-a052-f667a2161217.
I'm not sure if this is of any help, but I've managed to reproduce this crash pretty consistently with the latest m-c [1] using the following STR:

* install the latest version of m-c
* set "When nightly starts: Show my window and tabs from last time" under about:preferences#general
* open several tabs/websites in several different custom containers (http://imgur.com/a/zMh60)
* close/restart fx several times

I noticed that the crash started occurring when I tried opening m-c while it was still attempting to shut down. After that, I started getting the crash with every fx restart.

Crashes:

* bp-ff8ee990-5b23-43c3-a3e8-4e48e2161219
* bp-1146cec8-6ed7-42cf-9571-67f872161219
* bp-1384ea42-7d95-46f8-be8b-a3d5e2161219
* bp-e296be83-7fa1-4ede-afb0-455cb2161219

[1] fx53.0a1, buildid: 20161219030207, changeset: 863c2b61bd27
I've managed to narrow down the STR from comment#62 to the following:

* install the latest version of m-c
* open https://twitter.com/i/moments/810878086774456320
* let one of the random gifs start autoplaying and quickly shut down FX

You'll notice that the shutdown process will hang and it will take several seconds for fx to close correctly.

I've created a quick video of the STR that I used to reproduce the crash on a brand new m-c installation under macOS 10.12.2:
* https://youtu.be/SQ6x5iWVaqg
(In reply to Kamil Jozwiak [:kjozwiak] from comment #63)
> I've managed to narrow down the STR from comment#62 to the following:
> 
> * install the latest version of m-c
> * open https://twitter.com/i/moments/810878086774456320
> * let one of the random gifs start autoplaying and quickly shutdown FX
> 
> You'll notice that the shutdown process will hang and will take several
> seconds for fx to correctly close.
> 
> I've created a quick video of the STR that I used to reproduce the crash on
> a brand new m-c installation under macOS 10.12.2:
> * https://youtu.be/SQ6x5iWVaqg

Can you file a new bug for this (be sure to also attach a link to a crash report)?
(In reply to Marco Castelluccio [:marco] from comment #64) 
> Can you file a new bug for this (be sure to also attach a link to a crash
> report)?

Sure :) Created bug#1324820. Let me know if there's anything else that I can help out with.
Depends on: 1346161
Could you describe your environment with more details? For example how many tabs were opened when you were closing Firefox? Can you reproduce it with addons disabled? It will help us understand the issue and increase the possibility of getting this fixed. It will help if you have clear steps to reproduce the issue.

You can also attach the content from about:support page here or mail it to me directly.
Flags: needinfo?(condacum)
10 tabs. 
yes.


Application Basics
------------------

Name: Firefox
Version: 52.0
Build ID: 20170302120751
Update Channel: release
User Agent: Mozilla/5.0 (Windows NT 6.0; rv:52.0) Gecko/20100101 Firefox/52.0
OS: Windows_NT 6.0
Multiprocess Windows: 2/2 (Enabled by default)
Safe Mode: false

Crash Reports for the Last 3 Days
---------------------------------

All Crash Reports (including 3 pending crashes in the given time range)

Extensions
----------

Name: Adblock Plus
Version: 2.8.2
Enabled: true
ID: {d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}

Name: Application Update Service Helper
Version: 2.0
Enabled: true
ID: aushelper@mozilla.org

Name: Multi-process staged rollout
Version: 1.9
Enabled: true
ID: e10srollout@mozilla.org

Name: NoScript
Version: 5.0.1
Enabled: true
ID: {73a6fe31-595d-460b-a920-fcc0f8843232}

Name: Pocket
Version: 1.0.5
Enabled: true
ID: firefox@getpocket.com

Name: Web Compat
Version: 1.0
Enabled: true
ID: webcompat@mozilla.org

Graphics
--------

Features
Compositing: Direct3D 11
Asynchronous Pan/Zoom (APZ): wheel input enabled; touch input enabled
WebGL Renderer: Google Inc. -- ANGLE (NVIDIA GeForce GTX 650 Direct3D11 vs_5_0 ps_5_0)
WebGL2 Renderer: Google Inc. -- ANGLE (NVIDIA GeForce GTX 650 Direct3D11 vs_5_0 ps_5_0)
Hardware H264 Decoding: No; Hardware video decoding disabled or blacklisted
Audio Backend: wasapi
DirectWrite: false (7.0.6002.24017)
GPU #1
Active: Yes
Description: NVIDIA GeForce GTX 650
Vendor ID: 0x10de
Device ID: 0x0fc6
Driver Version: 9.18.13.4788
Driver Date: 3-13-2015
Drivers: nvd3dum,nvwgf2um,nvwgf2um
Subsys ID: 00000000
RAM: 2048

Diagnostics
AzureCanvasAccelerated: 0
AzureCanvasBackend: skia
AzureContentBackend: skia
AzureFallbackCanvasBackend: cairo
Decision Log
D3D9_COMPOSITING:
disabled by default: Disabled by default
DIRECT2D:
unavailable by runtime: Failed to acquire a Direct2D 1.1 factory

Important Modified Preferences
------------------------------

accessibility.typeaheadfind.flashBar: 0
browser.cache.disk.capacity: 358400
browser.cache.disk.filesystem_reported: 1
browser.cache.disk.smart_size_cached_value: 358400
browser.cache.disk.smart_size.first_run: false
browser.cache.disk.smart_size.use_old_max: false
browser.cache.frecency_experiment: 1
browser.display.focus_ring_on_anything: true
browser.download.folderList: 0
browser.download.importedFromSqlite: true
browser.download.manager.alertOnEXEOpen: true
browser.download.manager.closeWhenDone: true
browser.download.manager.quitBehavior: 2
browser.download.manager.retention: 0
browser.download.manager.scanWhenDone: false
browser.link.open_newwindow: 2
browser.places.importBookmarksHTML: false
browser.places.importDefaults: false
browser.places.leftPaneFolderId: -1
browser.places.migratePostDataAnnotations: false
browser.places.smartBookmarksVersion: 8
browser.places.updateRecentTagsUri: false
browser.search.suggest.enabled: false
browser.search.update: false
browser.sessionstore.max_tabs_undo: 45
browser.sessionstore.restore_on_demand: false
browser.sessionstore.upgradeBackup.latestBuildID: 20170302120751
browser.startup.homepage: https://startpage.com/
browser.startup.homepage_override.buildID: 20170302120751
browser.startup.homepage_override.mstone: 52.0
browser.tabs.drawInTitlebar: false
browser.tabs.onTop: false
browser.tabs.remote.autostart.2: true
browser.urlbar.formatting.enabled: false
browser.urlbar.trimURLs: false
dom.apps.lastUpdate.buildID: 20161019084923
dom.apps.lastUpdate.mstone: 49.0.2
dom.apps.reset-permissions: true
dom.disable_window_move_resize: true
dom.event.clipboardevents.enabled: false
dom.ipc.plugins.timeoutSecs: -1
dom.max_chrome_script_run_time: 40
dom.max_script_run_time: 40
dom.mozApps.used: true
dom.w3c_touch_events.expose: false
extensions.lastAppVersion: 52.0
font.internaluseonly.changed: true
gfx.crash-guard.d3d11layers.appVersion: 52.0
gfx.crash-guard.d3d11layers.deviceID: 0x0fc6
gfx.crash-guard.d3d11layers.driverVersion: 9.18.13.4788
gfx.crash-guard.d3d11layers.feature-d2d: true
gfx.crash-guard.d3d11layers.feature-d3d11: true
gfx.crash-guard.glcontext.gfx.driver-init.direct3d11-angle: true
gfx.crash-guard.glcontext.gfx.driver-init.webgl-angle: true
gfx.crash-guard.glcontext.gfx.driver-init.webgl-angle-force-d3d11: false
gfx.crash-guard.glcontext.gfx.driver-init.webgl-angle-force-warp: false
gfx.crash-guard.glcontext.gfx.driver-init.webgl-angle-try-d3d11: true
gfx.crash-guard.status.d3d11layers: 2
gfx.crash-guard.status.d3d9video: 2
gfx.crash-guard.status.glcontext: 2
gfx.direct3d.checkDX10: true
gfx.direct3d.last_used_feature_level_idx: 0
gfx.direct3d.prefer_10_1: true
gfx.driver-init.appVersion: 42.0
gfx.driver-init.deviceID: 0x0fc6
gfx.driver-init.driverVersion: 9.18.13.4788
gfx.driver-init.feature-d2d: true
gfx.driver-init.feature-d3d11: true
gfx.driver-init.status: 2
layers.offmainthreadcomposition.enabled: false
media.encoder.webm.enabled: false
media.gmp-eme-adobe.abi: x86-msvc-x86
media.gmp-eme-adobe.enabled: false
media.gmp-eme-adobe.lastUpdate: 1461675179
media.gmp-eme-adobe.version: 17
media.gmp-gmpopenh264.abi: x86-msvc-x86
media.gmp-gmpopenh264.lastUpdate: 1473678394
media.gmp-gmpopenh264.version: 1.6
media.gmp-manager.buildID: 20170302120751
media.gmp-manager.lastCheck: 1489402453
media.gmp-widevinecdm.abi: x86-msvc-x86
media.gmp-widevinecdm.enabled: false
media.gmp-widevinecdm.lastUpdate: 1474362385
media.gmp-widevinecdm.version: 1.4.8.903
media.gmp.storage.version.observed: 1
media.hardware-video-decoding.failed: true
media.mp4.enabled: false
media.webm.enabled: false
media.webrtc.debug.aec_log_dir: C:\Users\admin\AppData\Local\Temp
media.webrtc.debug.log_file: C:\Users\admin\AppData\Local\Temp\WebRTC.log
network.cookie.cookieBehavior: 1
network.cookie.lifetimePolicy: 2
network.cookie.prefsMigrated: true
network.dns.disableIPv6: true
network.predictor.cleaned-up: true
places.database.lastMaintenance: 1619559160
places.history.expiration.transient_current_max_pages: 143081
plugin.disable_full_page_plugin_for_types: audio/mpeg,audio/x-mpeg,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
plugin.importedState: true
plugin.state.java: 0
plugin.state.np-mswmp: 2
plugin.state.npctrl: 2
plugins.click_to_play: false
print.print_printer: HP Color LaserJet 2840 series PCL 6
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_bgcolor: false
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_bgimages: false
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_command:
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_downloadfonts: false
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_edge_bottom: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_edge_left: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_edge_right: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_edge_top: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_evenpages: true
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_footercenter:
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_footerleft: &PT
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_footerright: &D
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_headercenter:
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_headerleft: &T
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_headerright: &U
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_in_color: true
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_margin_bottom: 0.5
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_margin_left: 0.5
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_margin_right: 0.5
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_margin_top: 0.5
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_oddpages: true
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_orientation: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_pagedelay: 500
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_paper_data: 9
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_paper_height: 11,00
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_paper_size_type: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_paper_size_unit: 1
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_paper_width: 8,50
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_reversed: false
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_scaling: 1,00
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_shrink_to_fit: true
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_to_file: false
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_unwriteable_margin_bottom: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_unwriteable_margin_left: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_unwriteable_margin_right: 0
print.printer_HP_Color_LaserJet_2840_series_PCL_6.print_unwriteable_margin_top: 0
print.printer_HP_Officejet_J6400_series.print_bgcolor: false
print.printer_HP_Officejet_J6400_series.print_bgimages: false
print.printer_HP_Officejet_J6400_series.print_colorspace:
print.printer_HP_Officejet_J6400_series.print_command:
print.printer_HP_Officejet_J6400_series.print_downloadfonts: false
print.printer_HP_Officejet_J6400_series.print_duplex: 0
print.printer_HP_Officejet_J6400_series.print_edge_bottom: 0
print.printer_HP_Officejet_J6400_series.print_edge_left: 0
print.printer_HP_Officejet_J6400_series.print_edge_right: 0
print.printer_HP_Officejet_J6400_series.print_edge_top: 0
print.printer_HP_Officejet_J6400_series.print_evenpages: true
print.printer_HP_Officejet_J6400_series.print_footercenter:
print.printer_HP_Officejet_J6400_series.print_footerleft: &PT
print.printer_HP_Officejet_J6400_series.print_footerright: &D
print.printer_HP_Officejet_J6400_series.print_headercenter:
print.printer_HP_Officejet_J6400_series.print_headerleft: &T
print.printer_HP_Officejet_J6400_series.print_headerright: &U
print.printer_HP_Officejet_J6400_series.print_in_color: true
print.printer_HP_Officejet_J6400_series.print_margin_bottom: 0.5
print.printer_HP_Officejet_J6400_series.print_margin_left: 0.5
print.printer_HP_Officejet_J6400_series.print_margin_right: 0.5
print.printer_HP_Officejet_J6400_series.print_margin_top: 0.5
print.printer_HP_Officejet_J6400_series.print_oddpages: true
print.printer_HP_Officejet_J6400_series.print_orientation: 0
print.printer_HP_Officejet_J6400_series.print_page_delay: 50
print.printer_HP_Officejet_J6400_series.print_pagedelay: 500
print.printer_HP_Officejet_J6400_series.print_paper_data: 9
print.printer_HP_Officejet_J6400_series.print_paper_height: 11,00
print.printer_HP_Officejet_J6400_series.print_paper_name:
print.printer_HP_Officejet_J6400_series.print_paper_size_type: 0
print.printer_HP_Officejet_J6400_series.print_paper_size_unit: 1
print.printer_HP_Officejet_J6400_series.print_paper_width: 8,50
print.printer_HP_Officejet_J6400_series.print_plex_name:
print.printer_HP_Officejet_J6400_series.print_resolution: 0
print.printer_HP_Officejet_J6400_series.print_resolution_name:
print.printer_HP_Officejet_J6400_series.print_reversed: false
print.printer_HP_Officejet_J6400_series.print_scaling: 1,00
print.printer_HP_Officejet_J6400_series.print_shrink_to_fit: true
print.printer_HP_Officejet_J6400_series.print_to_file: false
print.printer_HP_Officejet_J6400_series.print_unwriteable_margin_bottom: 0
print.printer_HP_Officejet_J6400_series.print_unwriteable_margin_left: 0
print.printer_HP_Officejet_J6400_series.print_unwriteable_margin_right: 0
print.printer_HP_Officejet_J6400_series.print_unwriteable_margin_top: 0
privacy.clearOnShutdown.downloads: false
privacy.clearOnShutdown.formdata: false
privacy.clearOnShutdown.history: false
privacy.clearOnShutdown.sessions: false
privacy.cpd.downloads: false
privacy.cpd.formdata: false
privacy.cpd.history: false
privacy.cpd.sessions: false
privacy.donottrackheader.enabled: true
privacy.item.cookies: true
privacy.item.formdata: false
privacy.item.history: false
privacy.item.sessions: false
privacy.sanitize.migrateClearSavedPwdsOnExit: true
privacy.sanitize.migrateFx3Prefs: true
privacy.sanitize.promptOnSanitize: false
privacy.sanitize.sanitizeOnShutdown: true
privacy.trackingprotection.enabled: true
privacy.trackingprotection.introCount: 20
security.disable_button.openCertManager: false
security.sandbox.content.tempDirSuffix: {d719d5e2-73fd-4924-be9b-300e85bd5139}
security.ssl.errorReporting.automatic: true
security.warn_entering_secure: true
security.warn_leaving_secure: true
security.warn_viewing_mixed: false
services.sync.declinedEngines:
storage.vacuum.last.index: 1
storage.vacuum.last.places.sqlite: 1619559160

Important Locked Preferences
----------------------------

Places Database
---------------

JavaScript
----------

Incremental GC: true

Accessibility
-------------

Activated: false
Prevent Accessibility: 0

Library Versions
----------------

NSPR
Expected minimum version: 4.13.1
Version in use: 4.13.1

NSS
Expected minimum version: 3.28.3
Version in use: 3.28.3

NSSSMIME
Expected minimum version: 3.28.3
Version in use: 3.28.3

NSSSSL
Expected minimum version: 3.28.3
Version in use: 3.28.3

NSSUTIL
Expected minimum version: 3.28.3
Version in use: 3.28.3

Experimental Features
---------------------

Sandbox
-------

Content Process Sandbox Level: 1
Flags: needinfo?(condacum)
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #72)
> Could you describe your environment with more details? For example how many
> tabs were opened when you were closing Firefox? Can you reproduce it with
> addons disabled? It will help us understand the issue and increase the
> possibility of getting this fixed. It will help if you have clear steps to
> reproduce the issue.
> 
> You can also attach the content from about:support page here or mail it to
> me directly.

kanru: You didn't ask input of me, but I'll add it anyway -- as I've seen this crash daily for more than a year, 3 - 4 times per day. They happen when Firefox updates itself (like clockwork!) and often when I close Nightly. I have maybe 4 - 20 tabs active at any one time, have made fresh profiles, etc. None of that affects whether I crash. Additionally, these crashes are (for me) unannounced; I know of them only because of about:crashes.

STR? Simple: let Nightly update itself. It seems that simply shutting down Nightly also provokes a crash.


Application Basics
------------------

Name: Firefox
Version: 55.0a1
Build ID: 20170312030213
Update Channel: nightly
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0
OS: Darwin 16.4.0
Multiprocess Windows: 1/1 (Enabled by default)
Google Key: Found
Mozilla Location Service Key: Found
Safe Mode: false

Crash Reports for the Last 3 Days
---------------------------------

Report ID: bp-ab705c52-0c6e-4322-861c-fe6eb2170312
Submitted: 17 hours ago

Report ID: bp-46001874-600f-4af6-8e09-125d52170312
Submitted: 17 hours ago

Report ID: bp-a477cbbd-f319-4803-a9f3-a3ae72170312
Submitted: 17 hours ago

Report ID: bp-ff05ff0e-6905-4c62-a26e-fb1382170312
Submitted: 17 hours ago

Report ID: bp-7a4d0265-c026-4c4c-b59a-dabbb2170311
Submitted: 2 days ago

Report ID: bp-396b9dce-dc18-4a0c-b9dc-ab1a92170311
Submitted: 2 days ago

Report ID: bp-a1e9c26c-cb50-4d9c-b20b-b2ca72170311
Submitted: 2 days ago

Report ID: bp-f75a2339-2a8c-4005-929b-459f52170311
Submitted: 2 days ago

Report ID: bp-0c37bf8e-1c78-4502-9107-aba7d2170310
Submitted: 3 days ago

Report ID: bp-820c610c-1fc5-4584-be24-3ddfa2170310
Submitted: 3 days ago

Report ID: bp-1fe8820b-5490-4cd5-b657-fcddd2170310
Submitted: 3 days ago

Report ID: bp-cfcfc73e-ba96-4832-93c1-97dc22170310
Submitted: 3 days ago

Report ID: bp-9e279adc-5a2f-4994-900a-f3de52170310
Submitted: 3 days ago

Report ID: bp-30d3d5b7-0c05-4cd7-b782-850c62170310
Submitted: 3 days ago

Report ID: bp-653ad142-ade9-4602-99ff-98bc12170310
Submitted: 3 days ago

Report ID: bp-746f997e-f51f-4f57-af23-fa0af2170310
Submitted: 3 days ago

All Crash Reports

Extensions
----------

Name: About Sync extension for Firefox
Version: 0.0.8
Enabled: true
ID: aboutsync@mhammond.github.com

Name: Activity Stream
Version: 0.0.0
Enabled: true
ID: activity-stream@mozilla.org

Name: Activity Stream
Version: 1.6.0
Enabled: true
ID: @activity-streams

Name: Adobe Acrobat DC - Create PDF
Version: 15.01.02
Enabled: true
ID: web2pdfextension@web2pdf.adobedotcom

Name: All Tabs Helper
Version: 0.2.34
Enabled: true
ID: alltabshelper@alltabshelper.org

Name: Application Update Service Helper
Version: 2.0
Enabled: true
ID: aushelper@mozilla.org

Name: FireQuery
Version: 2.0.4
Enabled: true
ID: firequery@binaryage.com

Name: FlyWeb
Version: 1.0.0
Enabled: true
ID: flyweb@mozilla.org

Name: Form Autofill
Version: 1.0
Enabled: true
ID: formautofill@mozilla.org

Name: Multi-process staged rollout
Version: 1.11
Enabled: true
ID: e10srollout@mozilla.org

Name: Nightly Tester Tools
Version: 3.9
Enabled: true
ID: {8620c15f-30dc-4dba-a131-7c5d20cf4a29}

Name: Open Tabs Next to Current
Version: 1.1.5
Enabled: true
ID: opentabsnexttocurrent@sblask

Name: Places Maintenance
Version: 2.0.2
Enabled: true
ID: places-maintenance@bonardo.net

Name: Pocket
Version: 1.0.5
Enabled: true
ID: firefox@getpocket.com

Name: Presentation
Version: 1.0.0
Enabled: true
ID: presentation@mozilla.org

Name: Send Tab to Device
Version: 0.7.1-signed.1-signed
Enabled: true
ID: jid1-mdjmA7if6lo8lA@jetpack

Name: Session Manager
Version: 0.8.1.13
Enabled: true
ID: {1280606b-2510-4fe0-97ef-9b5a22eafe30}

Name: Shield Recipe Client
Version: 1.0.0
Enabled: true
ID: shield-recipe-client@mozilla.org

Name: snoozetabs
Version: 1.0.14
Enabled: true
ID: snoozetabs@mozilla.com

Name: Tab Groups
Version: 2.1.4
Enabled: true
ID: tabgroups@quicksaver

Name: Test Pilot
Version: 1.1.1-dev-97fd716
Enabled: true
ID: @testpilot-addon

Name: Web Compat
Version: 1.1
Enabled: true
ID: webcompat@mozilla.org

Name: WebCompat Reporter
Version: 1.0.0
Enabled: true
ID: webcompat-reporter@mozilla.org

Name: LiveReload
Version: 2.1.1
Enabled: false
ID: livereload@livereload.com

Graphics
--------

Features
Compositing: OpenGL
Asynchronous Pan/Zoom: wheel input enabled; scrollbar drag enabled
WebGL 1 Driver WSI Info: CGL
WebGL 1 Driver Renderer: ATI Technologies Inc. -- AMD Radeon Pro 460 OpenGL Engine
WebGL 1 Driver Version: 2.1 ATI-1.48.21
WebGL 1 Driver Extensions: GL_ARB_color_buffer_float GL_ARB_depth_buffer_float GL_ARB_depth_clamp GL_ARB_depth_texture GL_ARB_draw_buffers GL_ARB_draw_elements_base_vertex GL_ARB_draw_instanced GL_ARB_fragment_program GL_ARB_fragment_program_shadow GL_ARB_fragment_shader GL_ARB_framebuffer_object GL_ARB_framebuffer_sRGB GL_ARB_half_float_pixel GL_ARB_half_float_vertex GL_ARB_imaging GL_ARB_instanced_arrays GL_ARB_multisample GL_ARB_multitexture GL_ARB_occlusion_query GL_ARB_pixel_buffer_object GL_ARB_point_parameters GL_ARB_point_sprite GL_ARB_provoking_vertex GL_ARB_seamless_cube_map GL_ARB_shader_objects GL_ARB_shader_texture_lod GL_ARB_shading_language_100 GL_ARB_shadow GL_ARB_shadow_ambient GL_ARB_sync GL_ARB_texture_border_clamp GL_ARB_texture_compression GL_ARB_texture_compression_rgtc GL_ARB_texture_cube_map GL_ARB_texture_env_add GL_ARB_texture_env_combine GL_ARB_texture_env_crossbar GL_ARB_texture_env_dot3 GL_ARB_texture_float GL_ARB_texture_mirrored_repeat GL_ARB_texture_non_power_of_two GL_ARB_texture_rectangle GL_ARB_texture_rg GL_ARB_transpose_matrix GL_ARB_vertex_array_bgra GL_ARB_vertex_blend GL_ARB_vertex_buffer_object GL_ARB_vertex_program GL_ARB_vertex_shader GL_ARB_window_pos GL_EXT_abgr GL_EXT_bgra GL_EXT_bindable_uniform GL_EXT_blend_color GL_EXT_blend_equation_separate GL_EXT_blend_func_separate GL_EXT_blend_minmax GL_EXT_blend_subtract GL_EXT_clip_volume_hint GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_depth_bounds_test GL_EXT_draw_buffers2 GL_EXT_draw_range_elements GL_EXT_fog_coord GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample GL_EXT_framebuffer_object GL_EXT_framebuffer_sRGB GL_EXT_geometry_shader4 GL_EXT_gpu_program_parameters GL_EXT_gpu_shader4 GL_EXT_multi_draw_arrays GL_EXT_packed_depth_stencil GL_EXT_packed_float GL_EXT_provoking_vertex GL_EXT_rescale_normal GL_EXT_secondary_color GL_EXT_separate_specular_color GL_EXT_shadow_funcs GL_EXT_stencil_two_side GL_EXT_stencil_wrap GL_EXT_texture_array 
GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc GL_EXT_texture_env_add GL_EXT_texture_filter_anisotropic GL_EXT_texture_integer GL_EXT_texture_lod_bias GL_EXT_texture_mirror_clamp GL_EXT_texture_rectangle GL_EXT_texture_shared_exponent GL_EXT_texture_sRGB GL_EXT_texture_sRGB_decode GL_EXT_timer_query GL_EXT_transform_feedback GL_EXT_vertex_array_bgra GL_APPLE_aux_depth_stencil GL_APPLE_client_storage GL_APPLE_element_array GL_APPLE_fence GL_APPLE_float_pixels GL_APPLE_flush_buffer_range GL_APPLE_flush_render GL_APPLE_object_purgeable GL_APPLE_packed_pixels GL_APPLE_pixel_buffer GL_APPLE_rgb_422 GL_APPLE_row_bytes GL_APPLE_specular_vector GL_APPLE_texture_range GL_APPLE_transform_hint GL_APPLE_vertex_array_object GL_APPLE_vertex_array_range GL_APPLE_vertex_point_size GL_APPLE_vertex_program_evaluators GL_APPLE_ycbcr_422 GL_ATI_blend_equation_separate GL_ATI_blend_weighted_minmax GL_ATI_separate_stencil GL_ATI_texture_compression_3dc GL_ATI_texture_env_combine3 GL_ATI_texture_float GL_ATI_texture_mirror_once GL_IBM_rasterpos_clip GL_NV_blend_square GL_NV_conditional_render GL_NV_depth_clamp GL_NV_fog_distance GL_NV_light_max_exponent GL_NV_texgen_reflection GL_NV_texture_barrier GL_SGI_color_matrix GL_SGIS_generate_mipmap GL_SGIS_texture_edge_clamp GL_SGIS_texture_lod
WebGL 1 Extensions: ANGLE_instanced_arrays EXT_blend_minmax EXT_color_buffer_half_float EXT_frag_depth EXT_sRGB EXT_shader_texture_lod EXT_texture_filter_anisotropic MOZ_debug_get OES_element_index_uint OES_standard_derivatives OES_texture_float OES_texture_float_linear OES_texture_half_float OES_texture_half_float_linear OES_vertex_array_object WEBGL_color_buffer_float WEBGL_compressed_texture_s3tc WEBGL_debug_renderer_info WEBGL_debug_shaders WEBGL_depth_texture WEBGL_draw_buffers WEBGL_lose_context MOZ_WEBGL_lose_context MOZ_WEBGL_compressed_texture_s3tc MOZ_WEBGL_depth_texture
WebGL 2 Driver WSI Info: CGL
WebGL 2 Driver Renderer: ATI Technologies Inc. -- AMD Radeon Pro 460 OpenGL Engine
WebGL 2 Driver Version: 4.1 ATI-1.48.21
WebGL 2 Driver Extensions: GL_ARB_blend_func_extended GL_ARB_draw_buffers_blend GL_ARB_draw_indirect GL_ARB_ES2_compatibility GL_ARB_explicit_attrib_location GL_ARB_gpu_shader_fp64 GL_ARB_gpu_shader5 GL_ARB_instanced_arrays GL_ARB_internalformat_query GL_ARB_occlusion_query2 GL_ARB_sample_shading GL_ARB_sampler_objects GL_ARB_separate_shader_objects GL_ARB_shader_bit_encoding GL_ARB_shader_subroutine GL_ARB_shading_language_include GL_ARB_tessellation_shader GL_ARB_texture_buffer_object_rgb32 GL_ARB_texture_cube_map_array GL_ARB_texture_gather GL_ARB_texture_query_lod GL_ARB_texture_rgb10_a2ui GL_ARB_texture_storage GL_ARB_texture_swizzle GL_ARB_timer_query GL_ARB_transform_feedback2 GL_ARB_transform_feedback3 GL_ARB_vertex_attrib_64bit GL_ARB_vertex_type_2_10_10_10_rev GL_ARB_viewport_array GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_depth_bounds_test GL_EXT_texture_compression_s3tc GL_EXT_texture_filter_anisotropic GL_EXT_texture_mirror_clamp GL_EXT_texture_sRGB_decode GL_APPLE_client_storage GL_APPLE_container_object_shareable GL_APPLE_flush_render GL_APPLE_object_purgeable GL_APPLE_rgb_422 GL_APPLE_row_bytes GL_APPLE_texture_range GL_ATI_texture_mirror_once GL_NV_texture_barrier
WebGL 2 Extensions: EXT_color_buffer_float EXT_texture_filter_anisotropic EXT_disjoint_timer_query MOZ_debug_get OES_texture_float_linear WEBGL_compressed_texture_s3tc WEBGL_debug_renderer_info WEBGL_debug_shaders WEBGL_lose_context MOZ_WEBGL_lose_context MOZ_WEBGL_compressed_texture_s3tc
Audio Backend: audiounit
GPU #1
Active: Yes
Vendor ID: 0x1002
Device ID: 0x67ef

Diagnostics
AzureCanvasAccelerated: 1
AzureCanvasBackend: skia
AzureContentBackend: skia
AzureFallbackCanvasBackend: none
TileHeight: 1024
TileWidth: 1024
Decision Log
WEBRENDER:
unavailable by runtime: Build doesn't include WebRender




Important Modified Preferences
------------------------------

accessibility.typeaheadfind.flashBar: 0
browser.cache.disk.capacity: 358400
browser.cache.disk.filesystem_reported: 1
browser.cache.disk.hashstats_reported: 1
browser.cache.disk.smart_size.first_run: false
browser.cache.disk.smart_size.use_old_max: false
browser.cache.frecency_experiment: 2
browser.download.importedFromSqlite: true
browser.places.smartBookmarksVersion: 8
browser.sessionstore.upgradeBackup.latestBuildID: 20170312030213
browser.startup.homepage: resource://activity-streams/data/content/activity-streams.html#/
browser.startup.homepage_override.buildID: 20170312030213
browser.startup.homepage_override.mstone: 55.0a1
browser.tabs.crashReporting.includeURL: true
browser.tabs.drawInTitlebar: false
browser.urlbar.maxRichResults: 12
browser.urlbar.suggest.searches: true
browser.urlbar.usepreloadedtopurls.enabled: false
dom.gamepad.extensions.enabled: true
dom.ipc.processCount: 4
dom.push.userAgentID: f5444e6008f44158ad9e57f428476069
extensions.lastAppVersion: 55.0a1
font.internaluseonly.changed: true
media.benchmark.vp9.fps: 204
media.benchmark.vp9.versioncheck: 2
media.gmp-gmpopenh264.abi: x86_64-gcc3
media.gmp-gmpopenh264.lastUpdate: 1481375856
media.gmp-gmpopenh264.version: 1.6
media.gmp-manager.buildID: 20170312030213
media.gmp-manager.lastCheck: 1489342030
media.gmp-widevinecdm.abi: x86_64-gcc3
media.gmp-widevinecdm.lastUpdate: 1481375858
media.gmp-widevinecdm.version: 1.4.8.903
media.gmp.storage.version.observed: 1
network.cookie.prefsMigrated: true
network.predictor.cleaned-up: true
places.database.lastMaintenance: 1489171220
places.favicons.optimizeToDimension: 64
places.history.expiration.transient_current_max_pages: 87071
plugin.disable_full_page_plugin_for_types: application/pdf
print.printer__.print_bgcolor: false
print.printer__.print_bgimages: false
print.printer__.print_duplex: -437918235
print.printer__.print_edge_bottom: 0
print.printer__.print_edge_left: 0
print.printer__.print_edge_right: 0
print.printer__.print_edge_top: 0
print.printer__.print_evenpages: true
print.printer__.print_footercenter:
print.printer__.print_footerleft: &PT
print.printer__.print_footerright: &D
print.printer__.print_headercenter:
print.printer__.print_headerleft: &T
print.printer__.print_headerright: &U
print.printer__.print_in_color: true
print.printer__.print_margin_bottom: 0.5
print.printer__.print_margin_left: 0.5
print.printer__.print_margin_right: 0.5
print.printer__.print_margin_top: 0.5
print.printer__.print_oddpages: true
print.printer__.print_orientation: 0
print.printer__.print_page_delay: 50
print.printer__.print_paper_data: 0
print.printer__.print_paper_height: 11.00
print.printer__.print_paper_name: na-letter
print.printer__.print_paper_size_unit: 0
print.printer__.print_paper_width: 8.50
print.printer__.print_resolution: -437918235
print.printer__.print_reversed: false
print.printer__.print_scaling: 1.00
print.printer__.print_shrink_to_fit: true
print.printer__.print_to_file: false
print.printer__.print_unwriteable_margin_bottom: 56
print.printer__.print_unwriteable_margin_left: 25
print.printer__.print_unwriteable_margin_right: 25
print.printer__.print_unwriteable_margin_top: 25
privacy.cpd.extensions-sessionmanager: false
privacy.cpd.formdata: false
privacy.cpd.offlineApps: true
privacy.popups.showBrowserMessage: false
privacy.sanitize.timeSpan: 0
security.sandbox.content.tempDirSuffix: {7d405653-de88-074c-9cc2-b4e6c9f9fd59}
services.sync.declinedEngines:
services.sync.engine.prefs.modified: false
services.sync.lastPing: 1489341901
services.sync.lastSync: Mon Mar 13 2017 07:28:37 GMT-0400 (EDT)
services.sync.numClients: 4
social.enabledByActivityStream: true
storage.vacuum.last.index: 1
storage.vacuum.last.places.sqlite: 1488402289

Important Locked Preferences
----------------------------

Places Database
---------------

JavaScript
----------

Incremental GC: true

Accessibility
-------------

Activated: false
Prevent Accessibility: 0

Library Versions
----------------

NSPR
Expected minimum version: 4.13.1
Version in use: 4.13.1

NSS
Expected minimum version: 3.30 Beta
Version in use: 3.30 Beta

NSSSMIME
Expected minimum version: 3.30 Beta
Version in use: 3.30 Beta

NSSSSL
Expected minimum version: 3.30 Beta
Version in use: 3.30 Beta

NSSUTIL
Expected minimum version: 3.30 Beta
Version in use: 3.30 Beta

Experimental Features
---------------------

Sandbox
-------

Content Process Sandbox Level: 2
Frank, can you confirm that Virtual_ManPL [:Virtual] - (ni? me) has just fixed this bug?
I ask because I just received a very strange notification by email.
https://bugzilla.mozilla.org/show_bug.cgi?id=1331929
https://bugzilla.mozilla.org/show_bug.cgi?id=1311297
Flags: needinfo?(burleigh)
(In reply to Yorgos from comment #75)
> Frank, can you confirm that Virtual_ManPL [:Virtual] - (ni? me) has just
> fixed this bug?
> I ask because I just received a very strange notification by email.
> https://bugzilla.mozilla.org/show_bug.cgi?id=1331929
> https://bugzilla.mozilla.org/show_bug.cgi?id=1311297

Yorgos: I would stay tuned to this bug, and ignore changes to the two bugs you list. It's likely a Mozilla process is simply closing related, inactive, unhelpful or duplicate bugs with no action. Problems like this that many people have generate a lot of noise, so now and then that noise has to be cleared away so it doesn't distract from real work. IMHO, of course.
Flags: needinfo?(burleigh)
It's exactly as Frank Burleigh said.
There is no need to create many duplicate bugs about the same issue, as they pollute related bugs in crashlog reports, making it hard to quickly find anything in all that useless spam.

As for the bugs I marked as INCOMPLETE: they have no STR from the OP.

Reproducible bugs are tracked in blocked bug #1219672.


tl;dr - I'm cleaning.



(In reply to Yorgos from comment #75)
> fixed
more like, I marked them as INCOMPLETE
I pasted the link here because the crash report showed the bug as "NEW", not as a duplicate ;) Otherwise I know that I don't need to post in a duplicate thread!
Virtual: I'm on macOS (OS X), so the platform should perhaps be adjusted.
Don't worry, I want all these bugs fixed just as you do. That's why I'm doing "a little cleaning" (I hope I didn't make any mistakes), to help Mozilla developers (or at least I think I'm helping) get on track faster without having to read duplicates, etc.
Mass wontfix for bugs affecting firefox 52.
Keywords: topcrash
Version: 48 Branch → 38 Branch
Signature report for IPCError-browser | ShutDownKill

Showing results from 7 days ago 

Operating System
Windows 7 	46211 	46.5%
Windows 10 	29388 	29.6%
Windows 8.1 	11219 	11.3%
        	3429 	3.5%
OS X 10.12 	2899 	2.9%
Windows 8 	2313 	2.3%
OS X 10.11 	1097 	1.1%
Linux   	1011 	1.0%

Product
Firefox 	54.0a2 	37251 	37.5% 	9618
Firefox 	55.0a1 	24892 	25.1% 	10359
Firefox 	54.0b4 	12112 	12.2% 	4892
Firefox 	54.0b3 	4570 	4.6% 	3417
Firefox 	54.0b2 	2587 	2.6% 	1752
Firefox 	53.0b99 	2576 	2.6% 	807
Firefox 	54.0b1 	2139 	2.2% 	1664
Firefox 	54.0a1 	1082 	1.1% 	90
Firefox 	53.0b9 	1003 	1.0% 	769

Process Type
content 	99347 	100.0%

Uptime Range
> 1 hour 	46403 	46.7%
< 1 min 	26264 	26.4%
15-60 min 	12102 	12.2%
1-5 min 	8937 	9.0%
5-15 min 	5645 	5.7%

Architecture
x86 	61358 	61.8%
amd64 	34564 	34.8%
	3429 	3.5%

Flash Version
[blank] 	95922 	96.5%
        	3429 	3.5%
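
As a cross-check on the percentages in the report above, here is a minimal Python sketch that recomputes each operating system's share. It assumes the 99347 content-process reports from the "Process Type" row are the denominator. The names `os_counts`, `TOTAL_REPORTS` and `share`, and the "(blank)" label for the unnamed row, are made up for illustration and are not part of crash-stats.

```python
# Counts copied from the "Operating System" table above. The unnamed row is
# kept under a placeholder label because the report leaves it blank.
os_counts = {
    "Windows 7": 46211,
    "Windows 10": 29388,
    "Windows 8.1": 11219,
    "(blank)": 3429,
    "OS X 10.12": 2899,
    "Windows 8": 2313,
    "OS X 10.11": 1097,
    "Linux": 1011,
}

# Assumption: the total is the "content" count from the "Process Type" row.
TOTAL_REPORTS = 99347


def share(count, total=TOTAL_REPORTS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


shares = {name: share(count) for name, count in os_counts.items()}

for name, pct in shares.items():
    print(f"{name:<12} {os_counts[name]:>6} {pct:>5.1f}%")
```

The listed rows add up to fewer than 99347 reports, so the table presumably omits some less common operating systems; the sketch does not try to account for them.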
Yeah, it's going down! :-) 80386 crashes in the last 7 days.

If this is also caused by add-ons that are not multiprocess compatible, what about deactivating such add-ons in the Developer Edition, too? Developers should be able to set extensions.allow-non-mpc-extensions=true in about:config as well.

Also: Is it possible to activate this signature in https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=55.0a1&days=7 again, to have a better eye on it?
I still see a clear relation between high FF memory usage after long use of Gmail, Twitter or Facebook and the ShutDownKills that appear on the subsequent restart/shutdown. I guess fixing those ShutDownKill crashes will reduce the amount of memory FF uses over time and reduce the need to restart FF during work.

Here are some sample crash reports of mine from the last few days for analysis, produced with FF 55.0a1, 64-bit, e10s, on Win7, with different threads:
https://crash-stats.mozilla.com/report/index/b232d572-a9a3-45b6-9662-7b9fa0170602
https://crash-stats.mozilla.com/report/index/d9c58c40-fea7-4791-bc8a-b9d930170602
https://crash-stats.mozilla.com/report/index/c1293857-e13c-405b-b682-0bd590170602
https://crash-stats.mozilla.com/report/index/fbc04e1b-03e6-419d-98d6-8e8610170601
https://crash-stats.mozilla.com/report/index/d0a2f888-28f9-4ce7-a4eb-9a9980170601
https://crash-stats.mozilla.com/report/index/12dc4eb5-1fd0-4508-aaf1-4887b0170601
https://crash-stats.mozilla.com/report/index/f683fb5e-4269-405a-949a-ded770170531
https://crash-stats.mozilla.com/report/index/33f8b46d-1035-422b-a2d2-4c6940170531
https://crash-stats.mozilla.com/report/index/3c87e99e-ed67-4140-9b98-3e9a60170531

(In reply to Tobias B. Besemer [:BesTo] (QA) from comment #88)
> If this is also caused by add-ons that are not multiprocess compatible,
> what about deactivating such add-ons in the Developer Edition, too?
> Developers should be able to set extensions.allow-non-mpc-extensions=true in
> about:config as well.
Since Aurora is not being used to ship an alpha2 to developers ATM (it is at FF54.0a2 while beta is FF54.0bx), what about also shipping FF55.0a1 as FF55.0a2 to devs? This would force devs to keep an eye on the WebExtension/multiprocess compatibility of their extensions. Should I file a new bug for it?

> Also: Is it possible to activate this signature in
> https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=55.
> 0a1&days=7 again, to have a better eye on it?
Should I file a new bug for it, too?
Flags: needinfo?(wmccloskey)
Here are some sample reports of mine for analysis, with different threads, showing new crash variants from the last few days, produced with the latest FF 55.0a1 builds, 64-bit with e10s, on Win7:
https://crash-stats.mozilla.com/report/index/22efc755-adc8-4cd2-ad57-3a9710170612
https://crash-stats.mozilla.com/report/index/2656bddf-bbd3-4ed8-a81b-45ec20170612
https://crash-stats.mozilla.com/report/index/3102b8b2-a3c0-4e91-aee4-518960170611
https://crash-stats.mozilla.com/report/index/300cff78-bc3e-48e7-8ce6-923840170611

Stats are now: 52101 crashes in the last 7 days
Posting links to crash reports without any additional information is not useful. If somebody is interested in looking at this issue, they can use crash-stats to find reports.
QA Contact: Tobias.Besemer
More than 243k crashes with this sig in the last 7 days!
Is somebody still analyzing them?
(In reply to Tobias B. Besemer [:BesTo] (QA) from comment #94)
> More than 243k crashes with this sig in the last 7 days!
> Is somebody still analyzing them?

Hey Tobias, this is a very generic signature that corresponds to crashes (less severe than others, as they occur during shutdown) with very different causes.
We are monitoring it to see if new crash causes occur (e.g. we recently filed bug 1375704).

If you find a reproducible case, please file a new bug and describe the STR. Otherwise, there's no need to keep updating this bug.

(In reply to Tobias B. Besemer [:BesTo] (QA) from comment #88)
> Also: Is it possible to activate this signature in
> https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=55.
> 0a1&days=7 again, to have a better eye on it?

This is a content process crash; that page shows parent process crashes. If you want a view with this signature in it, you can use https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=55.0a1&days=7&process_type=content.
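
For anyone who wants to generate that link for other versions, here is a minimal Python sketch that builds the same URL. It only uses the four query parameters visible in the link above (`product`, `version`, `days`, `process_type`). The helper name `topcrashers_url` is hypothetical, and whether the page accepts other values for these parameters is an assumption I haven't checked.

```python
from urllib.parse import urlencode

# Base path copied from the link above.
BASE = "https://crash-stats.mozilla.com/topcrashers/"


def topcrashers_url(product, version, days, process_type=None):
    """Build a topcrashers link; process_type is only added when given."""
    params = {"product": product, "version": version, "days": days}
    if process_type is not None:
        # Per the comment above, the default view shows parent process
        # crashes, so content crashes need this extra parameter.
        params["process_type"] = process_type
    return BASE + "?" + urlencode(params)


url = topcrashers_url("Firefox", "55.0a1", 7, process_type="content")
print(url)
```

Leaving out `process_type` reproduces the original link from comment #88, which is the view that does not list this signature.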
Flags: needinfo?(wmccloskey)
(In reply to Marco Castelluccio [:marco] from comment #95)
> Hey Tobias, [...]
> We are monitoring it to see if new crash causes occur (e.g. we recently
> filed bug 1375704).
> If you find a reproducible case, please file a new bug and describe the STR.
> Otherwise, there's no need to keep updating this bug.

Marco, thanks for your help!

If I find a proto_signature (like: proto_signature=~WaitForSyncNotifyWithA11yReentry) that I keep hitting and can monitor this way too, can I also file a bug for it if the proto_signature has many crashes?

---

Bugs for IPCError-browser | ShutDownKill:
    1339589 NEW --- Firefox on Windows with a11y features enabled crashes on certain websites
    1333605 NEW --- Crash in IPCError-browser | ShutDownKill when GCing
    1333464 NEW --- Crash in [@ IPCError-browser | ShutDownKill ] caused by the Gecko Profiler
    1329305 RESOLVED FIXED Crash in IPCError-browser | ShutDownKill (from GPUVideoImage::GetAsSourceSurface)
    1329301 NEW --- Crash in IPCError-browser | ShutDownKill (from GfxInfoBase::GetFeatureStatus)
    1329300 RESOLVED FIXED Crash in IPCError-browser | ShutDownKill | I422ToARGBRow_C
    1324820 RESOLVED FIXED IPCError-browser | ShutDownKill received when closing FX while a gif is autoplaying from Twitter moments
    1324399 NEW --- Shutdown hang crash after DownThemAll! *nightly* extension update
    1316867 NEW --- Deterministic crash from cycle collector to js::Scope::traceChildren() when running wasm content
    1311869 NEW --- FireFox 49.0.2 Denial Of Service
    1290280 VERIFIED FIXED [e10s] Tab crashes on startup
    1289405 NEW --- Restarts to update and quits frequently spin for extended/indefinite time after all windows closed
    1279293 NEW --- Crash in IPCError-browser | ShutDownKill
    1266275 RESOLVED DUPLICATE crash in WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | mozilla::ipc::MessageChannel::WaitForSyncNotify | mozilla::ipc::MessageChannel::Send | mozilla::net::PCookieServiceChild::SendGetCookieString
    1259125 NEW --- crash in nsINode::Slots
    1240542 NEW --- crash in js::InterpreterActivation::InterpreterActivation
    1238657 NEW --- crash in js::frontend::TokenStream::getChar
    1219672 NEW --- [meta] e10s related ShutDownKill parent side abort of the content process
    1200646 NEW --- crash in DestroyDisplayItemDataForFrames
Summary: Crash in IPCError-browser | ShutDownKill → [meta] Crash in [@ IPCError-browser | ShutDownKill]
(In reply to Tobias B. Besemer [:BesTo] (QA) from comment #96)
> (In reply to Marco Castelluccio [:marco] from comment #95)
> > Hey Tobias, [...]
> > We are monitoring it to see if new crash causes occur (e.g. we recently
> > filed bug 1375704).
> > If you find a reproducible case, please file a new bug and describe the STR.
> > Otherwise, there's no need to keep updating this bug.
> 
> Marco, thanks for your help!
> 
> If I find a proto_signature (like
> proto_signature=~WaitForSyncNotifyWithA11yReentry) that I always get and can
> monitor this way too, can I also file a bug for it if the proto_signature
> has many crashes?

No, unless you have steps to reproduce or the crash is exploding (that is, if it was low-volume and suddenly became high volume).
Depends on: 1378276
I have a low-powered AMD Brazos E-350-based laptop (Lenovo x120e) and I get the crash infobar on every restart.

Having a fixed value for dom.ipc.tabs.shutdownTimeoutSecs is unreasonable. It should be based on CPU performance.
Signature report for IPCError-browser | ShutDownKill
Showing results from 7 days ago

760,549 Results 

Windows 7 	432449 	57.2%
______	172851 	22.8%
Windows 10 	69528 	9.2%
Windows 8.1 	60316 	8.0%
Windows 8 	11046 	1.5%
Linux	 	5114 	0.7%
OS X 10.12 	2499 	0.3%
OS X 10.11 	906 	0.1%
Windows XP 	717 	0.1%
OS X 10.10 	405 	0.1%
OS X 10.13 	343 	0.0%
OS X 10.9 	168 	0.0%
Windows Vista 	141 	0.0%
OS X 10.7 	49 	0.0%
Windows Server 2003 	7 	0.0%

Product
Firefox 	58.0a1 	29766 	3.9% 	14975
Firefox 	57.0b4 	1490 	0.2% 	770
Firefox 	57.0b3 	235271 	31.1% 	58808
Firefox 	57.0b0 	26731 	3.5% 	6502
________	57.0b0 	2 	0.0% 	2
Firefox 	57.0a1 	7039 	0.9% 	3237
Firefox 	56.0b99	280748 	37.1% 	52273
Firefox 	56.0b12	72465 	9.6% 	25662
Firefox 	56.0b11	18920 	2.5% 	5869
Firefox 	56.0b10	13566 	1.8% 	5072

Uptime Range
> 1 hour 	411879 	54.4%
< 1 min 	142035 	18.8%
15-60 min 	119670 	15.8%
1-5 min 	43869 	5.8%
5-15 min 	39086 	5.2%

Architecture
x86 	433124 	57.3%
_____	172851 	22.8%
amd64 	150564 	19.9%

Flash Version
[blank] 	756524 	100.0%
27.0.0.130 	12 	0.0%
11.8.800.168 	1 	0.0%
26.0.0.151 	1 	0.0%
27.0.0.151 	1 	0.0%

Graphics Adapter
0x8086 	0x29c2 	48514 	6.4%
0x8086 	0x2e32 	43141 	5.7%
0x8086 	0x0152 	39741 	5.3%
0x8086 	0x0102 	34138 	4.5%
0x8086 	0x2a42 	28727 	3.8%
0x8086 	0x2772 	27883 	3.7%
0x8086 	0x0166 	26806 	3.5%
0x8086 	0x0f31 	26451 	3.5%
0x8086 	0x0046 	26235 	3.5%
0x8086 	0x0116 	24699 	3.3%
0x8086 	0x0a16 	23631 	3.1%
Depends on: 1406839
[Tracking Requested - why for this release]:

Can this [meta] bug and all of the individual bugs that "depend" on it be tracked for FF58?

There have been hundreds of thousands of crashes under this signature in the past 7 days, often accounting for upwards of 90% of the content crashes per release.

It is the #1 Top Crasher for Content crashes. 

Signature report for IPCError-browser | ShutDownKill
Showing results from 7 days ago
962,573 Results	

[IPCError-browser | ShutDownKill]

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=58.0a1&days=7&process_type=content

Top Crashers for Firefox 58.0a1
1 	67.52% 	-6.31% 	IPCError-browser | ShutDownKill
	39573 	27710 	697 	2166 	15650 	0 	2016-05-04 	

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=57.0b6&days=7&process_type=content

Top Crashers for Firefox 57.0b6
1 	10.16% 	new 	IPCError-browser | ShutDownKill
	1978 	1490 	10 	35 	585 	0 	2016-05-04 	
2 	8.12% 	new 	IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor Constructing a top-level PDocAccessible with null COM
	1582 	1582 	0 	0 	392 	0 	2017-05-02 	
	
https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=57.0b5&days=7&process_type=content

Top Crashers for Firefox 57.0b5
1 	89.81% 	-32.67% 	IPCError-browser | ShutDownKill
	408160 	307837 	827 	1042 	85982 	0 	2016-05-04
2 	2.75% 	new 	IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor Constructing a top-level PDocAccessible with null COM
	12480 	12461 	0 	0 	609 	0 	2017-05-02 

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=57.0b4&process_type=content

Top Crashers for Firefox 57.0b4
1 	92.62% 	-4.41% 	IPCError-browser | ShutDownKill
	268636 	201086 	592 	361 	81244 	0 	2016-05-04 
2 	2.57% 	1.79% 	IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor Constructing a top-level PDocAccessible with null COM
	7449 	7432 	0 	0 	441 	0 	2017-05-02 	

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=57.0b3&process_type=content

Top Crashers for Firefox 57.0b3	
1 	89.14% 	-3.48% 	IPCError-browser | ShutDownKill
	63962 	46353 	592 	676 	23462 	0 	2016-05-04
2 	3.96% 	2.28% 	IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor Constructing a top-level PDocAccessible with null COM
	2840 	2837 	0 	0 	270 	0 	2017-05-02 	

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=57.0b&process_type=content

Top Crashers for Firefox 57.0b
1 	88.73% 	-2.55% 	IPCError-browser | ShutDownKill
	742556 	555680 	2087 	3167 	194722 	0 	2016-05-04 
2 	2.9% 	2.43% 	IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor Constructing a top-level PDocAccessible with null COM
	24304 	24265 	0 	0 	1727 	0 	2017-05-02 
	
https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=57.0a1&process_type=content

Top Crashers for Firefox 57.0a1
1 	73.26% 	-2.93% 	IPCError-browser | ShutDownKill
	2219 	1495 	35 	155 	761 	0 	2016-05-04 		
	
https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=56.0.1&_facets_size=300&process_type=content

Top Crashers for Firefox 56.0.1
4 	4.71% 	new 	IPCError-browser | ShutDownKill
	23 	13 	1 	1 	18 	0 	2016-05-04 	

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=56.0&process_type=content

Top Crashers for Firefox 56.0
1 	45.03% 	16.56% 	IPCError-browser | ShutDownKill
	70988 	53479 	193 	79 	23645 	0 	2016-05-04 

https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=56.0b12&process_type=content

Top Crashers for Firefox 56.0b12
1 	96.18% 	0.89% 	IPCError-browser | ShutDownKill
	29919 	23332 	141 	64 	13040 	0 	2016-05-04 	
	
https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=56.0b11&process_type=content

Top Crashers for Firefox 56.0b11
1 	97.57% 	-0.59% 	IPCError-browser | ShutDownKill
	12363 	9802 	49 	43 	5633 	0 	2016-05-04
Alias: IPCError_ShutDownKill
No, this is a meta bug for a signature that's a catch-all for countless issues with separate root causes that aren't actually user-visible, so it's not a candidate for release tracking.  We can track specific instances where there's known STR and user impact, but otherwise there's no use.
Depends on: 1375704
Depends on: 1408631
Depends on: 1415837
Depends on: 1421915
Keywords: meta
Depends on: 1422715
See Also: 1289405
Depends on: 1424451
Depends on: 1405290
Depends on: 1434319
Priority: P1 → P2
Depends on: 1437575
Depends on: 1445312
No longer depends on: ServiceWorkers-stability
Depends on: 1419488
Depends on: 1495632
Depends on: 1540570

¡Hola!

Adding the flag for 68: as per https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill#summary, it accounts for 91.2% of the total crashes.

FWIW all of the crashes on my Nightly in the last month or so are variants of this:

Submitted Crash Reports
Report ID Date Submitted
bp-9e113f57-ef91-403e-853f-2e06b0190515 5/14/2019, 7:44 PM
bp-d56a8ef4-0853-4d07-87cf-b9f170190515 5/14/2019, 7:44 PM
bp-2252da92-b4f8-4cef-915d-0d2670190514 5/13/2019, 8:27 PM
bp-3407c931-83e4-4a6f-96b1-b894d0190514 5/13/2019, 8:27 PM
bp-7a5a8541-7d1a-4815-9ccd-3a3250190514 5/13/2019, 8:27 PM
bp-57533b9a-a227-4995-a315-725f10190514 5/13/2019, 8:27 PM
bp-74da1097-3526-4b1a-9654-e34ae0190513 5/13/2019, 9:11 AM
bp-342bf38b-c00c-4c85-8454-447b70190510 5/10/2019, 10:17 AM
bp-1bcfe01d-3bed-4ff0-8a2f-d22710190510 5/10/2019, 8:56 AM
bp-be35f4be-6f78-4e48-8e73-a49a70190509 5/9/2019, 9:09 AM
bp-d42e065c-5710-492b-a2e5-4b33b0190509 5/9/2019, 9:09 AM
bp-04d12eba-5a5f-4351-8932-efddf0190509 5/9/2019, 9:09 AM
bp-eee693e1-0831-43f3-95a6-bb0980190509 5/9/2019, 9:09 AM
bp-658039b7-241a-4d98-8c1c-9f3780190509 5/9/2019, 9:09 AM
bp-ad35eb3d-8add-4730-85f8-3dcb00190509 5/9/2019, 9:09 AM
bp-6b0d5d47-535e-4dcc-97d7-0ccc60190509 5/9/2019, 9:09 AM
bp-4b7c1b5d-bdbf-4850-a775-f5ef90190509 5/9/2019, 9:09 AM
bp-001707f9-503e-41d6-bdd2-80de60190508 5/8/2019, 11:46 AM
bp-996a13c2-3bce-4226-98b5-d918c0190508 5/8/2019, 11:46 AM
bp-98c46dd3-a812-4f23-9011-bf22b0190506 5/6/2019, 4:03 PM
bp-3ad29622-5e18-4f48-945a-514390190503 5/3/2019, 3:22 PM
bp-106b7b55-3629-47c2-abfc-3cbd10190503 5/3/2019, 9:09 AM
bp-fa2c5fdd-17c2-4519-8236-f439b0190501 4/30/2019, 8:24 PM
bp-52f83e1a-9bd9-4332-89e1-dd8640190429 4/29/2019, 11:52 AM
bp-9c0376bc-200d-45ba-bd75-1d0770190429 4/29/2019, 11:52 AM
bp-75007eeb-2cd9-45dd-a4e2-d74280190429 4/29/2019, 11:52 AM
bp-c91bfdaf-d5ff-4f0c-8ed8-7ef250190429 4/29/2019, 11:52 AM
bp-8d079313-a3d8-44cb-b11e-1ab5c0190429 4/29/2019, 11:52 AM
bp-fcbfa40c-31da-4c8d-b138-406200190429 4/29/2019, 11:52 AM
bp-7d798392-8924-4c2b-bb6d-d3af90190429 4/29/2019, 11:52 AM
bp-ea25d6e2-0ccc-4faf-a537-37b360190429 4/29/2019, 11:52 AM
bp-05304540-cfb0-42c7-95c1-d197c0190425 4/25/2019, 9:12 AM
bp-750cc9de-d914-4406-b78f-53ba80190424 4/24/2019, 10:03 AM
bp-2bba28a8-3f01-494e-b20b-982590190424 4/24/2019, 10:03 AM
bp-74065cdd-52ba-4682-92f4-7676d0190424 4/24/2019, 10:03 AM
bp-c9e2255a-bd10-4471-8e59-cb2820190424 4/24/2019, 10:03 AM
bp-57861240-6f6f-415a-8312-4f0c50190424 4/24/2019, 10:03 AM
bp-cc5d3dec-9c8e-41eb-aad5-6e84e0190423 4/23/2019, 9:04 AM
bp-df0a195b-2441-4621-b56d-705f50190421 4/21/2019, 1:25 PM
bp-9817d377-101f-4b4b-990c-0b1f90190410 4/10/2019, 4:36 PM
bp-932d2082-6f7c-4eb2-91e4-ea71f0190410 4/10/2019, 9:10 AM

Do let me know if there's anything I could provide on this bug and I'll be happy to do so.

¡Gracias!
Alex

This saw a large uptick during March 2019, driven by the nightly channel. I understand that this bug is a catch-all of sorts due to how our crash signatures are handled, but do we have an explanation for the uptick?

Edit: seems like this discussion has happened over at https://bugzilla.mozilla.org/show_bug.cgi?id=1219672#c79

Depends on: 1612569

I see nothing actionable here. This looks more like a meta bug. Removing the qawanted flag.

Keywords: qawanted

Now that bug 1612569 is fixed the signature here dropped to zero. I suggest we keep using this bug both as a meta for bugs with appropriate signatures (as blocking this) and we update the signature here to include crashes where the stacks are all over the place and thus are unlikely to be actionable. I've already identified a few of them such as:

  • IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run
  • IPCError-browser | ShutDownKill | Nt.*
  • IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait
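
As a rough illustration of how the signatures listed above could be bucketed, here is a minimal sketch. It assumes that `Nt.*` in the list is meant as a regular expression and that the other two entries are literal strings; the names `PREFIX`, `NON_ACTIONABLE_TAILS` and `is_non_actionable` are hypothetical and do not come from Socorro or any Mozilla tooling.

```python
import re

# Common prefix shared by every signature tracked in this bug.
PREFIX = "IPCError-browser | ShutDownKill | "

# Tails taken from the list above. Treating "Nt.*" as a regex is an
# assumption about what the comment intends; the others are escaped literals.
NON_ACTIONABLE_TAILS = [
    re.escape("mozilla::ipc::MessagePump::Run"),
    r"Nt.*",
    re.escape(
        "PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait"
    ),
]


def is_non_actionable(signature: str) -> bool:
    """Return True if the signature matches one of the listed patterns."""
    sig = signature.strip()
    if not sig.startswith(PREFIX):
        return False
    tail = sig[len(PREFIX):]
    return any(re.fullmatch(p, tail) for p in NON_ACTIONABLE_TAILS)


if __name__ == "__main__":
    for s in (
        "IPCError-browser | ShutDownKill | NtYieldExecution",
        "IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit",
    ):
        print(s, "->", is_non_actionable(s))
```

Using `re.fullmatch` on the tail keeps a pattern such as `Nt.*` from accidentally matching a frame that merely contains "Nt" somewhere in the middle.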

Given the volume of these crashes we should also re-evaluate the shutdown killer time which is currently set at 5 seconds. We might want to bump it up and see if it affects the crash rate.

Another point: I've found crashes where the content process was still starting up when we killed it. I wonder if we do support this scenario, i.e. sending a process a shutdown message before it has finished initializing.

There's also a worrisome number of crashes where all the threads in the process are stuck waiting for something. In many cases these involve event queues and thread pools, so I fear we might be seeing actual deadlocks caused by races.

Depends on: 1614305

As per my previous comment I'll start adding signatures that don't seem directly actionable to this bug so we can keep track of the non-actionable rate of these hangs.

Crash Signature: [@ IPCError-browser | ShutDownKill] → [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit]

Quick update: I may have figured out the reason behind many of the shutdown hangs that look like deadlocks. You can see the nitty-gritty details in bug 1614570 comment 1, but in short, what's happening is that we stop the event loops at the moment we receive the content process shutdown message. If some code relies on runnables or on spinning an event loop for its shutdown procedure, it will deadlock.
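
To make the failure mode concrete, here is a toy, single-threaded model of it. This is not Gecko code: `EventLoop`, `shutdown_component` and `simulate` are hypothetical names, and where real code would block forever the sketch simply returns `False`.

```python
from collections import deque


class EventLoop:
    """Toy event loop: a queue of runnables plus a 'running' flag."""

    def __init__(self):
        self.queue = deque()
        self.running = True

    def dispatch(self, runnable):
        self.queue.append(runnable)

    def stop(self):
        # Models stopping the event loop when the shutdown message arrives.
        self.running = False

    def spin_until(self, condition, max_iterations=1000):
        """Run queued runnables until condition() holds; report success."""
        for _ in range(max_iterations):
            if condition():
                return True
            if not self.running or not self.queue:
                # Nothing will ever run again: real code would hang here.
                return False
            self.queue.popleft()()
        return condition()


def shutdown_component(loop):
    """Hypothetical component whose cleanup is completed by a runnable."""
    state = {"done": False}
    loop.dispatch(lambda: state.update(done=True))
    return loop.spin_until(lambda: state["done"])


def simulate(stop_loop_first: bool) -> bool:
    loop = EventLoop()
    if stop_loop_first:
        loop.stop()
    return shutdown_component(loop)


if __name__ == "__main__":
    print("loop still running:", simulate(False))
    print("loop stopped first:", simulate(True))
```

When the loop is stopped first, the cleanup runnable stays queued and the wait can never be satisfied, which is the deadlock-like state described above.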

Crashes with this signature are content processes being slow, not stuck. They're signaling the main process that they're done shutting down but didn't do it in time.

Crash Signature: [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] → [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem]
Crash Signature: [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] → [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] [@ IPCError-browser | ShutDownKill | mozilla::ipc…
Crash Signature: mozilla::ipc::MessagePump::Run] → mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | <name omitted> | <name omitted> | mozilla::ipc::MessagePump::Run ]

Adding more signatures that don't seem actionable. The crashes are in various states of shutdown so they're not stuck but probably just slow.

Crash Signature: [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] [@ IPCError-browser | ShutDownKill | mozilla::ipc… → [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | __GI___poll] [@ IPCError-browser | ShutDownKill | g_main_context_iterate] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | memcpy…
Crash Signature: memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] [@ IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __poll] [@ IPCError-browser | ShutDownKill | __poll_nocancel] [@ IPCError-b… → memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] [@ IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | MOZ_Z_inflate_fast] [@ IPCError-browser | ShutDownKill | NtAlertThreadByThre…
Depends on: 1616018

Added another "we're just being slow" signature.

Crash Signature: NtAlertThreadByThreadId] [@ IPCError-browser | ShutDownKill | NtGdiDdDDIDestroyAllocation] [@ IPCError-browser | ShutDownKill | __poll] [@ IPCError-browser | ShutDownKill | __poll_nocancel] [@ IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_Wai… → NtAlertThreadByThreadId] [@ IPCError-browser | ShutDownKill | NtGdiDdDDIDestroyAllocation] [@ IPCError-browser | ShutDownKill | NtUserPeekMessage | _PeekMessage] [@ IPCError-browser | ShutDownKill | __poll] [@ IPCError-browser | ShutDownKill | __poll…
Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16 ]
Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16 ] → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16 ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | …

The last signature belongs to bug 1578070, I'm moving it there.

Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16 ] [@ IPCError-browser | CommonCreateWindow Unexpected aChromeFlags passed | … → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16 ]
Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16 ] → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::BackgroundHangThread::NotifyActivity] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16] [@ IPCError-browser | ShutDownKill | Interpret]
Depends on: 1620619
Depends on: 1620705
Depends on: 1620914
Depends on: 1620614
Depends on: 1620157
Depends on: 1622110
Depends on: 1622114
Depends on: 1622120
Crash Signature: __poll_nocancel] [@ IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | <name omitted> | <name omitted> | mozilla::ipc::MessagePump::Run] [@ I… → __poll_nocancel] [@ IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-browser | ShutDownKill | _PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-bro…

Added more "slow" signatures.

Crash Signature: | ShutDownKill | libsystem_kernel.dylib@0x1ca16] [@ IPCError-browser | ShutDownKill | Interpret] → | ShutDownKill | libsystem_kernel.dylib@0x1ca16] [@ IPCError-browser | ShutDownKill | Interpret] [@ IPCError-browser | ShutDownKill | NtDeviceIoControlFile] [@ IPCError-browser | ShutDownKill | NtProtectVirtualMemory]
Crash Signature: memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] [@ IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | MOZ_Z_inflate_fast] [@ IPCError-browser | ShutDownKill | NtAlertThreadByThre… → memcpy | mozilla::ipc::CrashReporterMetadataShmem::SyncNotesToShmem] [@ IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run ] [@ IPCEr…
Crash Signature: NtDeviceIoControlFile] [@ IPCError-browser | ShutDownKill | NtProtectVirtualMemory] → NtDeviceIoControlFile] [@ IPCError-browser | ShutDownKill | NtProtectVirtualMemory] [@ IPCError-browser | ShutDownKill | js::jit::EnterBaselineInterpreterAtBranch] [@ IPCError-browser | ShutDownKill | NtSignalAndWaitForSingleObject | SignalObjectAndWa…
Crash Signature: SignalObjectAndWait] → SignalObjectAndWait] [@ IPCError-browser | ShutDownKill | NtYieldExecution]
Depends on: 1630403
Crash Signature: SignalObjectAndWait] [@ IPCError-browser | ShutDownKill | NtYieldExecution] → SignalObjectAndWait] [@ IPCError-browser | ShutDownKill | NtYieldExecution] [@ IPCError-browser | ShutDownKill | mozilla::nsRFPService::ReduceTimePrecisionAsMSecs ]
Depends on: 1636333

¡Hola!

Got
Firefox 79.0a1 Crash Report [@ IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait ]

Submitted Crash Reports
Report ID Date Submitted
bp-92ef466b-a927-4eea-9d9f-6bc280200616 6/16/2020, 06:16

Updating flags per https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20PR_MD_WAIT_CV%20%7C%20_PR_WaitCondVar%20%7C%20PR_Wait%20%7C%20nsThreadStartupEvent%3A%3AWait

¡Gracias!
Alex

This is a meta bug, so please leave status flags unset. If there are specific instances of these crashes that are reproducible or actionable, they should have their own bugs and status flags.

Attached image hung-Nightly.png —

¡Hola Gabriele!

For reasons unbeknownst to me, Nightly has been hang-happy in the last two weeks, but the crash reporter won't fire.

This is what I manage to find in about:crashes FWIW:

Submitted Crash Reports
Report ID Date Submitted
bp-82dbed26-65af-4dbe-8f25-2519a0200710 7/10/2020, 10:26
bp-2e6fb69d-d92e-47cb-bf80-80c3c0200709 7/9/2020, 18:40
bp-9933bec5-18ff-4ff3-ac14-aed1f0200707 7/7/2020, 11:28
bp-3f3c8853-1258-4699-9d80-96fb00200707 7/7/2020, 11:28
bp-a20e6067-7ac0-40b8-b00d-38a730200627 6/26/2020, 20:16
bp-db518be7-813b-4b10-83fd-fd2910200627 6/26/2020, 20:16
bp-becff340-b12f-4c18-a79b-1ba400200627 6/26/2020, 20:05
bp-0c224c7e-8df6-46ae-8a74-b82ce0200624 6/23/2020, 19:29
bp-4b538f74-a911-4b26-80f6-d92ff0200623 6/23/2020, 08:22
bp-d21e6c07-f95b-4129-9106-842a50200622 6/22/2020, 08:50
bp-6b517307-362a-4822-8ebe-52f5c0200622 6/22/2020, 08:34

Can you please comment on what could be done to try and pin down the cause of these crashes?

¡Gracias!
Alex

Flags: needinfo?(gsvelto)

These reports won't trigger the crash reporter because they aren't actual crashes, they're hangs. Basically, a content process took too long to shut down and we killed it after a certain time (currently 20s). Before killing it we grab a snapshot of its state, and that's how you got those. Do you have fission enabled? If you do, then you'll have more content processes around and it will be more likely that one is too slow when shutting down.
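
The mechanism described above can be sketched roughly as follows. This only illustrates the "wait, then kill after a timeout" pattern using Python's `subprocess` module; `spawn_child` and `shut_down_child` are hypothetical helpers, the child is just a Python process that sleeps, and the snapshot step appears only as a comment.

```python
import subprocess
import sys


def spawn_child(shutdown_secs: float) -> subprocess.Popen:
    """Start a stand-in child that needs shutdown_secs to exit."""
    code = f"import time; time.sleep({shutdown_secs})"
    return subprocess.Popen([sys.executable, "-c", code])


def shut_down_child(child: subprocess.Popen, timeout_secs: float) -> str:
    """Wait for the child to exit; kill it if it exceeds the timeout."""
    try:
        child.wait(timeout=timeout_secs)
        return "clean"
    except subprocess.TimeoutExpired:
        # In the browser, this is the point where a snapshot of the child's
        # state is captured before the kill; here we only kill it.
        child.kill()
        child.wait()  # reap the killed process
        return "killed"


if __name__ == "__main__":
    # The thread mentions a 20s timeout; shorter values keep the demo quick.
    print(shut_down_child(spawn_child(0), timeout_secs=10))
    print(shut_down_child(spawn_child(30), timeout_secs=0.5))
```

The second child was not stuck, only slower than the timeout, which is why many of these reports show processes in ordinary states of shutdown.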

Flags: needinfo?(gsvelto)

¡Hola Gabriele!

Thanks for responding here.

fission.autostart is set to false in this Nightly.

I guess I'll continue to try and crash the hung Nightly with https://ftp.mozilla.org/pub/utilities/crashfirefox-intentionally/crashfirefox64.exe for the time being and see if a usable crash report is generated.

If anyone has ideas on how to pin those down, I'd be grateful if they shared them.

¡Gracias!
Alex

Bug 1637048 has significantly reduced the volume here. It's time to revisit the signatures and check if we've got some actual deadlocks in there. NI? me so I don't forget.

Flags: needinfo?(gsvelto)

For starters I pruned all the signatures that have no crashes for recent versions.

Crash Signature: [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | __GI___poll] [@ IPCError-browser | ShutDownKill | g_main_context_iterate] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | memcpy… → [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | __GI___poll] [@ IPCError-browser | ShutDownKill | g_main_context_iterate] [@ IPCError-browser | ShutDownKill | Interpret] [@ IPCError-browser | ShutDownKill | js::jit::EnterBasel…

I had a look around and filed some bugs.

Flags: needinfo?(gsvelto)
Crash Signature: IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] → IPCError-browser | ShutDownKill | PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozil…

¡Hola Gabriele!

Sorry to bother you again.

Some more crashes from the past two weeks:

Submitted Crash Reports
Report ID Date Submitted
bp-f26ad0c1-1202-49cf-bd25-dcb8c0200902 9/2/2020, 16:34
bp-6533606c-4fb5-4ac7-b891-3f0410200828 8/28/2020, 11:18
bp-873447b7-869f-4772-8f6d-972700200825 8/25/2020, 08:41
bp-6da34ca0-f564-48e1-8aa1-9a5730200824 8/24/2020, 08:56
bp-217dec7c-f88a-4812-b2e7-e89400200823 8/23/2020, 09:53
bp-989f9c12-eb5c-46ba-ab5f-d076a0200822 8/21/2020, 22:03
bp-5f74b094-7a1b-4b3e-919c-b6cd30200821 8/21/2020, 12:11
bp-79ee51df-9d99-4721-a6cc-6f41a0200820 8/20/2020, 04:56
bp-fc0ae5be-5621-4534-a4c3-24d710200819 8/19/2020, 08:59

Please let me know here if any of these needs separate bug reports and I'd be happy to file them.

¡Gracias!
Alex

Flags: needinfo?(gsvelto)

Thanks Alex. I glanced over the crashes and it seems we have everything on file already. One question however: most of your crashes appear to be content processes being slow but from the looks of it your machine is fairly powerful. Do you have an SSD or a regular HDD on your machine? I'd like to figure out what could be causing those content processes to respond slowly.

Flags: needinfo?(gsvelto) → needinfo?(alex_mayorga)

¡Hola Gabriele!

Thanks for taking a look into these.

Here are the main specs for this laptop:

Processor Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz 2.90 GHz
Installed RAM 24.0 GB (23.8 GB usable)
System type 64-bit operating system, x64-based processor

Edition Windows 10 Pro Insider Preview
Version 2004
Installed on ‎9/‎5/‎2020
OS build 20206.1000
Experience Windows Feature Experience Pack 120.22800.0.0

SSD SAMSUNG MZVLB1T0HALR-000L7

Hope this helps.

If you need anything else from this system, please just ni?

¡Gracias!
Alex

Flags: needinfo?(alex_mayorga) → needinfo?(gsvelto)

Thanks Alex, this is very useful. You've got a fast machine so there's no real reason why content processes might take so long to shut down. We'll have to do a proper profile of a content process shutting down to figure out what's going on. I also wonder if Windows task scheduling might be playing a part here.

Flags: needinfo?(gsvelto)
Depends on: 1633342

Firefox just abruptly crashed for me while I was in another desktop. Switched back to the desktop, no crash reporter dialog.

bp-79d24fbe-41a3-4463-a636-8037b0201014
bp-724c3c22-e11e-4a14-a0d5-ad6870201014

Crash Signature: js::jit::EnterBaselineInterpreterAtBranch] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16] [@ IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run] [@ IPCEr… → js::jit::EnterBaselineInterpreterAtBranch] [@ IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit] [@ IPCError-browser | ShutDownKill | LdrpDispatchUserCallTarget ] [@ IPCError-browser | ShutDownKill | libsystem_kernel.dylib@0x1ca16] [@ IPCError…
Depends on: 1677318
Depends on: 1674388

¡Hola y'all!

Found this one on 85.0a1:

bp-5d378d24-1334-4e36-bc09-274930201123 11/23/2020, 17:53

Updating flags per https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20mozilla%3A%3Aipc%3A%3AMessagePump%3A%3ARun

Seems like it is happening much more on 85.0a1 in the past week BTW.

Hope this is helpful.

¡Gracias!
Alex

Change the status for beta to have the same as nightly and release.
For more information, please visit auto_nag documentation.

¡Hola y'all!

Found this one on 86.0a1:

https://crash-stats.mozilla.org/report/index/11bfbef7-8317-4f9a-80e4-66b450201223#tab-details

Updating flags per https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20NtYieldExecution

Seems like it is happening much more on 86.0a1 in the past week BTW.

Hope this is helpful.

¡Gracias!
Alex

Crash Signature: mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run ] → mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | ShutDownKill | js::NativeObject::addEnumerableDataProperty]

¡Hola!

Found this one on:

Product Firefox
Release Channel nightly
Version 87.0a1
Build ID 20210212100155 (2021-02-12)

bp-f305824e-bd5b-4a68-a7ee-74e980210213 2/13/2021, 02:10

Updating flags.

¡Gracias!
Alex

Crash Signature: mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | ShutDownKill | js::NativeObject::addEnumerableDataProperty] → mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | ShutDownKill | js::NativeObject::addEnumerableDataProperty]
Crash Signature: mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | ShutDownKill | js::NativeObject::addEnumerableDataProperty] → mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run ] [@ IPCError-browser | ShutDownKill | js::NativeObject::addEnumerableDataProperty] [@ IPCError-browser…

¡Hola y'all!

Found these on

Product Firefox
Release Channel nightly
Version 88.0a1
Build ID 20210227094458

Report ID Date Submitted
bp-43a0d1ca-dcae-42dd-ab6b-797ad0210302 3/2/2021, 15:07
bp-1a91f848-051d-40c5-97f3-fd9e80210227 2/27/2021, 20:26

Updating flags.

¡Gracias!
Alex

Crash Signature: IPCError-browser | ShutDownKill | js::frontend::BytecodeEmitter::emitTree ] → IPCError-browser | ShutDownKill | js::frontend::BytecodeEmitter::emitTree ] [@ IPCError-browser | ShutDownKill | nsThread::Shutdown | mozilla::LazyIdleThread::ShutdownThread ]

Thanks Alex, these signatures belong in here.

Crash Signature: [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | __GI___poll] [@ IPCError-browser | ShutDownKill | g_main_context_iterate] [@ IPCError-browser | ShutDownKill | Interpret] [@ IPCError-browser | ShutDownKill | js::jit::EnterBasel… → [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | g_main_context_iterate] [@ IPCError-browser | ShutDownKill | Interpret] [@ IPCError-browser | ShutDownKill | js::frontend::BytecodeEmitter::emitTree] [@ IPCError-browser | ShutDo…
Flags: needinfo?(gsvelto)
Flags: needinfo?(gsvelto)

I don't know, the stacks are a bit of a mish-mash under this signature.

Flags: needinfo?(gsvelto)
Crash Signature: ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] → ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | mozilla::ipc::MessagePump::Run]

¡Hola Gabriele!

Is https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20xpc%3A%3AXrayWrapper%3CT%3E%3A%3AgetPrototype&date=%3E%3D2021-03-09T03%3A42%3A00.000Z&date=%3C2021-04-09T03%3A42%3A00.000Z also this one or a different one?

Also got bp-ad653f77-731c-4669-9b2e-407ce0210409 in
Product Firefox
Release Channel nightly
Version 89.0a1
Build ID 20210407212527

so updating flags FWIW.

¡Gracias!
Alex

Flags: needinfo?(gsvelto)

This one, thanks!

Crash Signature: ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | mozilla::ipc::MessagePump::Run] → ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | xpc::XrayWrapper<T>::getPrototype]
Flags: needinfo?(gsvelto)

The volume for this has increased in total. On nightly, [@ IPCError-browser | ShutDownKill | mozilla::ipc::MessagePump::Run] moved from 200-250 crashes/day to 350-600. Might have started with 20210321093903 but the changes look unrelated.

This is a natural effect of having turned on Fission for more users. With more content processes active at any given time, the chance of one being slow at shutdown has increased, leading to more reports here. Given that this is only going to increase in the future, we should probably decide what to do with these reports: it's been a while since we found content processes that were genuinely stuck at shutdown in a way that was actionable.

Crash Signature: ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | xpc::XrayWrapper<T>::getPrototype] → ShutDownKill | __psynch_cvwait | <name omitted> | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | xpc::XrayWrapper<T>::getPrototype] [@ IPCErro…
Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::ipc::MessagePump::Run] → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __GI_madvise]
Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __GI_madvise] → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __GI_madvise] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | _pthread_cond_wait | mozilla::TaskController::GetRunn…
Crash Signature: IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __GI_madvise] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | _pthread_cond_wait | mozilla::TaskController::GetRunn… → IPCError-browser | ShutDownKill | RtlAcquireSRWLockExclusive | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __GI_madvise] [@ IPCError-browser | ShutDownKill | __psynch_cvwait | _pthread_cond_wait | mozilla::ipc::MessagePump::Run…

¡Hola y'all!

Got https://crash-stats.mozilla.org/report/index/57f5e1d9-e5e1-4992-b2c7-24c530210513 on 90.

Updating flags FWIW.

¡Gracias!
Alex

Found another signature. I should find time to go over all the existing ones and leave only the relevant ones.

Crash Signature: mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run] → mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | nsThread::Dispatch]
Flags: needinfo?(gsvelto)

(In reply to alex_mayorga from comment #166)

FWIW per https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20js%3A%3Ajit%3A%3AMaybeEnterJit&date=%3E%3D2021-04-27T16%3A37%3A00.000Z&date=%3C2021-05-27T16%3A37%3A00.000Z IPCError-browser | ShutDownKill | js::jit::MaybeEnterJit has spiked a bit on 90.

Yes, this might be an actual issue. None of the crashes have the ipc shutdown state annotation set which means they didn't even begin to shutdown; it's possible that they were genuinely stuck, possibly deadlocked.

Flags: needinfo?(gsvelto)

I have been experiencing this crash and rarely see a crash reporter window, you have to go in and submit the crash reports. Losing the tab history with each crash just makes it hardly worth using daily.

Having the window simply disappear without anything is just not a satisfactory outcome for me. We are not really dealing with a crash on shutdown. The user interface disappears while I was using it without notice and without any of the normal crash reporting occurring and nothing in the windows event log. If my experience is anything to go by, you are not getting but a fraction of the reports submitted. When I went to look after this last disappearance I submitted a whole host of old reports and my first attempt to submit them resulted in a fail appearing in the troubleshooting list. I think the crash reporter may also be fundamentally broken here.

There are no crashes under this bug; these are a specific type of shutdown hang. If Firefox disappears without showing the crash reporter, please file another bug with a detailed description of your setup (OS, Firefox version, sites/scenarios where it usually happens). Use the Toolkit > Crash Reporting component because the crash reporter not showing up belongs there.

Here's another signature. I'm not sure why mozglue.dll is in the signature and not symbolized.

Crash Signature: mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | nsThread::Dispatch] → mozilla::TaskController::GetRunnableForMTTask | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | nsThread::Dispatch] [@ IPCError-browser | ShutDownKill | mozglue.dll | mozilla::ipc::MessagePump::Run ]
Crash Signature: _PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-browser | ShutDownKill | __GI___poll] [@ IPCError-browser | ShutDownKill | __poll] [@ IPCError-browser | ShutDownKill | __poll_nocancel] [@ IPCError-browser | ShutDo… → _PR_MD_WAIT_CV | _PR_WaitCondVar | PR_Wait | nsThreadStartupEvent::Wait] [@ IPCError-browser | ShutDownKill | _tailMerge_d3dcompiler_47.dll | mozilla::ipc::MessagePump::Run] [@ IPCError-browser | ShutDownKill | __GI___poll] [@ IPCError-browser | ShutD…

¡Hola Gabriele!

Hope these lines find you well.

Reviewing about:crashes on this Nightly I found
https://crash-stats.mozilla.org/report/index/11f5e988-7b45-494d-b49c-4b7930210713#tab-details
which per
https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20_tailMerge_d3dcompiler_47.dll%20%7C%20mozilla%3A%3Aipc%3A%3AMessagePump%3A%3ARun
now seems to be a Fission crash.

Should that be a separate bug or is this one still the one that's relevant?

FWIW I've updated the flags per the links above.

¡Gracias!
Alex

Flags: needinfo?(gsvelto)

These stacks are identical to previous ones, they look different because we had an issue with the scripts that fetched the symbols so the resulting unwinding information was missing. I've updated it and reprocessed the crashes.

As we turn on Fission for more and more users we get more and more of these reports but their usefulness is limited; I think it's time to discuss if we still want to gather these or not.

Flags: needinfo?(gsvelto)

I noticed my Firefox (Nightly, macOS) was completely hung, so I took a sample and spindump, then killed the app. When I opened Firefox back up, I had an unsent crash report from a couple minutes prior, presumably from when the hang started. That report pointed me here. I do have Fission enabled. Would the process sample and spindump be helpful in diagnosing this? If so, I can provide. Thanks!

(In reply to Sam Johnson from comment #173)

I noticed my Firefox (Nightly, macOS) was completely hung, so I took a sample and spindump, then killed the app. When I opened Firefox back up, I had an unsent crash report from a couple minutes prior, presumably from when the hang started. That report pointed me here. I do have Fission enabled. Would the process sample and spindump be helpful in diagnosing this? If so, I can provide. Thanks!

Probably yes. The crash report you got was from a child process that was taking too long to shut down, we grab them if a child process does not respond for long enough. That being said that very crash report is captured in the main process, so it might be possible that the process was taking too long and was responsible for the freeze.

Attached file sample_spindump.zip —

I've zipped up the sample and spindump, attached. FWIW, when I noticed the app had hung, it was not after any attempts I made to quit the app; it was in the background while I was working in another program.

(In reply to Sam Johnson from comment #175)

I've zipped up the sample and spindump, attached. FWIW, when I noticed the app had hung, it was not after any attempts I made to quit the app; it was in the background while I was working in another program.

Thanks, I'll have a look ASAP.

Heads up for everybody tracking this bug: we've decided to revert the change that split the crash signatures for this type of report. Once the change is applied in the coming weeks, the signatures will again collapse into one. The reasoning behind this is that we only found a handful of useful signatures and far too many unactionable ones. The latter are making nightly crash triage harder while providing little value. We'll keep gathering these reports for the time being until we figure out a better way to make them actionable.

We now have the possibility to aggregate in crash-stats on the "Xpcom spin event loop stack" annotations. This leads to the following results.

Outstanding are: RequestHelper::StartAndReturnResponse, browser-custom-element.js:permitUnload and various nsThread::Shutdown * flavors.

Depends on: 1740889
Depends on: 1740895
Depends on: 1740896
Depends on: 1740899

There are more SpinEventLoopUntil locations in the data.

The question is if the nsThreadShutdown: * have some common cause/pattern.

(In reply to Jens Stutte [:jstutte] from comment #180)

The question is if the nsThreadShutdown: * have some common cause/pattern.

I just clicked through some reports, but it seems that many of them do not even contain the thread we are waiting for. This sounds as if our book-keeping of closing threads could fail in some cases?

IIUC the flow, we will unblock only if SchedulerGroup::Dispatch(TaskCategory::Other, event.forget()); succeeds such that we will call nsThread::ShutdownComplete on the main thread. We do not check the return value here, and there are at least some mallocs on the way to a successful dispatch that could fail. Should we check for successful dispatch here and in case clear the context->mAwaitingShutdownAck by hand off main-thread?
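To make the suggestion above concrete, here is a minimal, self-contained sketch of "check the dispatch result and clear the flag by hand on failure". It is a toy model only, not the actual nsThread code: MainThreadQueue, ShutdownContext and AckShutdown are hypothetical stand-ins for the real dispatch target, the shutdown context and the ack path, and the flag is modelled as an atomic because it may be written off the main thread.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for the main-thread event target. Dispatch() can
// fail, which models e.g. an allocation failure on the way to the queue.
struct MainThreadQueue {
  bool acceptingEvents = true;
  std::queue<std::function<void()>> events;

  // Returns false when the event could not be queued.
  bool Dispatch(std::function<void()> event) {
    assert(event);
    if (!acceptingEvents) {
      return false;
    }
    events.push(std::move(event));
    return true;
  }

  // Runs every queued event in FIFO order.
  void ProcessPending() {
    while (!events.empty()) {
      std::function<void()> event = std::move(events.front());
      events.pop();
      event();
    }
  }
};

// Hypothetical stand-in for the shutdown context the joining thread waits on.
struct ShutdownContext {
  std::atomic<bool> awaitingShutdownAck{true};
};

// Called from the thread that is shutting down. Tries to deliver the ack via
// the main-thread queue and falls back to clearing the flag directly when the
// dispatch fails, so the waiter is not left blocked forever.
bool AckShutdown(MainThreadQueue& mainThread, ShutdownContext& context) {
  bool dispatched = mainThread.Dispatch(
      [&context] { context.awaitingShutdownAck.store(false); });
  if (!dispatched) {
    // Assumption: clearing the flag off-main-thread is acceptable here.
    context.awaitingShutdownAck.store(false);
  }
  return dispatched;
}
```

In this model the fallback only clears the flag; whether the waiting loop would actually notice the change without another event waking it up is a separate question.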

Flags: needinfo?(continuation)

I don't really know anything about threads. Kris Wright is a better person to ask about them.

Flags: needinfo?(continuation) → needinfo?(kwright)

So this keeps spinning in my mind as a nested event loop... Just a guess, but I see basically two ways how we can end up in this situation:

  1. Unexpected thread death without cleanup
    If a thread could die unexpectedly without cleaning up our in-process-memory book-keeping (maybe by some interrupt?) and then we arrive later in nsThread::Shutdown(), we would still be able to dispatch an event to its queue and then wait forever for the shutdown ack message to come. It is not clear to me how this could happen, but it would be nice if there was a (system) function to check, if a thread handle is really alive before we even try to shut it down.

  2. Failed dispatch of shutdown ack to main thread
    This is basically comment 181. Just setting context->mAwaitingShutdownAck might not be enough in this situation if there are no more other messages on the main thread, though?

Any other thoughts, Kris?

Hmmm, there is a very puzzling thing: While the XPCOMSpinEventLoopStack annotation suggests we are stuck inside a nested event loop on the main thread, the stack of those crashes does not contain any SpinEventLoopUntil call but they are just sitting in the normal main event loop. I start to think this is a red herring...

(In reply to Jens Stutte [:jstutte] from comment #184)

Hmmm, there is a very puzzling thing: While the XPCOMSpinEventLoopStack annotation suggests we are stuck inside a nested event loop on the main thread, the stack of those crashes does not contain any SpinEventLoopUntil call but they are just sitting in the normal main event loop. I start to think this is a red herring...

Just to be clear: In some cases like bug 1740889 we see a real SpinEventLoopUntil on the stack, but in some cases like bug 1740895 we see a misalignment between the main thread stack and the XPCOMSpinEventLoopStack annotation.

Kris, I do not think it is worth the time to look into the nsThreadShutdown: * instances for now; they seem to be a false alarm (at least in terms of hangs caused by SpinEventLoopUntil; the hangs themselves are real, of course). The instances I have cracked open so far do not contain any sign of really being inside nsThread::Shutdown at all. So I'll try to understand the apparently bogus annotations first.

Flags: needinfo?(kwright)
Depends on: 1741131

So in the vast majority of cases we are not stuck in the child process in any SpinEventLoopUntil and we need to look out for other common patterns. :-(

There are some cases though where we find ourselves stuck in RequestHelper::StartAndReturnResponse which might indicate some pitfall in the request/response flow of local storage.

The remaining cases with an event loop stack from the parent process are almost all happening during nsThread::Shutdown for some thread on the parent process.

Just to check: it would not be a bug if this crash was the result of a process in an infinite loop in Javascript being killed, correct?

(In reply to Justin Peter from comment #187)

Just to check: it would not be a bug if this crash was the result of a process in an infinite loop in Javascript being killed, correct?

This is not really a crash. It's just us killing the process because it's taking too long to shut down. The crash reports are snapshots of the process right before we killed it.

(In reply to Gabriele Svelto [:gsvelto] from comment #177)

Heads up for everybody tracking this bug: we've decided to revert the change that split the crash signatures for this type of report. Once the change is applied in the coming weeks, the signatures will again collapse into one. The reasoning behind this is that we only found a handful of useful signatures and far too many unactionable ones. The latter are making nightly crash triage harder while providing little value. We'll keep gathering these reports for the time being until we figure out a better way to make them actionable.

Indeed we now see only crashes for the basic signature, so it is time for a signature cleanup here.

Crash Signature: [@ IPCError-browser | ShutDownKill] [@ IPCError-browser | ShutDownKill | g_main_context_iterate] [@ IPCError-browser | ShutDownKill | Interpret] [@ IPCError-browser | ShutDownKill | js::frontend::BytecodeEmitter::emitTree] [@ IPCError-browser | ShutDo… → [@ IPCError-browser | ShutDownKill]
Depends on: 1748183
No longer depends on: 1748183
Depends on: 1754208

So looking at the first reports from build 20220210213101 with the new annotations from bug 1754208, the interesting thing is that we do not see any of them:

 1 	SendFinishShutdown (sent) 	7 	21.88 %

So there is a significant 78% of cases where we do not even reach RecvShutdown at all, it seems.

IIUC all starts here in the parent process and there are indeed (intentional) ways to call that function without sending the shutdown message to the child. It feels a bit odd that we apparently did not give the child any chance to shutdown and then just kill it, or am I overlooking something?

Yes, I did a writeup some time ago but I can't find it anymore. In many cases the main process sends the IPC message to shut down the child, but the child is busy, so it doesn't see the parent's message right away. After a while the child gets killed and it often hasn't seen the IPC message yet. More often than not because there were other pending messages that it had to process first. There's another factor compounding this: we reduce the priority of content processes for tabs that are not visible. So not only the child process might be busy, but the OS might be in no rush to run it as it's been informed that it shouldn't prioritize it.

Hmm, could it then be that the SendFinishShutdown (sent) case is just the opposite (the child went all the way through its shutdown but the acknowledgement message never arrives at the parent)?

It might be an option to raise a child's priority before sending the shutdown message, that could help in some of the cases you described above. But am I assuming right, that if the message queue is already well filled we get queued at the end? Then a higher process priority will not really help if we do not tweak our internal processing, IIUC?

Yes, that also seems to happen. I had filed bug 1619676 about exploring the priority changes but as someone pointed out in that bug it could have the opposite effect by slowing down the main process and thus ending up in the same place as we are now.

(In reply to Gabriele Svelto [:gsvelto] from comment #191)

...
There's another factor compounding this: we reduce the priority of content processes for tabs that are not visible. So not only the child process might be busy, but the OS might be in no rush to run it as it's been informed that it shouldn't prioritize it.

(In reply to Gabriele Svelto [:gsvelto] from comment #193)

Yes, that also seems to happen. I had filed bug 1619676 about exploring the priority changes but as someone pointed out in that bug it could have the opposite effect by slowing down the main process and thus ending up in the same place as we are now.

Maybe instead of reducing priority for not visible tabs, and instead of raising priority of other tabs, how about just leaving the priorities alone and see what happens? Less is more kind of thing. Maybe just slightly elevate what needs elevation, instead of dropping what you used to think could be reduced. Maybe do a build with no priority changes, and see what pans out?

Just to confirm the observation of comment 190 with more numbers:

Rank IPC shutdown state # %
1 SendFinishShutdown (sent) 292 17.70
2 ShutdownInternal entry 6 0.36
3 content-child-shutdown started 2 0.12

It confirms that we have only very few cases of a real hang during processing the shutdown sequence. It seems that in most of the cases either:

  • the child process is too busy to even receive and start to process the shutdown request (81%)
  • the parent process is too busy to even receive and acknowledge the successful shutdown (17%)

I assume this can only be changed if we create kind of a "priority lane" for shutdown messages that bypasses the normal queue. I wonder if having an additional IPC channel only for shutdown messages could help? In particular on the child process side there is probably not much reason to keep up the normal processing order until we eventually arrive at the shutdown event in our queue (which will always be kind of unexpected and random wrt to our internal state, such that handling it out of order should not be worse, IIUC). But also the ack messages could be processed out of order on the parent side, probably.
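As a rough illustration of the "priority lane" idea, the sketch below shows a single queue in which a high-priority message overtakes already queued normal messages while FIFO order is kept within each priority. This is a toy model, not the Gecko IPC implementation: MessageQueue, Priority and the message names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <string>
#include <utility>
#include <vector>

enum class Priority { Normal = 0, High = 1 };

struct Message {
  std::string name;
  Priority priority;
  std::uint64_t seq;  // arrival order, used to keep FIFO within a priority
};

// Comparator for std::priority_queue: returns true when `a` should be
// processed after `b`.
struct ProcessedLater {
  bool operator()(const Message& a, const Message& b) const {
    if (a.priority != b.priority) {
      return a.priority < b.priority;  // lower priority waits
    }
    return a.seq > b.seq;  // same priority: later arrival waits
  }
};

class MessageQueue {
 public:
  void Push(std::string name, Priority priority) {
    mQueue.push(Message{std::move(name), priority, mNextSeq++});
  }

  // Removes and returns the next message to process.
  Message Pop() {
    assert(!mQueue.empty());
    Message next = mQueue.top();
    mQueue.pop();
    return next;
  }

  bool Empty() const { return mQueue.empty(); }

 private:
  std::priority_queue<Message, std::vector<Message>, ProcessedLater> mQueue;
  std::uint64_t mNextSeq = 0;
};
```

With two normal messages queued first and a high-priority shutdown message queued last, Pop() returns the shutdown message first and then the two normal messages in their arrival order. A busy receiver still has to finish whatever it is currently running before it pops anything, so this only helps with queue backlog, not with a long-running task.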

You can add a Priority annotation to an IPC message to increase the priority. We use that, for instance, for things related to input.

Also, looking at the earlier discussion, we do already have some code to raise the priority of processes we're shutting down.

Depends on: 1755376

FWIW, from a short glance at 10 reports in a row, I see:

Probably the ChildProfilerController and the GC mutex case could merit a second look based on those stack traces.

Starting from March 2022 we see a slight downwards trend, it seems. Of those:

  • ~ half of the crashes now show NotifyImpendingShutdown received. This means the content process was alerted but was too busy (or hanging) to even process the ShutdownConfirmedHP sent with high priority.
  • almost a quarter of the crashes carry SendFinishShutdown (sent). This would indicate that the content process was able to finish its shutdown but the parent process never received or processed the FinishShutdown message.
  • almost another quarter do not have any ipc_shutdown annotation set, which is weird.

The remainder are some rare sparse crashes with other ipc_shutdown annotations set.

(In reply to Jens Stutte [:jstutte] from comment #198)

  • almost a quarter of the crashes carry SendFinishShutdown (sent). This would indicate that the content process was able to finish its shutdown but the parent process never received or processed the FinishShutdown message.

We could filter those out and not send reports for them: after all the content processes did shut down correctly. WDYT, should I file a bug? It would be a simple fix.

(In reply to Gabriele Svelto [:gsvelto] from comment #199)

We could filter those out and not send reports for them: after all the content processes did shut down correctly. WDYT, should I file a bug? It would be a simple fix.

I do not think we should do this. It most probably means that the parent is not aware of that child shutdown and continues to block its own shutdown process until timeout?

I think we should do bug 1755376 comment 17 to give shutdown notifications the highest possible priority in both directions.

OK, I trust your assessment, thanks!

Depends on: 1775076
Depends on: 1777198

It seems that bug 1777198 did not move the needle much here. Let's wait for the improved annotations for a better understanding, though.

But: Roughly 25% of the crashes do not have any dump file (and thus stack) associated, it seems? Could it be that those processes actually ended but the parent did not understand it? Example, the upload_file_minidump is empty, the upload_file_minidump_browser contains data.

 	[Inline Frame] xul.dll!google_breakpad::ExceptionHandler::WriteMinidump() Line 805	C++
 	xul.dll!google_breakpad::ExceptionHandler::WriteMinidump(const std::wstring & dump_path, bool(*)(const wchar_t *, const wchar_t *, void *, _EXCEPTION_POINTERS *, MDRawAssertionInfo *, const mozilla::phc::AddrInfo *, bool) callback, void * callback_context, _MINIDUMP_TYPE dump_type) Line 831	C++
 	xul.dll!CrashReporter::CreateMinidumpsAndPair(void * aTargetHandle, unsigned long aTargetBlamedThread, const nsTSubstring<char> & aIncomingPairName, mozilla::EnumeratedArray<CrashReporter::Annotation,CrashReporter::Annotation::Count,nsTString<char>> & aTargetAnnotations, nsIFile * * aMainDumpOut) Line 3794	C++
 	xul.dll!mozilla::ipc::CrashReporterHost::GenerateMinidumpAndPair<mozilla::dom::ContentParent>(mozilla::dom::ContentParent * aToplevelProtocol, const nsTSubstring<char> & aPairName) Line 70	C++
>	xul.dll!mozilla::dom::ContentParent::GeneratePairedMinidump(const char * aReason) Line 4291	C++
 	xul.dll!mozilla::dom::ContentParent::KillHard(const char * aReason) Line 4326	C++
 	[Inline Frame] xul.dll!nsCOMPtr<nsITimer>::get() Line 851	C++

It seems that ContentParent::KillHard does not check if we actually killed something?

Gabriele, can you confirm that the weird crashes we see could be caused by that? Is there something we should check inside ContentParent::GeneratePairedMinidump ?

Flags: needinfo?(gsvelto)
Depends on: 1788879
Depends on: 1789231
No longer depends on: 1777198, 1788879, 1740899

(In reply to Jens Stutte [:jstutte] from comment #203)

It seems that ContentParent::KillHard does not check if we actually killed something?

It does check what happened on both Linux/macOS and Windows. In both cases if the process already died KillProcess() returns false.

Gabriele, can you confirm that the weird crashes we see could be caused by that? Is there something we should check inside ContentParent::GeneratePairedMinidump ?

Minidump generation can fail in weird ways, including empty or truncated minidumps and ATM we don't have good checking for that outside of Linux. As I rewrite this stuff in the coming months we can add diagnostic information to work around this issue, for example by providing rich errors instead of malformed minidumps like we do on Linux using the new oxidized machinery.

Flags: needinfo?(gsvelto)

(In reply to Gabriele Svelto [:gsvelto] from comment #204)

(In reply to Jens Stutte [:jstutte] from comment #203)

It seems that ContentParent::KillHard does not check if we actually killed something?

It does check what happened on both Linux/macOS and Windows. In both cases if the process already died KillProcess() returns false.

Hmm, at least in the linked code snippet I just see an NS_WARNING but no other consequences? I assumed that means that in either case we transmit the collected crash telemetry?

Gabriele, can you confirm that the weird crashes we see could be caused by that? Is there something we should check inside ContentParent::GeneratePairedMinidump ?

Minidump generation can fail in weird ways, including empty or truncated minidumps and ATM we don't have good checking for that outside of Linux. As I rewrite this stuff in the coming months we can add diagnostic information to work around this issue, for example by providing rich errors instead of malformed minidumps like we do on Linux using the new oxidized machinery.

Thanks for working on this, that sounds very promising !

(In reply to Jens Stutte [:jstutte] from comment #205)

Hmm, at least in the linked code snippet I just see an NS_WARNING but no other consequences? I assumed that means that in either case we transmit the collected crash telemetry?

Ah yes, good point, we can still upload the minidump if the process was killed in-between the two, or right when the minidump was being written. I'll file a bug to discard those instead of submitting them.
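The proposed behaviour (discard the minidump when the kill reports that the process was already gone) could look roughly like the sketch below. It is only an illustration under assumptions: KillHardHooks, Outcome and KillHardSketch are hypothetical names, and the hooks stand in for the real minidump generation, process kill and report submission code, which this sketch does not call.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical hooks standing in for the real minidump and process APIs.
struct KillHardHooks {
  // Returns the minidump path, or nullopt when generation failed.
  std::function<std::optional<std::string>()> generatePairedMinidump;
  // Returns false when the process had already exited.
  std::function<bool()> killProcess;
  std::function<void(const std::string&)> discardMinidump;
  std::function<void(const std::string&)> submitReport;
};

enum class Outcome { NoMinidump, Discarded, Submitted };

// Generates the minidump first, then kills the process, mirroring the order
// in the stack above. The report is only submitted when the kill actually
// terminated a live process.
Outcome KillHardSketch(const KillHardHooks& hooks) {
  assert(hooks.generatePairedMinidump && hooks.killProcess &&
         hooks.discardMinidump && hooks.submitReport);

  std::optional<std::string> dump = hooks.generatePairedMinidump();
  bool killed = hooks.killProcess();

  if (!dump) {
    return Outcome::NoMinidump;
  }
  if (!killed) {
    // The process died between (or during) the two steps: the dump is
    // likely empty or truncated, so drop it instead of uploading it.
    hooks.discardMinidump(*dump);
    return Outcome::Discarded;
  }
  hooks.submitReport(*dump);
  return Outcome::Submitted;
}
```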

Depends on: 1789421
Depends on: 1789803
Depends on: 1789810
No longer depends on: 1789803
Depends on: 1790611
Severity: critical → S2
Depends on: 1794568
Severity: S2 → S3
No longer depends on: 1424451
Depends on: 1808691
1 	6473 	46.44 % 	- NotifiedImpendingShutdown - NotifiedImpendingShutdown - HangMonitorChild::RecvRequestContentJSInterrupt (expected)
2 	2043 	14.66 % 	- NotifiedImpendingShutdown - HangMonitorChild::RecvRequestContentJSInterrupt (expected)
3 	1627 	11.67 % 	- NotifiedImpendingShutdown - NotifiedImpendingShutdown - HangMonitorChild::RecvRequestContentJSInterrupt (expected) - RecvShutdownConfirmedHP entry - RecvShutdown entry - content-child-will-shutdown started - ShutdownInternal entry - content-child-shutdown started - StartForceKillTimer - SendFinishShutdown (sending) - SendFinishShutdown (sent)
4 	 486 	 3.49 % 	- NotifiedImpendingShutdown - HangMonitorChild::RecvRequestContentJSInterrupt (expected) - RecvShutdownConfirmedHP entry - RecvShutdown entry - content-child-will-shutdown started - ShutdownInternal entry - content-child-shutdown started - StartForceKillTimer - SendFinishShutdown (sending) - SendFinishShutdown (sent)

Looking at IPC shutdown state I see 4 main cases:

  1. Tab is closing & Child gives no sign of life after receiving RequestContentJSInterrupt
  2. Parent shuts down & Child gives no sign of life after receiving RequestContentJSInterrupt
  3. Tab is closing & Child notified the parent about its shutdown but the parent did not process the notification
  4. Parent shuts down & Child notified the parent about its shutdown but the parent did not process the notification

Annotations interpretation:

Tab is closing: two times NotifiedImpendingShutdown, the first is from ContentParent::NotifyTabWillDestroy, the second from ContentParent::SignalImpendingShutdownToContentJS
Parent shuts down: the only NotifiedImpendingShutdown is from ContentParent::SignalImpendingShutdownToContentJS coming from ContentParent::BlockShutdown
Parent did not process shutdown notification in time: SendFinishShutdown (sent) is present
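The interpretation rules above can be written down as a small classifier over the annotation string. This is a sketch that assumes the annotation has the " - "-separated form shown in the table; Trigger, ShutdownClassification and Classify are hypothetical names, not part of any existing tooling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Trigger { TabClosing, ParentShutdown, Unknown };

struct ShutdownClassification {
  Trigger trigger;
  bool parentMissedFinish;  // child sent FinishShutdown, parent did not react
};

// Counts non-overlapping occurrences of `needle` in `haystack`.
std::size_t CountOccurrences(const std::string& haystack,
                             const std::string& needle) {
  assert(!needle.empty());
  std::size_t count = 0;
  for (std::size_t pos = haystack.find(needle); pos != std::string::npos;
       pos = haystack.find(needle, pos + needle.size())) {
    ++count;
  }
  return count;
}

// Applies the rules from the comment: two NotifiedImpendingShutdown entries
// mean a closing tab, exactly one means parent shutdown, and the presence of
// "SendFinishShutdown (sent)" means the parent did not process the
// notification in time.
ShutdownClassification Classify(const std::string& ipcShutdownState) {
  std::size_t notified =
      CountOccurrences(ipcShutdownState, "NotifiedImpendingShutdown");
  Trigger trigger = Trigger::Unknown;
  if (notified >= 2) {
    trigger = Trigger::TabClosing;
  } else if (notified == 1) {
    trigger = Trigger::ParentShutdown;
  }
  bool finishSent =
      ipcShutdownState.find("SendFinishShutdown (sent)") != std::string::npos;
  return ShutdownClassification{trigger, finishSent};
}
```

Applied to the four rows of the table, this yields cases 1, 2, 3 and 4 in that order.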

It might be early to tell but it seems that the patches from bug 1837467 had a pretty positive impact on the numbers here.

In summary, that patch has two consequences:

  1. We ensure that we start the ForceKillTimer only after we have actually told the child process to shut down, so that slow (unrelated) processing in the parent no longer eats into the child's time budget.
  2. We indirectly increased the time the content process has to do shutdown-related things by adding the second timer for the Browser actor destroy cycle.

It is unclear whether just increasing the timeout would have had a similar effect (although I'd expect the slow parent case to be improved only statistically this way, not systematically). It might be worth experimenting with different timer settings (like splitting the overall time over the two timers with some ratio) to get closer to the former timeouts again.
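For the experiment with timer settings, splitting one overall budget across the two timers is simple arithmetic. The sketch below only illustrates that split; SplitTimeout is a hypothetical helper and any concrete budget or ratio passed to it is an assumption, not the value Firefox uses.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

using Ms = std::chrono::milliseconds;

// Splits `total` between two timers. `firstShare` is the fraction (0..1)
// given to the first timer, rounded down to whole milliseconds; the second
// timer gets the remainder, so both parts always add up to `total`.
std::pair<Ms, Ms> SplitTimeout(Ms total, double firstShare) {
  assert(total.count() >= 0);
  assert(firstShare >= 0.0 && firstShare <= 1.0);
  Ms first{static_cast<Ms::rep>(static_cast<double>(total.count()) *
                                firstShare)};
  return {first, total - first};
}
```

For example, a hypothetical 20000 ms budget with a share of 0.25 gives 5000 ms to the first timer and 15000 ms to the second.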

Depends on: 1864641
Depends on: 1880427
Depends on: 1880438