Closed Bug 1524200 Opened 5 years ago Closed 5 years ago

Crash in AsyncShutdownTimeout | Places Clients shutdown | sanitize.js: Sanitize on shutdown

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla67

Tracking Flags:

Tracking

Status

firefox-esr60

---

wontfix

firefox65

---

wontfix

firefox66

---

wontfix

firefox67

---

wontfix

firefox68

wontfix

People

(Reporter: philipp, Assigned: baku)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(5 files)

Bug 1524200 - Diagnostic messages on Sanitizer.jsm, r?mak 5 years ago Andrea Marchesini [:baku] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1524200 - Optimize the comparison of principals in Sanitizer.jsm, r?johannh 5 years ago Andrea Marchesini [:baku] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1524200 - Introduce additional debugging messages to Sanitizer.jsm, r?mak 5 years ago Andrea Marchesini [:baku] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1524200 - Sanitizer.jsm should retrieve the list of nsIPrincipal objects with site data only when needed, r?johannh 5 years ago Andrea Marchesini [:baku] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1524200 - Add more debugging message to Sanitizer.jsm, r?mak 5 years ago Andrea Marchesini [:baku] 47 bytes, text/x-phabricator-request		Details \| Review

[:philipp]

Reporter

Description

•

5 years ago

•

Edited

This bug is for crash report bp-3344acb0-afe3-4040-b8ff-2d9eb0190131.

AsyncShutdownTimeout 	

{

    "phase":"Places Clients shutdown",
    "conditions":[
        {
            "name":"sanitize.js: Sanitize on shutdown",
            "state":{
                "progress":{
                    "isShutdown":true
                }
            },
            "filename":"resource:///modules/Sanitizer.jsm",
            "lineNumber":141,
            "stack":[
                "resource:///modules/Sanitizer.jsm:onStartup:141",
                "jar:file:///C:/Program%20Files/Mozilla%20Firefox/browser/omni.ja!/components/nsBrowserGlue.js:BG__onWindowsRestored:1557",
                "jar:file:///C:/Program%20Files/Mozilla%20Firefox/browser/omni.ja!/components/nsBrowserGlue.js:BG_observe:743",
                "resource:///modules/sessionstore/SessionStore.jsm:initializeWindow:1177",
                "resource:///modules/sessionstore/SessionStore.jsm:onBeforeBrowserWindowShown/<:1329"
            ]
        }
    ]

}

crash reports on shutdown in [@ AsyncShutdownTimeout | Places Clients shutdown | sanitize.js: Sanitize on shutdown] are spiking up in firefox 65 (in 64.0.2 they accounted for 0.9% of browser crashes, currently in 65.0 they are 4.9% of browser crashes).

a number of user comments are reporting they consistently get these crashes when they close the browser after the update to firefox 65.

we already had seen a spike of those reports during the 65.0a1 nightly cycle, originally caused by bug 1505071 - bug 1507171 has taken care of most of the regressing crashes but perhaps there are still some rarer cases that aren't covered yet and might lead to these AsyncShutdownTimeouts...

Johann Hofmann [:johannh]

Comment 1

•

5 years ago

Hm, I'm not very experienced with debugging AsyncShutdownTimeouts, mak, do you know how we can look into this?

Flags: needinfo?(mak77)

Ryan VanderMeulen [:RyanVM]

Updated

•

5 years ago

tracking-firefox65: --- → +

tracking-firefox66: --- → +

Ryan VanderMeulen [:RyanVM]

Comment 2

•

5 years ago

This the #1 overall top crash on 65 at the moment.

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Comment 3

•

5 years ago

Unfortunately a lot of the Async shutdown timeout metadata are missing important information, did we change something recently around this code?

This is what I'd expect to see:

{"phase":"Places Clients shutdown","conditions":[{"name":"sanitize.js: Sanitize on shutdown","state":{"progress":{"isShutdown":true,"cache":"cleared","cookies":"cleared","history":"blocking","formdata":"blocking","downloads":"blocking","sessions":"blocking"}},"filename":"resource:///modules/Sanitizer.jsm","lineNumber":141,"stack":["resource:///modules/Sanitizer.jsm:onStartup:141","jar:file:///C:/Program%20Files%20(x86)/Mozilla%20Firefox/browser/omni.ja!/components/nsBrowserGlue.js:BG__onWindowsRestored:1557","jar:file:///C:/Program%20Files%20(x86)/Mozilla%20Firefox/browser/omni.ja!/components/nsBrowserGlue.js:BG_observe:743","resource:///modules/sessionstore/SessionStore.jsm:initializeWindow:1177","resource:///modules/sessionstore/SessionStore.jsm:onBeforeBrowserWindowShown/<:1329"]}]}

As you can see it is clearly showing a few cleaners blocking us, it shows our progress.

This is what I see in most reports:

{"phase":"Places Clients shutdown","conditions":[{"name":"sanitize.js: Sanitize on shutdown","state":{"progress":{"isShutdown":true}},"filename":"resource:///modules/Sanitizer.jsm","lineNumber":141,"stack":["resource:///modules/Sanitizer.jsm:onStartup:141","jar:file:///C:/Program%20Files/Mozilla%20Firefox/browser/omni.ja!/components/nsBrowserGlue.js:BG__onWindowsRestored:1557","jar:file:///C:/Program%20Files/Mozilla%20Firefox/browser/omni.ja!/components/nsBrowserGlue.js:BG_observe:743","resource:///modules/sessionstore/SessionStore.jsm:initializeWindow:1177","resource:///modules/sessionstore/SessionStore.jsm:onBeforeBrowserWindowShown/<:1329"]}]}

And this is not useful, because it doesn't report any debugging info. There's no progress indication, like we stopped earlier.

The first thing would be to check if some recent changes have touched this code and eventually made the reported metadata less useful.

[:philipp]

Reporter

Comment 4

•

5 years ago

affected users are reporting that they can fix the problem by manually clearing "offline website data" once:
http://forums.mozillazine.org/viewtopic.php?p=14822808
https://support.mozilla.org/en-US/questions/1249036

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Comment 5

•

5 years ago

It could make sense, 65 made lots of changes to clearing site data, iirc.

Flags: needinfo?(mak77)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Comment 6

•

5 years ago

So, this code, and all the following code, seems to completely skip any async shtudown progress marking:
https://searchfox.org/mozilla-central/rev/152993fa346c8fd9296e4cd6622234a664f53341/browser/modules/Sanitizer.jsm#685
We should be investigating that code, and also add progress metadata to the progress object as this code runs

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Updated

•

5 years ago

Flags: needinfo?(amarchesini)

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Comment 7

•

5 years ago

If I should guess a bug, I'd probably say bug 1505071

Blocks: 1505071

Ryan VanderMeulen [:RyanVM]

Comment 8

•

5 years ago

FYI, I'm planning a 65.0.1 release likely early next week. I'd be happy to take a patch which improves these reports in it.

Pascal Chevrel:pascalc (PTO until April 26)

Updated

•

5 years ago

status-firefox67: ? → affected

status-firefox-esr60: unaffected → affected

tracking-firefox67: --- → +

Liz Henry (:lizzard) (relman/hg->git project)

Comment 9

•

5 years ago

Following up in email as the crash volume is high and I'd like to get a patch into 66.

Andrea Marchesini [:baku]

Assignee

Comment 10

•

5 years ago

Philipp and I have been discussed this bug a lot. Philipp is in contact with a volunteer who is able to reproduce this crash consistently. The volunteer (sorry, I don't know the name) shared their profile folder with us and I'm debugging the crash locally. Unfortunately I'm not able to reproduce it, yet.

I'm taking this bug.

Assignee: nobody → amarchesini

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Assignee

Comment 11

•

5 years ago

Attached file Bug 1524200 - Diagnostic messages on Sanitizer.jsm, r?mak — Details

Andrea Marchesini [:baku]

Assignee

Comment 12

•

5 years ago

I just submitted a diagnostic patch to have more log. Let's keep this bug open.

Keywords: leave-open

Pulsebot

Comment 13

•

5 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b1ac22673987
Diagnostic messages on Sanitizer.jsm, r=mak

Cosmin Sabou [:CosminS]

Comment 14

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b1ac22673987

Jeff Colbert

Comment 15

•

5 years ago

Looking at my crash log submissions, I see this has been happening since 11 Oct 2018. It happens every time I shut down.I have disabled all plugins and it still happens.Please email me if I can upload any files that might help you.

Andrea Marchesini [:baku]

Assignee

Comment 16

•

5 years ago

Attached file Bug 1524200 - Optimize the comparison of principals in Sanitizer.jsm, r?johannh — Details

Andrea Marchesini [:baku]

Assignee

Updated

•

5 years ago

Keywords: leave-open

Pulsebot

Comment 17

•

5 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/25397a6f8c4f
Optimize the comparison of principals in Sanitizer.jsm, r=johannh

Narcis Beleuzu [:NarcisB]

Comment 18

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/25397a6f8c4f

Status: NEW → RESOLVED

Closed: 5 years ago

status-firefox67: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla67

Calixte Denizet (:calixte)

Comment 19

•

5 years ago

There are 9 crashes in nightly 20190221215439 so after the patch landed.

Status: RESOLVED → REOPENED

status-firefox67: fixed → affected

Resolution: FIXED → ---

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Comment 20

•

5 years ago

All the crashes report this progress:
{"isShutdown":true,"advancement":"get-principals","step":"principals-quota-manager"}

[:philipp]

Reporter

Comment 21

•

5 years ago

unfortunately it looks like the volume of crash reports got worse as well...

Andrea Marchesini [:baku]

Assignee

Comment 22

•

5 years ago

Attached file Bug 1524200 - Introduce additional debugging messages to Sanitizer.jsm, r?mak — Details

Andrea Marchesini [:baku]

Assignee

Comment 23

•

5 years ago

I add some additional debugging messages. I suspect it's QM that doesn't resolve the callback.
With my previous patch getAllPrincipals() is always executed and that could be the reason of the increasing of crashes.
I'm going to submit a patch to optimize this as well.

unfortunately it looks like the volume of crash reports got worse as well...

Andrea Marchesini [:baku]

Assignee

Comment 24

•

5 years ago

Let's keep this bug open. I'm going to land some debugging code.

Keywords: leave-open

Andrea Marchesini [:baku]

Assignee

Comment 25

•

5 years ago

Attached file Bug 1524200 - Sanitizer.jsm should retrieve the list of nsIPrincipal objects with site data only when needed, r?johannh — Details

Depends on D20976

Pulsebot

Comment 26

•

5 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/81cc693e6417
Introduce additional debugging messages to Sanitizer.jsm, r=mak

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 27

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/81cc693e6417

Johann Hofmann [:johannh]

Updated

•

5 years ago

Priority: -- → P1

Pulsebot

Comment 28

•

5 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b7ea35510cf1
Sanitizer.jsm should retrieve the list of nsIPrincipal objects with site data only when needed, r=johannh

Natalia Csoregi [:nataliaCs]

Comment 29

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b7ea35510cf1

Andrea Marchesini [:baku]

Assignee

Comment 30

•

5 years ago

I can confirm that the problem is QM not executing the getUsage() callback during the shutdown:
https://searchfox.org/mozilla-central/rev/dbddac86aadf1d4871fb350bbe66db43728a9f81/browser/modules/Sanitizer.jsm#752-753

Depends on: 1529301

Ryan VanderMeulen [:RyanVM]

Comment 31

•

5 years ago

Too late to fix this in 65 at this point with 66 shipping in less than 3 weeks.

status-firefox65: affected → wontfix

status-firefox-esr60: affected → unaffected

Tim Spurway [:tspurway]

Comment 32

•

5 years ago

:baku - any progress on this one?

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Assignee

Comment 33

•

5 years ago

Bug 1529301 is fixing this issue.

Flags: needinfo?(amarchesini)

Liz Henry (:lizzard) (relman/hg->git project)

Comment 34

•

5 years ago

I don't think that will land in time for the 66 release but I'll keep tracking this in case it is or in case it ends up being upliftable for a dot release.

Andrew Overholt [:overholt]

Comment 35

•

5 years ago

baku/Tom, bug 1529301 is fixed but we're still seeing crashes in recent nightly builds. Can you take another look?

Flags: needinfo?(shes050117)

Flags: needinfo?(amarchesini)

Tom Tung [:tt, :ttung]

Comment 36

•

5 years ago

Looking into the tens recent reports, the steps in the metadata of them are:

principals-quota-manager
promises:162
promises:1
principals-quota-manager
sanitized
sanitized
promises:12
sanitized
sanitized
promises:1

Looks like only two of them are crashing at getting usage/origin. Perhaps, we could get the duration of each step to know which one consumes more than expected?

Andrea Marchesini [:baku]

Assignee

Comment 37

•

5 years ago

It seems that we are crashing here:
https://searchfox.org/mozilla-central/rev/ddd1679c0534f7ddf36cafddd17b710c4fefe3c4/browser/modules/Sanitizer.jsm#905

I'm saying this because the 'promises:X' state is set after the populating of the promises list here:
https://searchfox.org/mozilla-central/rev/ddd1679c0534f7ddf36cafddd17b710c4fefe3c4/browser/modules/Sanitizer.jsm#851

And at that point the cleaning has already started, but not completed yet. The same happens for 'sanitized' step, which comes from:
https://searchfox.org/mozilla-central/rev/ddd1679c0534f7ddf36cafddd17b710c4fefe3c4/browser/modules/Sanitizer.jsm#910

in scenarios where we have probably more promises to resolve (more cleaning to be done).

When we sanitize site data, we use these flags (expanding CLEAR_DOM_STORAGES):
CLEAR_APPCACHE | CLEAR_DOM_QUOTA | CLEAR_DOM_PUSH_NOTIFICATIONS | CLEAR_REPORTS | CLEAR_COOKIES

Maybe one of these cleaning takes too much time.

To be sure about this, I would like to land a patch that sets an extra value in 'progress' data.

Flags: needinfo?(amarchesini)

Andrea Marchesini [:baku]

Assignee

Comment 38

•

5 years ago

Attached file Bug 1524200 - Add more debugging message to Sanitizer.jsm, r?mak — Details

Tom Tung [:tt, :ttung]

Updated

•

5 years ago

Flags: needinfo?(shes050117)

Pulsebot

Comment 39

•

5 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a1d82278bf83
Add more debugging message to Sanitizer.jsm, r=mak

Cosmin Sabou [:CosminS]

Comment 40

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/a1d82278bf83

Pascal Chevrel:pascalc (PTO until April 26)

Updated

•

5 years ago

status-firefox68: --- → affected

status-firefox-esr60: unaffected → affected

Andrea Marchesini [:baku]

Assignee

Comment 41

•

5 years ago

I spent quite some time reading the crash reports with the latest debugging messages.
I can confirm my assumption: 90% of the crashes happens because the cleanup takes too long.

We can try to optimize it more, but:

we have crash-reports coming from profiles with more than 1000 origins to clean up. Many of them are between 70 to 100 origins which are still a lot for slow machines.
The cleanup involves a lot of I/O in a moment, the shutdown, where many other components are performing actions.

Probably, it's time to have a different approach. I suggest to introduce a cleanup dialog, to be shown after the closing of firefox. In this dialog we can have a progress bar, and maybe a list of operations that the browser is performing (cleaning X, cleaning Y,...)

This dialog would be shown only when needed: to the users who want to delete data on shutdown, or/and to the users who have set a custom session-only cookie permission for some origins. This dialog could be also used by other features such as 'forget about site' and 'clear data history'.

I have already discussed this proposal with johannh. I will organize a meeting to discuss it with more people.

Liz Henry (:lizzard) (relman/hg->git project)

Comment 42

•

5 years ago

I don't think we'll be able to do anything for 66, but I like the cleanup dialog idea!

status-firefox66: affected → wontfix

Neha Kochar [:neha]

Comment 43

•

5 years ago

[Tracking Requested - why for this release]:
we should probably track the improvement of cleanup dialog for 68.

Status: REOPENED → RESOLVED

Closed: 5 years ago → 5 years ago

status-firefox67: affected → wontfix

tracking-firefox65: + → ---

tracking-firefox66: + → ---

tracking-firefox67: + → ---

tracking-firefox68: --- → ?

Resolution: --- → FIXED

BugBot [:suhaib / :marco/ :calixte]

Updated

•

5 years ago

Keywords: leave-open

Julien Cristau [:jcristau]

Comment 44

•

5 years ago

(In reply to Neha Kochar [:neha] from comment #43)

[Tracking Requested - why for this release]:
we should probably track the improvement of cleanup dialog for 68.

That'd presumably be in a different bug.

tracking-firefox68: ? → -

Johann Hofmann [:johannh]

Comment 45

•

5 years ago

Also, to be clear, we do not have the resources to come up with a cleanup dialog based solution in 68. We'll have a roadmap and resourcing discussion for data sanitization during the Whistler All-Hands. If you're interested in that I'm happy to invite you, Neha. :)

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

5 years ago

status-firefox-esr60: affected → wontfix

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

5 years ago

status-firefox68: affected → wontfix

Ed Lee :Mardak

Updated

•

5 years ago

Regressions: 1554167

[:philipp]

Reporter

Updated

•

5 years ago

Comment 46

•

5 years ago

•

Edited

I'm looking at the idea of the cleanup dialog (not as an implementer, but as the most likely reviewer), but it sounds like punching holes in a safety net, so I'm not a huge fan.

You need to log in before you can comment on or make changes to this bug.