Crash in AsyncShutdownTimeout | Places Clients shutdown | sanitize.js: Sanitize on shutdown

RESOLVED FIXED in Firefox 65

Status

()

defect
--
critical
RESOLVED FIXED
7 months ago
2 months ago

People

(Reporter: philipp, Assigned: baku)

Tracking

({crash, regression})

unspecified
mozilla65
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox63 unaffected, firefox64 unaffected, firefox65+ fixed)

Details

(crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

Reporter

Description

7 months ago
This bug was filed from the Socorro interface and is
report bp-1a882122-0051-406f-b668-fc6490181114.
=============================================================

AsyncShutdownTimeout 	

{"phase":"Places Clients shutdown","conditions":[{"name":"sanitize.js: Sanitize on shutdown","state":{"progress":{"isShutdown":true}},"filename":"resource:///modules/Sanitizer.jsm","lineNumber":141,"stack":["resource:///modules/Sanitizer.jsm:onStartup:141"]}]}

=============================================================

this shutdown signature is regressing in volume in nightly 65.0a1 build 20181113220101 where bug 1505071 was landing.

in my daily profile i can also reproduce this persistently - the firefox process is continuing to run after it's closed in the UI until the artificial crash is triggered. please let me know if there are any particular troubleshooting steps you want me to take in order to debug the issue.
Flags: needinfo?(amarchesini)
Assignee

Comment 1

7 months ago
Do you have an STR? Thanks!
Flags: needinfo?(amarchesini) → needinfo?(madperson)
Reporter

Comment 2

7 months ago
(provided some reproducible case via DM on slack)
Flags: needinfo?(madperson)

Comment 3

7 months ago
Hi,

Based on the description only Nightly is marked as affected, but opening the crash report there seems to be a long line of versions on which the crash occurs. Should we mark all versions as affected or Nightly is enough?
Flags: needinfo?(madperson)
Assignee

Comment 4

7 months ago
Posted patch cleanup.patch (obsolete) — Splinter Review
Assignee: nobody → amarchesini
Attachment #9025319 - Flags: review?(jhofmann)
Reporter

Comment 5

7 months ago
hi laszlo, i filed this bug specifically for the regression in 65.0a1 build 20181113220101 where the number of reports went from 1-2 per build to 60-80 cases. 
yes, the signature was already present for a while and might be triggered by different underlying issues. not sure if the fix here manages to get rid of all of them - if not we should probably open a new report to track the remaining ones...
Flags: needinfo?(madperson)
Comment on attachment 9025319 [details] [diff] [review]
cleanup.patch

Review of attachment 9025319 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/modules/Sanitizer.jsm
@@ +716,5 @@
>  
> +async function hasRootDomain(matchHost, host) {
> +  try {
> +    return Services.eTLD.hasRootDomain(matchUriHost, host);
> +  } catch (ex) {

Can you explain what exceptions are expected to be thrown and under what circumstances?  This method looks like it should only ever throw if the XPCOM calling convention is violated with aResult being passed a nullptr.  Which leaves the service no longer being present.  If the latter is the problem, that seems like a serious invariant problem if your logic depends on a thing that frequently doesn't exist...
Comment on attachment 9025319 [details] [diff] [review]
cleanup.patch

Review of attachment 9025319 [details] [diff] [review]:
-----------------------------------------------------------------

Hmmm, it would be great to have additional details. We use Services.eTLD.hasRootDomain in lots of places, e.g. ClearDataService itself, without catching exceptions.

Do we know enough about this crash to add a comment and a commit note about the kind of error we're catching here?
Attachment #9025319 - Flags: review?(jhofmann)
Assignee

Comment 8

7 months ago
Sorry, that was the wrong fix.
Attachment #9025319 - Attachment is obsolete: true
Attachment #9025582 - Flags: review?(jhofmann)
Tracking since this is a new high volume spike in Nightly.
Comment on attachment 9025582 [details] [diff] [review]
cleanup2.patch

Review of attachment 9025582 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, so it was about: pages. That makes sense. Thanks!

::: browser/modules/Sanitizer.jsm
@@ +698,5 @@
> +
> +    // We consider just permissions set for http, https and file URLs.
> +    if (permission.principal.URI.scheme != "http" &&
> +        permission.principal.URI.scheme != "https" &&
> +        permission.principal.URI.scheme != "file") {

nit: Consider adding a helper function for checking this (isSupportedURI?)
Attachment #9025582 - Flags: review?(jhofmann) → review+

Comment 11

7 months ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b2da7e798ea
Cleanup data on shutdown just for http, https and file URLs, r=johannh
:baku, can you elaborate on the problem and solution here?  I'm concerned that:

- This may just be working around a bug in QuotaManager or something other subsystem.  In my profile, under about:newtab's QM storage, I only see a ~100k "snippets_db" database.  This is not very large and based on my understanding, the snippets DB is about the blurbs shown, so the database size should be constant across all users.  So I would not expect that it's just the extra 100k of deletion causing problems but some type of deadlock occurring.

- At a meta level, I'm a bit worried about opting about:newtab and all other about pages out of having their storage cleared.  As noted above, I think the "snippets" DB is harmless at the current time, but it seems like asking for a privacy fire-drill later on because there were assumptions about data clearing.  How about adding an additional flag to nsIAboutModule[1] along the lines of RETAIN_DATA_DURING_SESSION_CLEAR and adding it to ACTIVITY_STREAM_FLAGS[2] if there's a specific desire to not clear this data and the Activity Stream team has indicated they 1) will never store user data in their origin storage and 2) have unit tests to this effect that will fail if they add another IndexedDB database.  (Honestly, the about flag still seems like a foot-gun, but at least making sure it's aimed at only one foot seems preferable to all "about" feet.)

1: https://searchfox.org/mozilla-central/source/netwerk/protocol/about/nsIAboutModule.idl
2: https://searchfox.org/mozilla-central/rev/b03a62c3c82316e733a3b09622c1cb7e59f64cc3/browser/components/about/AboutRedirector.cpp#27
Flags: needinfo?(amarchesini)
Assignee

Comment 13

7 months ago
> - This may just be working around a bug in QuotaManager or something other

The problem here was a combination of factors:
1. ClearDataService.js was using permission.principal.URI.host for any 'cookie' permission's principal and nsSimpleURL::GetHost() throws an exception. If the user sets a permission for an about: page, ClearDataService.js doesn't work properly.

2. really do we allow to set cookie permission for about: pages? It seems we do...

3. ClearDataService deletes data for sites. We should keep about: pages our of the games.

> - At a meta level, I'm a bit worried about opting about:newtab and all other
> about pages out of having their storage cleared. [...]

This seems a good idea. We should do this in a follow up. So far, I think we should not clear data for about: pages.
Flags: needinfo?(amarchesini)

Comment 14

7 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/3b2da7e798ea
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65

Comment 15

7 months ago
Is/was this related to the million "shutdown kill" crashes I've been seeing?
Reporter

Updated

5 months ago
See Also: → 1524200
You need to log in before you can comment on or make changes to this bug.