Closed Bug 1507171 Opened 5 years ago Closed 5 years ago

Crash in AsyncShutdownTimeout | Places Clients shutdown | sanitize.js: Sanitize on shutdown

Categories

(Toolkit :: Data Sanitization, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 + fixed

People

(Reporter: philipp, Assigned: baku)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is
report bp-1a882122-0051-406f-b668-fc6490181114.
=============================================================

AsyncShutdownTimeout 	

{"phase":"Places Clients shutdown","conditions":[{"name":"sanitize.js: Sanitize on shutdown","state":{"progress":{"isShutdown":true}},"filename":"resource:///modules/Sanitizer.jsm","lineNumber":141,"stack":["resource:///modules/Sanitizer.jsm:onStartup:141"]}]}

=============================================================

this shutdown signature is regressing in volume in nightly 65.0a1 build 20181113220101 where bug 1505071 was landing.

in my daily profile i can also reproduce this persistently - the firefox process is continuing to run after it's closed in the UI until the artificial crash is triggered. please let me know if there are any particular troubleshooting steps you want me to take in order to debug the issue.
Flags: needinfo?(amarchesini)
Do you have an STR? Thanks!
Flags: needinfo?(amarchesini) → needinfo?(madperson)
(provided some reproducible case via DM on slack)
Flags: needinfo?(madperson)
Hi,

Based on the description only Nightly is marked as affected, but opening the crash report there seems to be a long line of versions on which the crash occurs. Should we mark all versions as affected or Nightly is enough?
Flags: needinfo?(madperson)
Attached patch cleanup.patch (obsolete) — Splinter Review
Assignee: nobody → amarchesini
Attachment #9025319 - Flags: review?(jhofmann)
hi laszlo, i filed this bug specifically for the regression in 65.0a1 build 20181113220101 where the number of reports went from 1-2 per build to 60-80 cases. 
yes, the signature was already present for a while and might be triggered by different underlying issues. not sure if the fix here manages to get rid of all of them - if not we should probably open a new report to track the remaining ones...
Flags: needinfo?(madperson)
Comment on attachment 9025319 [details] [diff] [review]
cleanup.patch

Review of attachment 9025319 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/modules/Sanitizer.jsm
@@ +716,5 @@
>  
> +async function hasRootDomain(matchHost, host) {
> +  try {
> +    return Services.eTLD.hasRootDomain(matchUriHost, host);
> +  } catch (ex) {

Can you explain what exceptions are expected to be thrown and under what circumstances?  This method looks like it should only ever throw if the XPCOM calling convention is violated with aResult being passed a nullptr.  Which leaves the service no longer being present.  If the latter is the problem, that seems like a serious invariant problem if your logic depends on a thing that frequently doesn't exist...
Comment on attachment 9025319 [details] [diff] [review]
cleanup.patch

Review of attachment 9025319 [details] [diff] [review]:
-----------------------------------------------------------------

Hmmm, it would be great to have additional details. We use Services.eTLD.hasRootDomain in lots of places, e.g. ClearDataService itself, without catching exceptions.

Do we know enough about this crash to add a comment and a commit note about the kind of error we're catching here?
Attachment #9025319 - Flags: review?(jhofmann)
Attached patch cleanup2.patchSplinter Review
Sorry, that was the wrong fix.
Attachment #9025319 - Attachment is obsolete: true
Attachment #9025582 - Flags: review?(jhofmann)
Tracking since this is a new high volume spike in Nightly.
Comment on attachment 9025582 [details] [diff] [review]
cleanup2.patch

Review of attachment 9025582 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, so it was about: pages. That makes sense. Thanks!

::: browser/modules/Sanitizer.jsm
@@ +698,5 @@
> +
> +    // We consider just permissions set for http, https and file URLs.
> +    if (permission.principal.URI.scheme != "http" &&
> +        permission.principal.URI.scheme != "https" &&
> +        permission.principal.URI.scheme != "file") {

nit: Consider adding a helper function for checking this (isSupportedURI?)
Attachment #9025582 - Flags: review?(jhofmann) → review+
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b2da7e798ea
Cleanup data on shutdown just for http, https and file URLs, r=johannh
:baku, can you elaborate on the problem and solution here?  I'm concerned that:

- This may just be working around a bug in QuotaManager or something other subsystem.  In my profile, under about:newtab's QM storage, I only see a ~100k "snippets_db" database.  This is not very large and based on my understanding, the snippets DB is about the blurbs shown, so the database size should be constant across all users.  So I would not expect that it's just the extra 100k of deletion causing problems but some type of deadlock occurring.

- At a meta level, I'm a bit worried about opting about:newtab and all other about pages out of having their storage cleared.  As noted above, I think the "snippets" DB is harmless at the current time, but it seems like asking for a privacy fire-drill later on because there were assumptions about data clearing.  How about adding an additional flag to nsIAboutModule[1] along the lines of RETAIN_DATA_DURING_SESSION_CLEAR and adding it to ACTIVITY_STREAM_FLAGS[2] if there's a specific desire to not clear this data and the Activity Stream team has indicated they 1) will never store user data in their origin storage and 2) have unit tests to this effect that will fail if they add another IndexedDB database.  (Honestly, the about flag still seems like a foot-gun, but at least making sure it's aimed at only one foot seems preferable to all "about" feet.)

1: https://searchfox.org/mozilla-central/source/netwerk/protocol/about/nsIAboutModule.idl
2: https://searchfox.org/mozilla-central/rev/b03a62c3c82316e733a3b09622c1cb7e59f64cc3/browser/components/about/AboutRedirector.cpp#27
Flags: needinfo?(amarchesini)
> - This may just be working around a bug in QuotaManager or something other

The problem here was a combination of factors:
1. ClearDataService.js was using permission.principal.URI.host for any 'cookie' permission's principal and nsSimpleURL::GetHost() throws an exception. If the user sets a permission for an about: page, ClearDataService.js doesn't work properly.

2. really do we allow to set cookie permission for about: pages? It seems we do...

3. ClearDataService deletes data for sites. We should keep about: pages our of the games.

> - At a meta level, I'm a bit worried about opting about:newtab and all other
> about pages out of having their storage cleared. [...]

This seems a good idea. We should do this in a follow up. So far, I think we should not clear data for about: pages.
Flags: needinfo?(amarchesini)
https://hg.mozilla.org/mozilla-central/rev/3b2da7e798ea
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Is/was this related to the million "shutdown kill" crashes I've been seeing?
See Also: → 1524200
You need to log in before you can comment on or make changes to this bug.