Closed Bug 1578273 Opened 5 years ago Closed 1 year ago

Firefox process staying alive after exit

Categories

(Toolkit :: Data Sanitization, defect, P2)

defect

Tracking

()

VERIFIED FIXED
113 Branch
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- wontfix
firefox-esr102 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix
firefox108 --- wontfix
firefox109 --- wontfix
firefox110 --- wontfix
firefox111 --- wontfix
firefox112 --- wontfix
firefox113 --- fixed

People

(Reporter: paulburke, Assigned: h.sofie.p)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(2 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0

Steps to reproduce:

Hi,
On closing Firefox either using the 'X' or the 'File' and 'Exit', the Firefox processes stay alive.
I have 3 pc's runing Firefox and the issue is on all 3.
This is not unique to v69, it has been happening for quite some time (probably around a year or more!).
The only way to stop the program is to close the processes in Task Manager.
About 30 seconds after exiting the browser if the processes are not closed a Firefox Crashed Message Box appears!

Actual results:

On closing Firefox either using the 'X' or the 'File' and 'Exit', the Firefox processes stay alive.
This is not unique to v69, it has been happening for quite some time (probably around a year or more!).
The only way to stop the program is to close the processes in Task Manager.
About 30 seconds after exiting the browser if the processes are not closed a Firefox Crashed Message Box appears!

Expected results:

The program should exit and the processes stopped.

hi, please submit one of those crash reports from the box that opens up 60 seconds or so after you close the browser. after a restart retrieve the crash id by entering about:crashes into the firefox address bar and pasting it here into the bug.

Last crash report.

I hope that is what you want.

There are these going back 6 months.

thank you, so it's basically the same issue that was tried to be diagnosed/debugged in bug 1524200.

{
    "phase":"Places Clients shutdown",
    "conditions":[
        {
            "name":"sanitize.js: Sanitize on shutdown",
            "state":{
                "progress":{
                    "isShutdown":true,
                    "advancement":"session-permission",
                    "step":"promises:1",
                    "sanitizePrincipal":"started"
                }
            },
            "filename":"resource:///modules/Sanitizer.jsm",
            "lineNumber":169,
            "stack":[...]
        }
    ]
}

you've set firefox to clear certain data on exit which is taking too long until in the end an artificial crash is triggered so that the process is really shutting down.

baku, is there any more debugging or information the affected user could provide in order for us to better understand and fix the problem?

Status: UNCONFIRMED → NEW
Crash Signature: [@ AsyncShutdownTimeout | Places Clients shutdown | sanitize.js: Sanitize on shutdown ]
Component: Untriaged → Data Sanitization
Ever confirmed: true
Flags: needinfo?(amarchesini)
Keywords: crash
Product: Firefox → Toolkit
See Also: → 1524200
Version: 69 Branch → unspecified

It seems that Services.clearData.deleteDataFromPrincipal is taking too much time. Wondering if it fails, somehow and it doesn't call 'resolve'.
Is it possible to have a dump of the browser console when it crashes? Maybe you want to enable the console to stdout.

Flags: needinfo?(amarchesini) → needinfo?(paulburke)

I was not aware that I am deleting anything on exit, except I have an add on 'Expire History By Days' enabled. However with that disabled the crash still occurs.
Could you please tell me how to get the dump of the browser console when it crashes please, and enable console to stdout?

Flags: needinfo?(paulburke)

Thanks for your help. I would do these steps:

  1. take a copy of your profile directory.
  2. run firefox by command line in this way: firefox -no-remote -profile <the copy of your profile>. For instance, if you have copied your profile in /tmp/profile, run: firefox -no-remote -profile /tmp/profile
  3. in about:config set browser.sanitizer.loglevel pref to "All". You should have "Warn".
  4. in about:config set devtools.console.stdout.chrome to true
  5. do the same for devtols.console.stdout.content (to true)

Copy everything that happens after the first line containing "Sanitizer.jsm". For instance, in my tests the first line is:
console.log: ** Sanitizer.jsm: "Sanitizing on shutdown"

Copy all, and send it to me: amarchesini@mozilla.com. Email is better because we don't want you to share on bugzilla your visited websites.
Thank you! If you need any additional information, write me an email.

Flags: needinfo?(paulburke)

Hello,

Hopefully you will have received my email.

Flags: needinfo?(paulburke)
Priority: -- → P2
Severity: normal → S3
Priority: P2 → P3

The bug is linked to a topcrash signature, which matches the following criteria:

  • Top 20 desktop browser crashes on release
  • Top 5 desktop browser crashes on Mac on beta

:hpeuckmann, could you consider increasing the severity of this top-crash bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(hpeuckmann)
Keywords: topcrash

Looking at a few of the crash reports, the last sanitization progress step is 'principals-quota-manager', so Services.qms.listOrigins() could be causing this.

I noticed that we can reduce our calls to getAllPrincipals() in the sanitizer, we should try to only call this if sanitizeOnShutdown, clearOnShutdown.cookies and clearOnShutdown.offlineApps are true or the user has session permissions exceptions.

  • If sanitizeOnShutdown is true, we can check if offlineApps or cookies are to be cleaned, otherwise we can skip the call to getAllPrincipals() because those are the only categories we clear per principal.

  • If sanitizeOnShutdown is true I think we can also skip this 'session-permissions' block that is calling getAllPrincipals() again, from my understanding the session permission origins should already be cleared if sanitizeOnshutdown is true.

Paul, any thoughts on this?

Severity: S3 → S2
Flags: needinfo?(hpeuckmann) → needinfo?(pbz)
Priority: P3 → P2

We discussed this in the meeting. Hannah wants to look into optimizing this code to see if we can get the crash rates down. Mid to long term we should also address this issue by showing a progress dialog for clearing on shutdown and then not timing out because the shutdown takes longer.
However, it's also worth noting that the clients affected do not all have sanitize-on-shutdown enabled. In this case optimizing the principal collection or skipping it for the common case could help. Thanks Hannah!

Flags: needinfo?(pbz)
Assignee: nobody → hpeuckmann
Attachment #9304713 - Attachment description: WIP: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz! → Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz!

Note we can't distinguish if something is taking too long, or if just the promise is not resolved. When the whole stack is made up of async functions, it's more likely stuff taking too long, but as soon as there's some callback in the middle, it's very easy to not invoke a callback if some unexpected exception happen or, even worst, when the callback is only invoked in case of success.
For example we had a lot of cases blocking in formData, so I switched it from callbacks to async functions, and the number of cases (afaict) reduced. Rejecting is better than not resolving for async shutdown.

So apart from optimizing, I suggest to also check in case of exceptions or errors that your callback is being invoked, so the promise is either resolved or rejected in any case.

Pushed by hpeuckmann@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/733f6ba8c1c6
Reduce calls to the PrincipalsCollector. r=pbz
Regressions: 1806620
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch
See Also: → 1816279
Regressions: 1816279
See Also: 1816279

Backed out 2 changesets (Bug 1578273, Bug 1806620) as req for causing Bug 1816279.
Backout link

Status: RESOLVED → REOPENED
Flags: needinfo?(hpeuckmann)
Resolution: FIXED → ---
Target Milestone: 110 Branch → ---

I will fix it and land it again. We are missing an automated test for cookie clearing in a time range I will add one in this patch to prevent regressions like Bug 1816279 from happening again in the future.

Flags: needinfo?(hpeuckmann)
Attachment #9304713 - Attachment description: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz! → WIP: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz!

Depends on D162721

Attachment #9304713 - Attachment description: WIP: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz! → Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz!
Attachment #9318789 - Attachment description: WIP: Bug 1578273 - Add test cases for cookie clearing in time range. r=pbz! → Bug 1578273 - Add test cases for cookie clearing in time range. r=pbz!

Comment on attachment 9318789 [details]
Bug 1578273 - Add test cases for cookie clearing in time range. r=pbz!

Revision D170377 was moved to bug 1819529. Setting attachment 9318789 [details] to obsolete.

Attachment #9318789 - Attachment is obsolete: true
Attachment #9304713 - Attachment description: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz! → WIP: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz!
Attachment #9304713 - Attachment description: WIP: Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz! → Bug 1578273 - Reduce calls to the PrincipalsCollector. r=pbz!
Pushed by hpeuckmann@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6b9b6b406d7d
Reduce calls to the PrincipalsCollector. r=pbz
Status: REOPENED → ASSIGNED
Target Milestone: 110 Branch → ---
Status: ASSIGNED → RESOLVED
Closed: 1 year ago1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 113 Branch

The patch landed in nightly and beta is affected.
:hpeuckmann, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.Also, don't forget to request an uplift for the patches in the regressions caused by this fix.
  • If no, please set status-firefox112 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(hpeuckmann)
Flags: needinfo?(hpeuckmann)
Flags: qe-verify+

I am not able to reproduce this issue neither with the version mention in the initial report (Fx69), nor with any more recent ones on Windows 10. Paul, can you confirm if time permits, that this issue is fixed on the latest beta build on your side? Thank you!

Flags: needinfo?(paulburke)

Yes, all seems go now.

Status: RESOLVED → VERIFIED
Flags: needinfo?(paulburke)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: