Closed Bug 862178 Opened 11 years ago Closed 7 years ago

CrashSubmit.jsm / KeyValueParser.jsm do a lot of main thread file I/O

Categories

(Toolkit :: Crash Reporting, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla56
Iteration:
56.1 - Jun 26
Performance Impact medium
Tracking Status
firefox56 --- fixed

People

(Reporter: Gavin, Assigned: perry, Mentored)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: main-thread-io, Whiteboard: [reserve-photon-performance][fce-active-legacy])

Attachments

(1 file)

Most of the code in these modules involves doing file I/O on the main thread. We should fix that where possible, and use OS.File.
Most of this should be pretty trivial to change. The only thing that might suck is the pendingIDs call is currently synchronous, and we'd want to change that to async. Looks like the only caller is B2G though, so might not be a big deal to change it:
http://mxr.mozilla.org/mozilla-central/source/b2g/chrome/content/shell.js#117
This contributes 89ms of jank on this startup profile captured on the quantum reference hardware: https://perfht.ml/2pSGtak

There are several .exists() calls inside http://searchfox.org/mozilla-central/rev/ae8c2e2354db652950fe0ec16983360c21857f2a/toolkit/crashreporter/CrashSubmit.jsm#381 and it calls http://searchfox.org/mozilla-central/rev/ae8c2e2354db652950fe0ec16983360c21857f2a/toolkit/crashreporter/KeyValueParser.jsm#39 This should use OS.File instead.
Whiteboard: [photon-performance][qf]
Flags: qe-verify?
Priority: -- → P2
The fact that submitReports shows up in a startup profile is more concerning to me! We shouldn't be submitting crash reports during startup. I assume this is from `checkForUnsubmittedCrashReports`, which runs when the `browser-delayed-startup-finished` notification is fired:
https://dxr.mozilla.org/mozilla-central/rev/2e7c10a9b86e30691f67855f6c8f98d984508d7c/browser/modules/ContentCrashHandlers.jsm#624

Is that still too early, or is something else happening here? We can also fix the code in CrashSubmit.jsm/KeyValuePairs.jsm, but we really shouldn't be exercising any of that during startup.

Also, since comment 0 / comment 1 were written, CrashSubmit.jsm did get a `pendingIDsAsync` method:
http://searchfox.org/mozilla-central/rev/ae8c2e2354db652950fe0ec16983360c21857f2a/toolkit/crashreporter/CrashSubmit.jsm#516

and I don't think anything is using the old pendingIDs method:
https://dxr.mozilla.org/mozilla-central/search?tree=mozilla-central&q=pendingIDs

...so we could just remove that.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> The fact that submitReports shows up in a startup profile is more concerning
> to me!

I typed comment 2 too quickly and wasn't clear enough, sorry. My link points to a startup profile, but the submitReports part is running after we are done with all the CPU intensive parts of startup (bug 1358583 is causing the misleading red bar at the top). Observing browser-delayed-startup-finished is reasonable, but this code may not be the only one observing it, so using requestIdleCallback would be nice.

> Is that still too early, or is something else happening here? We can also
> fix the code in CrashSubmit.jsm/KeyValuePairs.jsm, but we really shouldn't
> be exercising any of that during startup.

The main thread IO really needs fixing, yes.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> The fact that submitReports shows up in a startup profile is more concerning
> to me! We shouldn't be submitting crash reports during startup. I assume
> this is from `checkForUnsubmittedCrashReports`, which runs when the
> `browser-delayed-startup-finished` notification is fired:
> https://dxr.mozilla.org/mozilla-central/rev/
> 2e7c10a9b86e30691f67855f6c8f98d984508d7c/browser/modules/
> ContentCrashHandlers.jsm#624

Note that we'll only hit this code in the start-up path (or soon after) if we have browser.crashReports.unsubmittedCheck.enabled is true which is only true for builds up to early beta, AND browser.crashReports.unsubmittedCheck.autoSubmit set to true, which users have to opt-in to.

We can still hit these main-thread disk IO paths if the user chooses to submit a crash report from about:tabcrashed, I think.

Especially because this main thread IO is likely to occur _after_ a tab just crashed, I think we're compounding bad things, and I think we should fix this.
Whiteboard: [photon-performance][qf] → [photon-performance][qf:p1]
For additional related jank, it turns out that the async XHR used for the upload also does main thread IO (see bug 1362388).
Priority: P2 → P3
Whiteboard: [photon-performance][qf:p1] → [reserve-photon-performance][qf:p1]
Whiteboard: [reserve-photon-performance][qf:p1] → [reserve-photon-performance][qf:p2]
Temporarily assigning this to Felipe who intends to mentor this, as the first bug of an intern starting soon.
Assignee: nobody → felipc
Mentor: felipc
No longer blocks: qf-bugs-upforgrabs
Flags: qe-verify? → qe-verify-
Assignee: felipc → jiangperry
Status: NEW → ASSIGNED
Priority: P3 → P1
Comment on attachment 8878697 [details]
Bug 862178 - Remove main thread I/O from CrashSubmit.jsm and KeyValuParser.jsm, replacing with OS.File.

https://reviewboard.mozilla.org/r/150010/#review155002

This is good overall; I'm giving it r- just because of a few issues that need to be addressed before we can land it:
* Some tests are failing with the patch applied
  * https://treeherder.mozilla.org/#/jobs?repo=try&revision=0accc16a59b7862521178ae30ad0ca7baaad8498&group_state=expanded&selectedJob=108098311
  * https://treeherder.mozilla.org/#/jobs?repo=try&revision=0accc16a59b7862521178ae30ad0ca7baaad8498&group_state=expanded&selectedJob=108098330
  * https://treeherder.mozilla.org/#/jobs?repo=try&revision=0accc16a59b7862521178ae30ad0ca7baaad8498&group_state=expanded&selectedJob=108098331
* This doesn't apply cleanly on top of mozilla-central, you'll need to rebase it. By doing so you will also pick up a change that will make a test less prone to intermittent failures
Attachment #8878697 - Flags: review?(gsvelto) → review-
Whiteboard: [reserve-photon-performance][qf:p2] → [reserve-photon-performance][qf:p2][fce-active]
(In reply to Gabriele Svelto [:gsvelto] from comment #9)

> * Some tests are failing with the patch applied
>   *
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=0accc16a59b7862521178ae30ad0ca7baaad8498&group_state=e
> xpanded&selectedJob=108098311
>   *
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=0accc16a59b7862521178ae30ad0ca7baaad8498&group_state=e
> xpanded&selectedJob=108098330
>   *
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=0accc16a59b7862521178ae30ad0ca7baaad8498&group_state=e
> xpanded&selectedJob=108098331

Those tests should be passing now. One additional issue is that the INI parser is synchronous; there doesn't seem to be an off-main-thread/async version of the parser, so I left that as is.
Comment on attachment 8878697 [details]
Bug 862178 - Remove main thread I/O from CrashSubmit.jsm and KeyValuParser.jsm, replacing with OS.File.

https://reviewboard.mozilla.org/r/150010/#review156710

Everything seems in order now, thank you! Please file a bug for adding an asynchronous version of the nsIINIParser so that we don't forget we can make that asynchronous too.
Attachment #8878697 - Flags: review?(gsvelto) → review+
For QA, steps to manually test:

1) Go to the [user application data]/Crash Reports/pending and [user application data]/Crash Reports/submitted directories. In the browser console, `OS.Constants.Path.userApplicationDataDir` should resolve to the user application data directory. Note if there’s anything there and their filenames (new files should appear after a crash in “pending” and in “submitted” after a pending crash report sends).

2) The “Crash Me Now!” add-on (https://addons.mozilla.org/en-US/firefox/addon/crash-me-now-simple/) can be used to crash the browser or tab. Crash either the browser (“Crash me!”) or tab (“Crash content process!”). “Crash content process!” might not appear in the Add-on menu but should be available in the Customize menu.

3) If the browser has crashed, a “Mozilla Crash Reporter” popup should appear with the option to leave a comment and contact email for the crash report. Otherwise, if a tab is crashed, the tab should be replaced with a similar dialogue. Leave a unique comment (which is publicly accessible) to identify the crash later, but other than that, don’t interact with the popup or crashed tab yet.

4) In the same “pending” directory in 1), files should appear in the form of a UUID with a “.dmp” and “.extra” extension (e.g. “9F83E316-E066-4D61-85AD-C5BF04A35A42.dmp” and “9F83E316-E066-4D61-85AD-C5BF04A35A42.extra”).

5) Making sure that the option to send the crash report is checked (in the popup or tab), either quit or restore the tab/browser. Alternatively, to manually submit the report, exit out of the popup or crashed tab, navigate to about:crashes, and click on the pending report under “Unsubmitted Crash Reports” to send it. The files which had appeared in the “pending” directory should disappear, and there should be an extra file in the “submitted” directory with an ID like “bp-6cef8451-9b86-4d90-855e-8a2070170621” (e.g. “bp-6cef8451-9b86-4d90-855e-8a2070170621.txt”). 

6) Navigate to about:crashes, and you should see a submitted crash report with the same ID seen in 5). Click the ID in about:crashes under “Submitted Crash Reports” to be taken to https://crash-stats.mozilla.com, where you can verify that this crash report is for the correct crash by matching the comment submitted with “User Comment”.
Flags: needinfo?(stefan.georgiev)
Flags: needinfo?(andrei.vaida)
Crash report submitted: https://crash-stats.mozilla.com/report/index/0a390b1f-e37f-4b6f-9aaa-e16d70170622
Flags: needinfo?(stefan.georgiev)
I have tested this on Windows 10 x64, Windows 7 x86 , Ubuntu 16.04 x64 and Mac OS X 10.12. Every time when crashing the  process/browser, the files in pending directory are created and gets deleted ".dmp and .extra" right after restoring the process/browser and new "CrashID.txt" file is created in the submitted directory. 

No issues found during testing. If you have any questions or need assistance regarding the testing, please let me know.
Pushed by felipc@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/e25d4611b841
Remove main thread I/O from CrashSubmit.jsm and KeyValuParser.jsm, replacing with OS.File. r=gsvelto
https://hg.mozilla.org/mozilla-central/rev/e25d4611b841
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Iteration: --- → 56.1 - Jun 26
Flags: needinfo?(andrei.vaida)
Depends on: 1377868
Whiteboard: [reserve-photon-performance][qf:p2][fce-active] → [reserve-photon-performance][qf:p2][fce-active-legacy]
Performance Impact: --- → P2
Whiteboard: [reserve-photon-performance][qf:p2][fce-active-legacy] → [reserve-photon-performance][fce-active-legacy]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: