Closed Bug 1510900 Opened 5 years ago Closed 5 years ago

Consider crashing the content process when WebRender crashes the GPU process

Categories

(Core :: Graphics: WebRender, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: mattwoodrow, Assigned: mattwoodrow)

References

Details

Attachments

(1 file, 1 obsolete file)

We don't currently have a way of getting URLs from GPU process crashes, and finding the right non-intrusive UI for fixing that might take a while. Bug 1506202 is filed for that.

As an interim option, we could consider also crashing the content process when the GPU process crashes, such that we get two crash reports, and hopefully the URL is contained in the second one.

Finding a way to correlate the reports might be fun, but it's possible we don't need it, and just matching spikes of signatures will be sufficient.

In terms of implementation, the GPU process uses CrashReporterClient (PCrashReporter) to create shared memory between the GPU process and parent used for storing metadata to put in a crash report.

We can store data in there (like the tab id?) to identify which tab content is currently active within WR.

If/when CrashReporterParent is processing a crashed GPU process, and the data identifying a problem tab exists, we can send a message to the appropriate content process and force a crash.

We'd probably only want this in Nightly, with lots of easy ways to disable it.
Priority: -- → P2
Assignee: nobody → matt.woodrow
See Also: → 1506202
After looking at bug 1506202 and the patch in this bug I might have a less intrusive solution: since it's possible to obtain the TabParent from the layer ID you could obtain the corresponding ContentParent and from there the content process annotation that's been stored in its CrashReporterHost instance. No method for retrieving annotations is present in nsIContentProcess and CrashReporterHost but adding them should be trivial. Once you have the URL annotation for the corresponding content process you can just add it to the GPU process crash. If the user has enabled sending crash reports automatically the report will be silently sent like other content process crashes.
(In reply to Gabriele Svelto [:gsvelto] from comment #2)
> After looking at bug 1506202 and the patch in this bug I might have a less
> intrusive solution: since it's possible to obtain the TabParent from the
> layer ID you could obtain the corresponding ContentParent and from there the
> content process annotation that's been stored in its CrashReporterHost
> instance. No method for retrieving annotations is present in
> nsIContentProcess and CrashReporterHost but adding them should be trivial.
> Once you have the URL annotation for the corresponding content process you
> can just add it to the GPU process crash. If the user has enabled sending
> crash reports automatically the report will be silently sent like other
> content process crashes.

We actually have the URL in the GPU process already, so I think we could just annotate the crash report directly from within the GPU process.

I was mainly worried about getting user consent for submitting a URL with the crash report, do we not need to request this separately?
Flags: needinfo?(gsvelto)
I've looked at the existing code and it seems to me that the URL is stripped out from the crash report only for crashes that are submitted interactively (i.e. via the crashreporter client or the crash tab UI). All other crashes, such as auto-submitted content crashes or crashes submitted via about:crashes automatically include the URL if it's among the crash annotations.

I'm not sure if this was intended or just a side-effect since the code that strips the URL is part of the various clients' implementations and not part of the submission pipeline. However including it in auto-submitted GPU process crashes wouldn't be different than what we already do with content crashes.

Maybe it would be best to have a data steward involved to be 100% of what the correct behaviour should be.
Flags: needinfo?(gsvelto)
Interesting, I wasn't aware that we did that. Prior art is definitely useful here.

Chris, are you the right person to help us figure out if the current implementation matches our policy here? And if we can extend it to the GPU process?
Flags: needinfo?(chutten)
Policy on crash reports is included in the privacy details here: https://www.mozilla.org/en-US/privacy/firefox/#crash-reporter

(this is linked to as "Learn more" next to the preference that allows Firefox to auto-submit crash reports).

It explains what data is collected by Firefox crash reports irrespective of what process crashed. It does explicitly mention the opt-in.

From my reading of the prefs, `browser.tabs.crashReporting.includeURL` defaults to false and needs to be specifically checked to be included. If it is checked once, it defaults to checked[1]. If autosubmission is enabled, it remains in the state it was when autosubmission was enabled.

All this being said, I'm not the right person to figure out if the behaviour matches the policy. Please email merwin or aliciag or someone else from Privacy & Trust about this.

[1]: https://searchfox.org/mozilla-central/rev/adcc169dcf58c2e45ba65c4ed5661d666fc3ac74/browser/modules/ContentCrashHandlers.jsm#448
Flags: needinfo?(chutten)
(In reply to Chris H-C :chutten from comment #6)
> From my reading of the prefs, `browser.tabs.crashReporting.includeURL`
> defaults to false and needs to be specifically checked to be included. If it
> is checked once, it defaults to checked[1]. If autosubmission is enabled, it
> remains in the state it was when autosubmission was enabled.

True, but auto-submission (or submission from about:crashes) ignores that flag. This might be a bug or it might have been a deliberate choice. Either way I think whatever policy is in place for content processes should also apply to GPU processes.
I definitely agree that we should be consistent between the two types of child process.

My interpretation of the policy is that it doesn't distinguish between sending a report, and including the URL in that report, just that we need to get approval for it as a whole (which opting into auto-submission is). That would make that current behaviour correct.

I'll send an email to merwin and aliciag to seek further clarification.
I agree that we should have a consistent approach between the content process and GPU process.  I don't understand/recall why we settled on the auto-submission ignoring the flag.  I'd like to run that down but for the moment duplicating the current approach from the content process crash seems right.
Attachment #9028850 - Attachment is obsolete: true
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6f590bff2797
Include URL in GPU process crash reports. r=jrmuizel,gsvelto
https://hg.mozilla.org/mozilla-central/rev/6f590bff2797
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Is this something we should consider backporting for the studies running on Beta or can it ride the trains?
Flags: needinfo?(matt.woodrow)
Comment on attachment 9032285 [details]
Bug 1510900 - Include URL in GPU process crash reports. r?jrmuizel,gsvelto

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: None

User impact if declined: Can't get URLs for WebRender crashes when running a SHIELD study in beta, making it harder to fix detected issues.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): 

String changes made/needed:
Flags: needinfo?(matt.woodrow)
Attachment #9032285 - Flags: approval-mozilla-beta?
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
> Is this something we should consider backporting for the studies running on
> Beta or can it ride the trains?

Good idea, the extra volume on beta should help us find testcases for crashes.
Comment on attachment 9032285 [details]
Bug 1510900 - Include URL in GPU process crash reports. r?jrmuizel,gsvelto

[Triage Comment]
Gives us more information in WebRender crash reports to improve our ability to find sites which reproduce. Approved for 65.0b7.
Attachment #9032285 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: