Closed Bug 1165945 Opened 9 years ago Closed 2 years ago

Better handle a failure to start a content child process.

Tracking

()

Status:

RESOLVED WONTFIX

Tracking Flags:

Tracking

Status

e10s

---

People

(Reporter: bobowen, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

Adds a telemetry probe and console logging to record subprocess launch failures 8 years ago Haik Aftandilian [:haik] 3.11 KB, patch	benjamin : feedback+	Details \| Diff \| Splinter Review
Adds a telemetry probe and console logging to record subprocess launch failures 8 years ago Haik Aftandilian [:haik] 3.18 KB, patch		Details \| Diff \| Splinter Review

Bob Owen (:bobowen)

Reporter

Description

•

9 years ago

On Windows, bug 1146874 fixed the problem where a failure to start a content process would cause the main process to crash. Now we don't crash, but nothing gets rendered. Should we handle this better, with retries or communication to the user, like when the content process crashes? Should we be reporting this via the crash reporter? Maybe the reporting should be common code in GeckoChildProcessHost.cpp

Bob Owen (:bobowen)

Reporter

Comment 1

•

9 years ago

We should probably communicate (and capture with telemetry) this sort of child process start-up failure to the user before releasing e10s.

tracking-e10s: --- → ?

Brad Lassey [:blassey] (use needinfo?)

Updated

•

9 years ago

tracking-e10s: ? → +

Brad Lassey [:blassey] (use needinfo?)

Comment 2

•

9 years ago

Bob seems like a printing to the console should suffice for now

Brad Lassey [:blassey] (use needinfo?)

Comment 3

•

9 years ago

In addition to logging to the console, can you add a telemetry probe so we can know if this is happening?

Assignee: nobody → jmathies

Priority: -- → P2

Jim Mathies [:jimm]

Updated

•

9 years ago

Flags: needinfo?(haftandilian)

Haik Aftandilian [:haik]

Updated

•

9 years ago

Assignee: jmathies → haftandilian

Flags: needinfo?(haftandilian)

Haik Aftandilian [:haik]

Comment 4

•

8 years ago

Attached patch Adds a telemetry probe and console logging to record subprocess launch failures (obsolete) — Details — Splinter Review

Bob, is this what you had in mind?

Attachment #8752907 - Flags: review?(bobowen.code)

Bob Owen (:bobowen)

Reporter

Comment 5

•

8 years ago

Comment on attachment 8752907 [details] [diff] [review] Adds a telemetry probe and console logging to record subprocess launch failures Review of attachment 8752907 [details] [diff] [review]: ----------------------------------------------------------------- This bug was originally about handling when the process fails and what happens after we return false from RunPerformAsyncLaunch. I put some thoughts in the Description, but I'm not sure who decides how we should handle it and I don't think I'd be the right person to review it when we do. A simple count like this of process type failures, is probably a good first probe, to see if there is a significant problem that we need to solve. I think you have to get a data peer review for new probes (an f? to :bsmedberg is how the MDN page suggests you do this). ::: ipc/glue/GeckoChildProcessHost.cpp @@ +564,5 @@ > + > + if (PerformAsyncLaunch(aExtraOpts, aArch)) { > + return true; > + } else { > + CHROMIUM_LOG(ERROR) << "Failed to launch " << Does this log to the console?

Attachment #8752907 - Flags: review?(bobowen.code)

Haik Aftandilian [:haik]

Comment 6

•

8 years ago

(In reply to Bob Owen (:bobowen) from comment #5) > Comment on attachment 8752907 [details] [diff] [review] > Adds a telemetry probe and console logging to record subprocess launch > failures > > Review of attachment 8752907 [details] [diff] [review]: > ----------------------------------------------------------------- > > This bug was originally about handling when the process fails and what > happens after we return false from RunPerformAsyncLaunch. > > I put some thoughts in the Description, but I'm not sure who decides how we > should handle it and I don't think I'd be the right person to review it when > we do. OK. I agree that a page like tab crashed would be ideal. Will look into that. Thanks. > A simple count like this of process type failures, is probably a good first > probe, to see if there is a significant problem that we need to solve. > I think you have to get a data peer review for new probes (an f? to > :bsmedberg is how the MDN page suggests you do this). Wanted to check with you first, will get review from bsmedberg regarding the telemetry. > ::: ipc/glue/GeckoChildProcessHost.cpp > @@ +564,5 @@ > > + > > + if (PerformAsyncLaunch(aExtraOpts, aArch)) { > > + return true; > > + } else { > > + CHROMIUM_LOG(ERROR) << "Failed to launch " << > > Does this log to the console? Yet to test on other platforms, but on OS X it does for debug and non-debug builds.

Haik Aftandilian [:haik]

Comment 7

•

8 years ago

Comment on attachment 8752907 [details] [diff] [review] Adds a telemetry probe and console logging to record subprocess launch failures f? for data collection review.

Attachment #8752907 - Flags: feedback?(benjamin)

Benjamin Smedberg

Comment 8

•

8 years ago

Comment on attachment 8752907 [details] [diff] [review] Adds a telemetry probe and console logging to record subprocess launch failures Why is n_values 16? Do we really expect there to be 16 process types in the future? Please document in the histogram "description" what the values mean. Either directly, or reference the enumeration type used. data-review=me with that fixed.

Attachment #8752907 - Flags: feedback?(benjamin) → feedback+

Haik Aftandilian [:haik]

Updated

•

8 years ago

Blocks: 1275430

Haik Aftandilian [:haik]

Comment 9

•

8 years ago

Attached patch Adds a telemetry probe and console logging to record subprocess launch failures — Details — Splinter Review

(In reply to Benjamin Smedberg [:bsmedberg] from comment #8) > Comment on attachment 8752907 [details] [diff] [review] > Adds a telemetry probe and console logging to record subprocess launch > failures > > Why is n_values 16? Do we really expect there to be 16 process types in the > future? The GeckoProcessType macro goes from 0-5 now. It's conceivable that we might add more process types as we implement sandboxing. I've reduced this to 10 to leave room to add more. > Please document in the histogram "description" what the values mean. Either > directly, or reference the enumeration type used. data-review=me with that > fixed. Updated the histogram description to read "Counts the number of times launching a subprocess fails. Counts are by subprocess-type using the GeckoProcessType enum."

Attachment #8752907 - Attachment is obsolete: true

Haik Aftandilian [:haik]

Updated

•

8 years ago

No longer blocks: 1275430

Comment 10

•

8 years ago

I'm splitting off the telemetry and logging to bug 1275430 so that this bug can be used to improve the user experience when a content process fails once we determine how frequently that happens. When I commented out the code to start a content process to test the telemetry code, Firefox hung at launch and no UI ever was displayed.

Haik Aftandilian [:haik]

Updated

•

7 years ago

Assignee: haftandilian → nobody

Sylvestre Ledru [:Sylvestre]

Comment 11

•

6 years ago

Moving to p3 because no activity for at least 1 year(s). See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information

Priority: P2 → P3

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Updated

•

6 years ago

Updated

•

2 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 12

•

2 years ago

The severity field for this bug is relatively low, S3. However, the bug has 5 See Also bugs.
:jjalkanen, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jjalkanen)

Jens Stutte [:jstutte]

Comment 13

•

2 years ago

GeckoChildProcessHost::AsyncLaunch has been largely reworked in the meantime. Which does not mean it cannot be improved, but I think this specific bug is obsolete.

Status: NEW → RESOLVED

Closed: 2 years ago

Flags: needinfo?(jjalkanen)

Resolution: --- → WONTFIX

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 14

•

2 years ago

From the original description of this bug, this is more of a UI problem: when the process launch fails, the user gets a blank page, instead of retrying or showing an error. So, I don't know if the internal rearchitecting of GeckoChildProcessHost is relevant here.

Jens Stutte [:jstutte]

Comment 15

•

2 years ago

(In reply to Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧ from comment #14)

From the original description of this bug, this is more of a UI problem: when the process launch fails, the user gets a blank page, instead of retrying or showing an error. So, I don't know if the internal rearchitecting of GeckoChildProcessHost is relevant here.

OK, fair enough. I'd say, given that we added at least telemetry and log message here, the minimum requirement of giving "some" feedback has been fulfilled and we can keep this defect closed. But I filed bug 1795821 as an enhancement and made it depend on the bugs that were linked here with "see also" only. I hope that reflects the state better.

Comment 16

•

2 years ago

(In reply to Jens Stutte [:jstutte] from comment #15)

But I filed bug 1795821 as an enhancement and made it depend on the bugs that were linked here with "see also" only. I hope that reflects the state better.

That's a good summary of the issues as I understand them; thanks for filing it.

You need to log in before you can comment on or make changes to this bug.