Improve error reporting for process launches
Categories
(Core :: IPC, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox114 | --- | fixed |
People
(Reporter: gerard-majax, Assigned: gerard-majax)
References
(Blocks 1 open bug)
Details
Attachments
(3 files, 2 obsolete files)
The rationale here is related to https://bugzilla.mozilla.org/show_bug.cgi?id=1789126#c22 ; tl;dr is:
- we hit a reject
- everything suggest this is because of
GeckoChildProcessHost::Destroy
that rejects when we are in the begining of starting a process - we have
IsShutdown()
guarding on our utility process manager - this depends on a boolean updated when
xpcom-shutdown
observer is delivered to us - the test reproducing are tests where I already spotted raciness around starting a process when shutting down
So my take is we enter a small window of possible race between the GeckoChildProcessHost::Destroy
and the moment we receive the observer, hence we get rejected by Destroy
but we still dont know we are in shutdown. Moving to AppShutdown
would, I hope, avoid that.
Assignee | ||
Comment 1•2 years ago
|
||
Comment 2•2 years ago
|
||
I'm not convinced that this would do anything to improve the situation from https://bugzilla.mozilla.org/show_bug.cgi?id=1789126#c21. From looking at the stack you can see that we're not yet in proper XPCOM shutdown, so we can't have passed either xpcom-shutdown
or xpcom-will-shutdown
yet.
I think whatever failure is happening here is probably unrelated to shutdown. There are many different reasons why child process startup could fail (e.g. due to the binary being moved/replaced, among probably other things) so we should probably change the code to handle failure here, rather than diagnostic asserting.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 3•2 years ago
|
||
So there's already some data collection on subprocess launch failures at https://searchfox.org/mozilla-central/rev/aa3ccd258b64abfd4c5ce56c1f512bc7f65b844c/ipc/glue/GeckoChildProcessHost.cpp#776-779 and it should cover what we want to look at, except maybe we lack of actual error message (but I could only see usages of LaunchError{}
around, with LaunchError
being an empty struct. I have however not been able to see the telemetry data so far.
Assignee | ||
Comment 4•2 years ago
|
||
It does indeed show a few Utility failures to start.
Assignee | ||
Comment 5•2 years ago
|
||
Jens, on bug 1795821 you mention tracking better the kind error failure Which errors are propagated under which conditions by the async promise chain we have here and if there are recoverable conditions we might be able to distinguish and handle better or at least to notify to the user
. So far looking at GeckoChildProcessHost
I see mostly opaque LaunchError
, is fixing this was what you implied ?
Comment 6•2 years ago
•
|
||
I think I wanted to say something on the lines of:
- check if we understand which conditions can lead to an error
- check if those have distinct error codes/messages
- verify if some of those are potentially avoidable (like checking something before launching)
- verify if some of those are potentially recoverable (like retrying)
- check then if all callers are checking well for error conditions and how they react and if we should tell something to the user (if we are not in shutdown)
So having something better than opaque LaunchError
is definitely an improvement in that direction, as you would touch at least 1., 2. and a bit of 5. of the above list, yeah.
(this would alter the scope of this bug quite heavily, btw)
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 7•2 years ago
|
||
Depends on D171226
Assignee | ||
Comment 8•2 years ago
|
||
Depends on D172433
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 9•2 years ago
|
||
Updated•2 years ago
|
Comment 10•2 years ago
|
||
Comment on attachment 9326644 [details]
data review
I'm not a data steward, but chutten is and can take care of it here.
Assignee | ||
Comment 11•2 years ago
|
||
Updated•2 years ago
|
Comment 12•2 years ago
|
||
Comment on attachment 9326644 [details]
data review
DATA COLLECTION REVIEW RESPONSE:
Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes.
Is there a control mechanism that allows the user to turn the data collection on and off?
Yes. This collection can be controlled through Firefox's Preferences.
If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes, Alexandre Lissy is responsible.
Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 1, Technical.
Is the data collection request for default-on or default-off?
Default on for all channels.
Does the instrumentation include the addition of any new identifiers?
No.
Is the data collection covered by the existing Firefox privacy notice?
Yes.
Does the data collection use a third-party collection tool?
No.
Result: datareview+
Comment 13•2 years ago
|
||
![]() |
||
Comment 14•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/654ce2247015
https://hg.mozilla.org/mozilla-central/rev/6ef392f0f69a
Description
•