Closed Bug 1370520 Opened 5 years ago Closed 5 months ago

With MOZ_CRASHREPORTER_SHUTDOWN environment variable set the browser should quit with a non 0 exit code

Categories

(Toolkit :: Startup and Profile System, enhancement, P5)

enhancement

Tracking

()

RESOLVED FIXED
102 Branch
Tracking Status
firefox102 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

Attachments

(1 file)

With bug 1311016 we originally implemented the shutdown sequence for a content crash and MOZ_CRASHREPORTER_SHUTDOWN set. Sadly this causes Firefox to quit with the exit code 0.

To better reflect that the browser has been quit due to a crash we should select a non 0 exit code.

Has anyone an idea how to do this?
I'm going to link to the codepaths currently involved in setting a return value, but I don't see an obvious place to interject this.

rv here: http://searchfox.org/mozilla-central/rev/d441cb24482c2e5448accaf07379445059937080/toolkit/xre/nsAppRunner.cpp#4833
comes from the end of xre_mainrun where it calls nsAppStartup::Run: http://searchfox.org/mozilla-central/rev/d441cb24482c2e5448accaf07379445059937080/toolkit/xre/nsAppRunner.cpp#4571
which comes from here: http://searchfox.org/mozilla-central/rev/d441cb24482c2e5448accaf07379445059937080/toolkit/components/startup/nsAppStartup.cpp#288

Since return codes are fragile, please add an automated test to make sure this keeps working? I suspect you're aware of the old bug 894697 which asked for a generic API for setting the Firefox return code. I rejected that because at the time we couldn't write tests for it, and there are end-user cases where it would break. For specific automation needs where we know we're not going to be doing restarts for update that's less of a problem, but it's still going to be fragile.
Component: General → Startup and Profile System
Product: Firefox → Toolkit
Priority: -- → P5
Depends on: 1449637

With bug 1675329 fixed, we should be able to exit Firefox with a different exit code now. Dave, what do you think which code would make sense?

Flags: needinfo?(dtownsend)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] (away 01/15 - 01/31) from comment #2)

With bug 1675329 fixed, we should be able to exit Firefox with a different exit code now. Dave, what do you think which code would make sense?

As I think the only person currently using it I'm going to say Nick should make a suggestion here.

Flags: needinfo?(dtownsend) → needinfo?(nalexander)

(In reply to Dave Townsend [:mossop] from comment #3)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] (away 01/15 - 01/31) from comment #2)

With bug 1675329 fixed, we should be able to exit Firefox with a different exit code now. Dave, what do you think which code would make sense?

As I think the only person currently using it I'm going to say Nick should make a suggestion here.

I have no strong opinions. For background tasks I settled on:

0 - success
2 - couldn't find background task with given name
3 - task threw an exception

I expect tasks themselves to exit with code 0 (success) or 1 (error) most frequently, hence no 1 in the harness.

In Henrik's use case I'd probably go with something >1 just so that it stands out a little bit more.

Flags: needinfo?(nalexander)

So when we call Services.startup.quit() with Ci.nsIAppStartup.eForceQuit we shutdown Firefox similar to a SIGTERM, right? If that is the case maybe we could use 15 as exit code?

I'm happy to provide a patch once we agreed on a value, and update Marionette to better handle content crash shutdowns.

Flags: needinfo?(gsvelto)

I don't think so. If I'm reading the implementation of Services.startup.quit() correctly it just stops the main event loop and lets shutdown happen normally.

Flags: needinfo?(gsvelto)

Nick, do you have more insights into how Firefox gets shutdown in this scenario? Is it like a SIGKILL (9) then?

Flags: needinfo?(nalexander)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #7)

Nick, do you have more insights into how Firefox gets shutdown in this scenario? Is it like a SIGKILL (9) then?

No, I don't think it's like any signal. If we invoke Services.startup.quit(), that's a normal/orderly shutdown. (No legacy consumer of that API set a non-zero exit code; it's only the very recent background task work that may set a non-zero exit code.)

I think you just want to fix an "unusual exit code", maybe 3, and then

Services.startup.quit(Ci.nsIAppStartup.eForceQuit, exitCode);

around here. In some way your tests will need to know about the special exit code, but I have no idea if there's an established convention. I guess you could try to use SIGCHLD, but that seems as confusing as anything else.

Flags: needinfo?(nalexander)

I would like to pick this up again, and as such checked a bit more what a good exit code could be. There was nothing that helpful to find, but then I thought about the exit code when Firefox crashes when loading about:crashparent. And here I can see the usage of 245.

Nick, would it make sense to re-use the same exit code here? At least it would give the same behavior when loading about:crashcontent in the browser, which would force a shutdown when the env variable is set. If you agree I could make a patch for it and enhance our Marionette tests.

Flags: needinfo?(nalexander)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #9)

I would like to pick this up again, and as such checked a bit more what a good exit code could be. There was nothing that helpful to find, but then I thought about the exit code when Firefox crashes when loading about:crashparent. And here I can see the usage of 245.

Nick, would it make sense to re-use the same exit code here? At least it would give the same behavior when loading about:crashcontent in the browser, which would force a shutdown when the env variable is set. If you agree I could make a patch for it and enhance our Marionette tests.

Sure, 245 is as good as anything else. This is -10, i.e., -SIGUSR1, which makes sense for a user-invoked crash.

Flags: needinfo?(nalexander)

While working on the patch I found a similar usage of a forced shutdown but for the socket process, which is using 1 instead:
https://searchfox.org/mozilla-central/rev/b72e9d7d63bf499d1d8168291b93d4ec7fde236e/netwerk/ipc/SocketProcessParent.cpp#95-103

I wonder if we should sync both exit codes to 245 or leave that one as 1 to have the possibility to differentiate between forced shutdowns that are triggered by different crashes.

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cc56afa024dc
Use SIGUSR1 (245) as exit code for a content crash forced shutdown. r=webdriver-reviewers,gsvelto,jgraham
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 102 Branch
Blocks: 1365391
Regressions: 1769584
Regressions: 1769544
You need to log in before you can comment on or make changes to this bug.