Closed Bug 902790 Opened 6 years ago Closed 6 years ago

Browser crash on failure to create child process

Categories

(Core :: IPC, defect, critical)

x86
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla26
blocking-b2g leo+
Tracking Status
firefox24 --- wontfix
firefox25 --- verified
firefox26 --- verified
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- wontfix
b2g-v1.1hd --- fixed

People

(Reporter: markh, Assigned: markh)

References

Details

(Keywords: crash, Whiteboard: [LeoVB+])

Crash Data

Attachments

(1 file)

STR:
* Move plugin-container.exe away so it can't be found by the browser.
* Start the browser and do something that attempts to start a child process (eg, show about:newtab when thumbnails are out of date.

Actual:
Browser crashes hard.

Expected:
Browser gets a little upset but stays alive.

Debug build hit the assertion:

  MOZ_ASSERT(mProcessState == PROCESS_ERROR || mChildProcessHandle);

The problem is GeckoChildProcessHost doesn't correctly detect a failure to start the process.  It could do this 2 ways - initialize the 'process' variable to zero or check the result of base::LaunchApp() - but it does neither.

The following attachment takes the former route.  Also note that I suspect the same bug exists for non-Windows platforms, even though a different code path is being taken.  In that case, the code doesn't check if the launch was successful but starts waiting for messages from the (non-existent) child.  I suspect this is the cause of bug 902755, but haven't attempted to fix that in this trivial patch.
Attachment #787320 - Flags: feedback?(cjones.bugs)
Blocks: 899758
Please provide the crash ID from about:crashes.
Severity: normal → critical
Flags: needinfo?(mhammond)
Keywords: crash, stackwanted
Comment on attachment 787320 [details] [diff] [review]
child-process-creation-crash.patch

I've been out of this part of the code for a while.  :bent can handle or forward as needed.
Attachment #787320 - Flags: feedback?(cjones.bugs) → feedback?(bent.mozilla)
It's similar to bug 771090.
It has spiked since 25.0a1/20130725.
Crash Signature: [@ RtlpCallQueryRegistryRoutine | BaseThreadInitThunk ]
Keywords: stackwanted
Hardware: x86_64 → x86
(In reply to Scoobidiver from comment #4)
> It's similar to bug 771090.

It looks like that bug fixed one of the reasons the child process failed to launch, but not the underlying problem if it fails for other reasons.

> It has spiked since 25.0a1/20130725.

That's probably the background thumbnails.
Comment on attachment 787320 [details] [diff] [review]
child-process-creation-crash.patch

This is a crasher that is spiking on 25.  The patch here is trivial - it initializes an integer local variable to zero and the crash is prevented.  Given that, it seems to me that we should try and take this ASAP, even if a better fix is found later.  Given this and the fact bent's review queue is huge, Gavin suggested I could try roping jlebar in (even without checking how big Justin's review queue is - apologies if yours is just as bad!)
Attachment #787320 - Flags: feedback?(justin.lebar+bug)
Comment on attachment 787320 [details] [diff] [review]
child-process-creation-crash.patch

Oh gosh, yes please.

Would you mind moving the |process| variable closer to where it's first used, while we're here?
Attachment #787320 - Flags: feedback?(justin.lebar+bug)
Attachment #787320 - Flags: feedback?(bent.mozilla)
Attachment #787320 - Flags: feedback+
Attachment #787320 - Flags: feedback+ → review+
Thanks for the speedy review!  As mentioned in IRC, the #ifdef mess there prevents moving the variable more than a line or 2, so I left it where it was.

https://hg.mozilla.org/integration/mozilla-inbound/rev/87b19d2934a0
Assignee: nobody → mhammond
Status: NEW → ASSIGNED
In IRC, jlebar and I thought it worth bringing to Leo's attention that b2g is almost certainly going to hit this same problem.  The patch is zero risk, and if we are correct, the consequences might be somewhat catastrophic if child process creation fails.  Flagging as needinfo to ensure it gets attention whatever the outcome.
Flags: needinfo?(leo)
blocking-b2g: --- → leo?
Flags: needinfo?(leo)
(In reply to Mark Hammond (:markh) from comment #5)
> (In reply to Scoobidiver from comment #4)
> > It has spiked since 25.0a1/20130725.
> That's probably the background thumbnails.
It should be uplifted at least in Aurora then.
(In reply to Mark Hammond (:markh) from comment #9)
> In IRC, jlebar and I thought it worth bringing to Leo's attention that b2g
> is almost certainly going to hit this same problem.  The patch is zero risk,
> and if we are correct, the consequences might be somewhat catastrophic if
> child process creation fails.  Flagging as needinfo to ensure it gets
> attention whatever the outcome.

I think you might have the wrong Leo :)
https://hg.mozilla.org/mozilla-central/rev/87b19d2934a0
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Comment on attachment 787320 [details] [diff] [review]
child-process-creation-crash.patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Unknown
User impact if declined: Browser may crash if a child process can't be created.
Testing completed (on m-c, etc.): Landed on m-c
Risk to taking this patch (and alternatives if risky): Zero risk patch.
String or IDL/UUID changes made by this patch: None
Attachment #787320 - Flags: approval-mozilla-aurora?
Attachment #787320 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
leo.bugzilla.gecko@gmail.com - I'd like to bring this bug specifically to your attention given comment 9. We're approving for the branch given zero risk, and high reward.
blocking-b2g: leo? → leo+
Flags: needinfo?(leo.bugzilla.gecko)
Leo takes this patch, thank you.
Flags: needinfo?(leo.bugzilla.gecko)
Whiteboard: [LeoVB+]
Reproduced the crash in Nightly 2013-08-05 using the STR in comment 0.
Verified fixed FF 25b4 Win 7 x64.
Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0

Verified as fixed on latest Aurora (buildID: 20131011004001).
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.