Closed Bug 904323 Opened 6 years ago Closed 6 years ago

NS_DebugBreak() on Windows when MOZ_DEBUG_APP_PROCESS or MOZ_DEBUG_CHILD_PROCESS is set

Categories

(Core :: IPC, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: markh, Assigned: markh)

Details

Attachments

(1 file, 2 obsolete files)

If the environment contains MOZ_DEBUG_APP_PROCESS or MOZ_DEBUG_CHILD_PROCESS, we print a message and sleep for 30 seconds to allow the developer to attach a debugger to the process.  This 30 second delay is tiresome - calling NS_DebugBreak() should allow us to break into the debugger directly and start debugging quicker than waiting for the 30 seconds to expore.

MOZ_DEBUG_APP_PROCESS was added in bug 821044, so I'm asking both cjones and kanru for feedback on this, and insights into whether it should work on b2g - I've only tried Windows.  FWIW, the patch does the printf and printf_stderr as suggested in that bug.  I'm also wondering if this qualifies for a beverage ;)
Attachment #789310 - Flags: feedback?(kchen)
Attachment #789310 - Flags: feedback?(cjones.bugs)
Comment on attachment 789310 [details] [diff] [review]
NS_DebugBreak() instead of sleeping to debug the child process.

Review of attachment 789310 [details] [diff] [review]:
-----------------------------------------------------------------

The problem is that we don't know which child process id to attach to. So when we run into the NS_DebugBreak it's already too late to attach the debugger. IIRC on Linux this will wait there and let the developer to attach the debugger. I'm not sure how does this work on B2G.
Obviously I could make this change Windows only - but on Windows at least, NS_DebugBreak() offers to break into the debugger - is that not how it works on other platforms?  (If not, I'm interested to know what use it is on those platforms?)
On B2G currently this just kill the child processes.

I/Gecko   ( 1611): Invoking NS_DebugBreak() to debug child process Camera [1611]
I/Gecko   ( 1611): 
I/Gecko   ( 1611): [Child 1611] ###!!! BREAK: file /home/kanru/zone2/mozilla/central/dom/ipc/ContentChild.cpp, line 344
F/libc    ( 1611): Fatal signal 7 (SIGBUS) at 0x416b73da (code=0)

It would be great if this could invoke gdbserver and allow us to attach to it.
Attachment #789310 - Flags: feedback?(kchen)
(In reply to Kan-Ru Chen [:kanru] from comment #3)
> On B2G currently this just kill the child processes.
...
> It would be great if this could invoke gdbserver and allow us to attach to
> it.


Thanks!  I think I need to research this a bit more - eg, exactly how XPCOM_DEBUG_BREAK works. I hope there's some way we can ensure *anything* calling NS_DebugBreak() can be reliably broken into then this patch can just leverage that.  I'll dig a little more before I give up and put this new functionality behind a windows #ifdef :)
Comment on attachment 789310 [details] [diff] [review]
NS_DebugBreak() instead of sleeping to debug the child process.

(Word of warning: I don't hack on gecko full-time anymore so you can probably get faster reviews by asking around on IRC.)

As I think was mentioned above, this will by default cause the child processes to commit suicide on non-win32, which isn't helpful.
Attachment #789310 - Flags: feedback?(cjones.bugs) → feedback-
There's no decent way to JIT debug on Linux :(  Mac *might* work, but some Mac developer can worry about that.  So this patch just changes the Windows behaviour and leaves everyone else alone.

I can't think of a good reviewer who would be familiar with this but also uses Windows (or doesn't care about Windows enough to trust me ;)  Any suggestions?
Attachment #789310 - Attachment is obsolete: true
Attachment #791163 - Flags: review?
Comment on attachment 791163 [details] [diff] [review]
0001-Bug-904323-use-NS_DebugBreak-on-Windows-instead-of-s.patch

Why wouldn't you pass the message to NS_DebugBreak instead of printf'ing it separately?
(In reply to ben turner [:bent] from comment #7)
> Comment on attachment 791163 [details] [diff] [review]
> 0001-Bug-904323-use-NS_DebugBreak-on-Windows-instead-of-s.patch
> 
> Why wouldn't you pass the message to NS_DebugBreak instead of printf'ing it
> separately?

Thanks for chiming in!

I didn't feel it worthwhile to sprintf() first so the PID and process name can be printed too - but on reflection, it's probably not necessary to print them at all - these can be easily discovered once you are in the debugger.

This version just passes a static string that doesn't include the PID or process name to NS_DebugBreak().
Assignee: nobody → mhammond
Attachment #791163 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #791163 - Flags: review?
Attachment #791545 - Flags: review?(bent.mozilla)
Attachment #791545 - Flags: review?(bent.mozilla) → review+
Summary: NS_DebugBreak() when MOZ_DEBUG_APP_PROCESS or MOZ_DEBUG_CHILD_PROCESS is set → NS_DebugBreak() on Windows when MOZ_DEBUG_APP_PROCESS or MOZ_DEBUG_CHILD_PROCESS is set
https://hg.mozilla.org/mozilla-central/rev/1cf565c9884b
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
I'm sure this is a dumb question.. but, now that firefox on windows no longer pauses when MOZ_DEBUG_CHILD_PROCESS is set, how does one debug an npapi plugin?

Before this change, I'd set the MOZ_DEBUG_CHILD_PROCESS flag, then attach visual studio to the plugin-container.exe process.. but now, since firefox 26, the process isn't spawned with this flag and therefore I have nothing to attach to.

With respect to JIT debugging.. I have JIT debugging enabled for all types in visual studio, but it's never triggered from the MOZ_DEBUG_CHILD_PROCESS flag. I've also tried spawning firefox via: vsjitdebugger.exe, but I don't want to debug firefox.exe - just my plugin. I've found no way to attach to pause/attach to plugin-container.exe.

So.. what's the current/correct methodology?
Thnx!
(In reply to mark from comment #11)
> So.. what's the current/correct methodology?
> Thnx!

I'm not sure.  We end up in http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsDebugImpl.cpp#480 where we try and launch windbgdlg.exe - but this is only built in debug builds.  It might be worth checking if it works in a debug build.

But what confuses me is that even when we enter that block and fail to launch the exe, we still end up calling RealBreak() - but that doesn't actually break unless you are already in a debugger - which is somewhat useless.

I didn't try with a debug build, but if that also doesn't work, then yeah, the fix here seems to have become pointless - but it did work OK back when I landed it.
> But what confuses me is that even when we enter that block and fail to
> launch the exe, we still end up calling RealBreak() - but that doesn't
> actually break unless you are already in a debugger

It doesn't? AIUI ::DebugBreak() on Windows ought to trigger the JIT debugging dialog if there is not a debugger already attached.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #13)
> > But what confuses me is that even when we enter that block and fail to
> > launch the exe, we still end up calling RealBreak() - but that doesn't
> > actually break unless you are already in a debugger
> 
> It doesn't? AIUI ::DebugBreak() on Windows ought to trigger the JIT
> debugging dialog if there is not a debugger already attached.

It didn't for me - I expected it to work, and remember it working as you described (which was the entire point of this patch).  But to my surprise, today I could repro comment 11 and started debugging via printf and re-adding the sleep to manually break in.  I can't explain why the code didn't offer to start a new JIT session even though printf told me RealBreak() was called.  As mentioned, only tried opt.
(and I keep referring to RealBreak vs DebugBreak as even when it did break (due to already being in the debugger), DebugBreak didn't appear on the stack - which isn't particularly unexpected due to the opt build.  And although I gave up digging further, I'd put money on the fact it works fine in debug builds even without windbgdlg getting involved)
You need to log in before you can comment on or make changes to this bug.