626830 - jsd.asyncOn fails

Reporter

Description

•

13 years ago

Although our actual use of Firebug with jsd.asyncOn() seems to be successful in a few cases so far, I am not able to get our FBtest case turning jsd off and on to work.

My current code looks like:

            if (!jsd.isOn)
            {
                var startAsyncOn = new Date().getTime();
                if (FBTrace.DBG_ACTIVATION)
                    FBTrace.sysout("firebug-service.activation begin jsd.asyncOn "+startAsyncOn);
                jsd.asyncOn(  // turn on jsd for the next event
                {
                    onDebuggerActivated: function doDebuggerActivated()
                    {
                        // now we are in the next event and jsd is on.
                        var nowAsyncOn = new Date().getTime();
                        if (FBTrace.DBG_ACTIVATION)
                            FBTrace.sysout("firebug-service.activation now we are in the next event and JSD is ON "+nowAsyncOn+" delta: "+(nowAsyncOn - startAsyncOn)+"ms" );
                        fbs.onDebuggerActivated();
                        fbs.onJSDebuggingActive();
                    }
                });
            }

After returning from this code I run a setTimeout with a 5000ms delay. My tracing messages show the begin jsd.asyncOn, the pause for 5000ms, then the "now we are in the next event" message with a delta of 228ms. 

I can't explain this result.

However this bizarre result is not the only result. Sometimes the "now we are in" message never comes out.  Sometimes I get an XPCOM exception (NS_ERROR_NOT_AVAILABLE) [jsdIDebuggerService.off, sometimes I get
(NS_ERROR_NOT_INITIALIZED) [jsdIDebuggerService.pause]. I think eventually my code stops trying to start jsd because it thinks it is started.

I also see the XPCOM exceptions at other times, outside of the test.

I created a new test just with the script panel. This is *not* a reduced test case, but it is what I have:

1. install firebug svn (to get the same tracing messages)
2, install fbtest svn
3. run fbtest case Script section script/3918/AsyncOn.js

If you open Firebug tracing you can see the time stamps.  If you apply the steps manually, the test fails.  

I don't understand why it fails here and not on other cases.

John J. Barton

Reporter

Comment 1

•

13 years ago

See also Bug 595243 Expose debugMode to JSD

Whiteboard: [firebug-p1]

John J. Barton

Reporter

Comment 2

•

13 years ago

Firebug JS debug cannot be reliable without a fix or workaround here.

blocking2.0: --- → ?

Steve Fink [:sfink] [:s:]

Assignee

Comment 3

•

13 years ago

It is possible that this is due to early breakage from bug 626743; perhaps the compartment is getting reset in some scenarios. I'll deal with that first.

David Mandelin [:dmandelin]

Updated

•

13 years ago

blocking2.0: ? → betaN+

Depends on: 626743

Whiteboard: [firebug-p1] → [firebug-p1][hardblocker]

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

13 years ago

Assignee: nobody → sphink

Steve Fink [:sfink] [:s:]

Assignee

Comment 4

•

13 years ago

Nope, my WIP patch for bug 626743 does not appear to help here.

(In reply to comment #0)
> My current code looks like:
> ...
> After returning from this code I run a setTimeout with a 5000ms delay. My
> tracing messages show the begin jsd.asyncOn, the pause for 5000ms, then the
> "now we are in the next event" message with a delta of 228ms. 
> 
> I can't explain this result.

Which part?

You wait 5000ms between them because jsd.asyncOn just sets things up so that debugging will be turned on when the next event is processed, but it doesn't do anything to queue up an event itself. I assume if you were to move the mouse or set up a timeout or something, that it would happen sooner.

As for the human-experienced time being 5sec and the delta reporting being far less, I'd say either the browser was traveling near the speed of light relative to you, or there's something very odd going on. When I was doing the test manually, though, I didn't see that -- my delta was getting reported as very large.

Does the current fbtest have the 5sec delay in it? Doesn't seem to. I guess I could try putting one in myself.

> However this bizarre result is not the only result. Sometimes the "now we are
> in" message never comes out.  Sometimes I get an XPCOM exception
> (NS_ERROR_NOT_AVAILABLE) [jsdIDebuggerService.off, sometimes I get
> (NS_ERROR_NOT_INITIALIZED) [jsdIDebuggerService.pause]. I think eventually my
> code stops trying to start jsd because it thinks it is started.

I'm seeing these too, though it's hard to untangle what's going on and when.

It makes sense, though, if the debugger never managed to activate, but you started attempting to use it anyway. That'd result in the NS_ERROR_NOT_INITIALIZED. NS_ERROR_NOT_AVAILABLE looks like it happens if you had previously turned it off and it hasn't done a GC run to clean up dead scripts. Or something like that. Looks tricky; I'll need to talk to dmandelin about it.

> I also see the XPCOM exceptions at other times, outside of the test.
> 
> I created a new test just with the script panel. This is *not* a reduced test
> case, but it is what I have:
> 
> 1. install firebug svn (to get the same tracing messages)
> 2, install fbtest svn
> 3. run fbtest case Script section script/3918/AsyncOn.js
> 
> If you open Firebug tracing you can see the time stamps.  If you apply the
> steps manually, the test fails.
> 
> I don't understand why it fails here and not on other cases.

It does seem very erratic. If I run the test through the test console multiple times, it fails most of the time but occasionally succeeds.

Will look further...

Steve Fink [:sfink] [:s:]

Assignee

Comment 5

•

13 years ago

I've found one thing that legitimately seems wrong: the jsd.off() frequently fails because it has some dead scripts queued up and we're supposedly in the middle of a GC pass. We shouldn't really be able to run the JS code that calls jsd.off() in that case, so there's something wrong with the interlocking setup. I'm looking into it, but this does make me realize that this test is dependent on jsd.off() working. The comments in bug 595243 didn't sound too confident that it'd work. I'm wondering if the priority of jsd.off() is a little lower than eg bug 626909, and could potentially be postponed past beta10?

How reliant is the test suite on jsd.off() working?

I don't know if that ends up causing the other errors mentioned here.

The jsd.off() error will manifest itself as the NS_ERROR_NOT_AVAILABLE return value.

John J. Barton

Reporter

Comment 6

•

13 years ago

(In reply to comment #5)
> How reliant is the test suite on jsd.off() working?
> 

The real question could be: how reliant is Firebug on jsd.off() working? Answer: a lot ;-). That is how we stop debugging every page the users looks at.

Steve Fink [:sfink] [:s:]

Assignee

Comment 7

•

13 years ago

Attached patch Take note of forced GC cancels (obsolete) — Details — Splinter Review

The problem where jsd.off() returning NS_ERROR_NOT_AVAILABLE happens when you have some dead scripts, but GC callback cancels the GC run for some evil reason. JSD only remembers that it got a JSGC_BEGIN callback, and assumes that GC is still happening, so when jsdService::Off() is invoked, it barfs because you're not supposed to call it mid-GC (although it doesn't really matter if you have no dead scripts lying around, so it lets you).

This fixes the JSD bookkeeping. An alternative, broader fix would be to have GC invoke the JSGC_END callback when a JSGC_BEGIN callback returns false, cancelling the GC. That might fix other callers with similar assumptions, though it's more of a behavior change than this patch.

Attachment #505696 - Flags: review?(dmandelin)

Steve Fink [:sfink] [:s:]

Assignee

Comment 8

•

13 years ago

Attached file Log of failure to turn on — Details

Unfortunately, that patch does very little to help the test case.

What seems strange to me about the test case is that when it fails (or perhaps all the time?), I never see a call to jsdService::Off(). I don't know whether that's on the Firebug side or the JSD side.

Ah. Looks like it fails to turn off because it's already off; it failed to turn on first.

Ok, I'm attaching a log showing an example session where it failed to turn on. I added two bits of debugging output: first, I added a 0ms timer just after calling asyncOn in Firebug. Before its callback runs, JSD should have been turned on. If it isn't, it screams "PROBLEM SHOULD HAVE ACTIVATED!!!" You'll see that at the end of the log.

Second, I added a printout in nsXPConnect::Push that displays the "count", which I called depth, of the context stack before the push. It also says whether this is the main thread or not, because only the main thread counts. Debug mode will only be turned on when count reaches zero.

The log shows that count is not reaching zero. Could a Firebug person look at the portion of the log after the "PROBLEM" lines and before the "Push depth 0 main=1" line and tell me what was happening?

In brief, this appears to show that JSD is not turning on because the context stack doesn't drain, even though it was empty at the time asyncOn was called. Did Firebug somehow get stuff onto the stack after that? I don't know how that's possible, other than spinning a nested event loop, and that seems unlikely.

It's a little tricky to investigate, because it seems timing dependent: if I hang out on gdb breakpoints for any length of time, then it works. It's usually reproducible if I just add debugging printouts, though. I guess I should figure out how to dump out the stack or whatever it is that's being counted here.

(argh. forgot to click "submit" last night.)

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

13 years ago

Depends on: 627758

Steve Fink [:sfink] [:s:]

Assignee

Comment 9

•

13 years ago

Comment on attachment 505696 [details] [diff] [review]
Take note of forced GC cancels

Transferring this to a new bug 627758 because the main jsd.asyncOn flakiness is separate.

Attachment #505696 - Attachment is obsolete: true

Attachment #505696 - Flags: review?(dmandelin)

John J. Barton

Reporter

Comment 10

•

13 years ago

(In reply to comment #8)
...
> The log shows that count is not reaching zero. Could a Firebug person look at
> the portion of the log after the "PROBLEM" lines and before the "Push depth 0
> main=1" line and tell me what was happening?

I guess you are tracing with just ACTIVATION. We can know more with more tracing flags set.  Please set TESTCASE so we can see where you are.  Also if you open the tracing console you can click on the tracing messages and see that part of the JavaScript stack that Firefox chooses to show us, typically up to the event handler, sometimes across xpcom boundaries.

PanelActivation.showPanel; script
   We are just about to reveal the script panel 
fbs.showEvents true fbs.showEvals true
   A preference changed we are observing it.
shouldCreateContext hasAnnotation true for http://getfirebug.com/tests/content/script/3918/AsyncJSDPage.html 
   A new web page has arrived, we have determined that Firebug should be active on the page.


> 
> In brief, this appears to show that JSD is not turning on because the context
> stack doesn't drain, even though it was empty at the time asyncOn was called.

You must mean something else here. At the time jsd.asyncOn() is called, the stack can't be empty.

> Did Firebug somehow get stuff onto the stack after that? I don't know how
> that's possible, other than spinning a nested event loop, and that seems
> unlikely.

Firebug (JavaScript) has no direct way to add stuff to a stack that is empty.

One thing I thought of: maybe the tracing is trying to use jsd to get the call stack. But it seems we use Component.utils.stack.

Steve Fink [:sfink] [:s:]

Assignee

Comment 11

•

13 years ago

(In reply to comment #10)
> (In reply to comment #8)
> > In brief, this appears to show that JSD is not turning on because the context
> > stack doesn't drain, even though it was empty at the time asyncOn was called.
> 
> You must mean something else here. At the time jsd.asyncOn() is called, the
> stack can't be empty.

Sorry, I was being imprecise. I meant that just before an event was processed for which JS was invoked to call jsd.asyncOn(), the stack was empty. So after it returns, it should be empty again, unless it doesn't return for some reason -- eg, if it spins a nested event loop. But I doubt that it does.

> > Did Firebug somehow get stuff onto the stack after that? I don't know how
> > that's possible, other than spinning a nested event loop, and that seems
> > unlikely.
> 
> Firebug (JavaScript) has no direct way to add stuff to a stack that is empty.

I'm not talking about direct ways. I mean calling functions inside of which a timer expiration could be handled. (Again, a nested event loop is the obvious example, but I don't think it's that simple.)

> One thing I thought of: maybe the tracing is trying to use jsd to get the call
> stack. But it seems we use Component.utils.stack.

Hm.

Well, that was my state when I stopped. I threw the questions out hoping someone might have an obvious (to them) but miraculous (to me) insight. I'm not at all blocked; I'll start digging what's actually on the stack now. It's looking unlikely that I'll manage to fix this for beta10, though.

John J. Barton

Reporter

Comment 12

•

13 years ago

(In reply to comment #11)
> (In reply to comment #10)
> > (In reply to comment #8)
> > > In brief, this appears to show that JSD is not turning on because the context
> > > stack doesn't drain, even though it was empty at the time asyncOn was called.
> > 
> > You must mean something else here. At the time jsd.asyncOn() is called, the
> > stack can't be empty.
> 
> Sorry, I was being imprecise. I meant that just before an event was processed
> for which JS was invoked to call jsd.asyncOn(), the stack was empty. So after
> it returns, it should be empty again, unless it doesn't return for some reason
> -- eg, if it spins a nested event loop. But I doubt that it does.

So if I understand you, the function onDebuggerActivated() is called on an empty stack and when it returns that same stack is magically not empty?

Or is there another stack involved here (we don't what you 'context' thing is)?

Steve Fink [:sfink] [:s:]

Assignee

Comment 13

•

13 years ago

jjb: ignore me for now, looks like I was getting confused by multiple threads.

The stack I'm referring to is actually a stack of contexts (no, I won't go into exactly what those are right now). And it turns out the stack wasn't at all empty when entering the JS that calls asyncOn() -- the empty stack turned out to be on a different thread.

So I have this funky XPCJSContextStack of contexts that is nonempty, and that's blocking debugging from getting turned on. I need to understand why, and what it is in the first place.

Steve Fink [:sfink] [:s:]

Assignee

Comment 14

•

13 years ago

Attached patch Allow debugging to turn on when dormant contexts are on the XPCJSContextStack (obsolete) — Details — Splinter Review

I have weakened the condition required to activate debugging: instead of requiring the XPCJSContextStack to be completely empty, I allow it to contain contexts that are either NULL or dormant.

This fixes the asyncJSD test case for me.

Attachment #506004 - Flags: review?(dmandelin)

David Mandelin [:dmandelin]

Comment 15

•

13 years ago

Comment on attachment 506004 [details] [diff] [review]
Allow debugging to turn on when dormant contexts are on the XPCJSContextStack

Nit: |notRunningJS| seems a more informative name to me than |checkDebug|. Up to you, though.

Attachment #506004 - Flags: review?(dmandelin) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 16

•

13 years ago

You're right, notRunningJS is much better.

Unfortunately, this fix is not the end of things. Here's another failure mode:

 - debugging is turned on fully
 - jsd.off() is called. This disables all the callbacks and things, but only calls XPConnect's SetDebugModeWhenPossible, which sets a flag for processing when no JS is running (gDesiredDebugMode=false, but gDebugMode is still true)
 - several ::Push()es go by with running JS, so the off request stored in gDesiredDebugMode cannot be satisified.
 - debugging is requested to be activated again via jsdService::AsyncOn(). This will set gDesiredDebugMode=true.

Now, from XPConnect's standpoint, gDesiredDebugMode=gDebugMode=true, so there's nothing to be done. That means that CheckDebugMode will not be called, and it will never call ActivateDebugger(). Debugging ends up being half-on (mostly on, really, but jsd.isOn will return false.)

The problem is that jsd_xpc.cpp says turning debugging on is asynchronous, but turning it off is synchronous. nsXPConnect.cpp says both are asynchronous.

This is fixable by making XPConnect turn off debugging immediately. This will skip some scripts, because they're currently live, but any debugging hooks firing in those scripts will be ignored.

And yet... even after this fix, tests like firebug/openDisableEnableReload.js are still not working all the time. It's possible that the test is invalid (because it's keeping running JS alive across disable/enable/reload, which would never happen in actual usage), but I haven't looked closely enough at it to be sure. And it would be nice if it could be made to work anyway. Being testable is good.

Steve Fink [:sfink] [:s:]

Assignee

Comment 17

•

13 years ago

Attached patch Part 2: jsd.off() should take effect immediately — Details — Splinter Review

Make XPConnect turn off debugging immediately, without waiting for a quiescent stack.

If any scripts are live when you turn off debugging, then they will remain in debugging mode until the next time they are recompiled. But no debugger functionality will be used, so it's just a temporary performance hit.

This fixes, somewhat inelegantly, the problem where turning debugging off then back on will end up with debugging only partly on if the stack does not quiesce in between.

Attachment #506500 - Flags: review?(dmandelin)

Steve Fink [:sfink] [:s:]

Assignee

Comment 18

•

13 years ago

Honza -- I still get a failure in firebug/openEnableDisableReload.js even after these patches. I don't really understand the failure mode, though. The failure is in

"The script [panel?] should be marked on the Firebug Statusbar Icon."

which makes it seem like debugging is not getting turned back on properly still, but I notice that soon before that it checks

    FBTest.compare("false", FBTestFirebug.isPanelTabDisabled(panelName), "The "+panelName+" panel should be enabled");

and that passes. So the script panel is enabled, but it is not marked on the Firebug Statusbar Icon, whatever that means? Is that referring to the light gray/dark gray color scheme on the "Console/HTML/CSS/Script/DOM/Net" panel? Why is that an "Icon"?

Sorry, I'm not trying to be nitpicky, I'm just having trouble performing the test steps manually and checking the result to see if this is a difference between automated test mode and manual operation.

David Mandelin [:dmandelin]

Updated

•

13 years ago

Attachment #506500 - Flags: review?(dmandelin) → review+

John J. Barton

Reporter

Comment 19

•

13 years ago

(In reply to comment #18)
>  So the script panel is enabled, but it is not marked on the
> Firebug Statusbar Icon, whatever that means? Is that referring to the light
> gray/dark gray color scheme on the "Console/HTML/CSS/Script/DOM/Net" panel? Why
> is that an "Icon"?

The "Firebug Statusbar Icon" is in the bar at the bottom (formerly status bar). Its a square UI element with a image of a bug: thus "Icon".

The Icon has two visual states: colored and gray. Gray means "all panels that cause big overhead are off"; colored means "at least one is one". The tooltip breaks it down.

The state of the icon is driven by our event onJSDActivate which is called by fbs.onJSDebuggingActive() in our callback onDebuggerActivated.

So the "enable" UI event is a request that fires jsd.asyncOn(); The Icon state is a response, driven the the onDebuggerActivated event. 

If the icon does not say "On" for Script panel, we did not get the event (or we have a bug). I had to change this code for asyncOn, so be sure you are on SVN head and if you are still puzzled I'll take a look. (Tracing  ACTIVATION should show the events).

Steve Fink [:sfink] [:s:]

Assignee

Comment 20

•

13 years ago

http://hg.mozilla.org/tracemonkey/rev/b7e91bafdcd9
http://hg.mozilla.org/tracemonkey/rev/d54eac606dd7

Status: NEW → ASSIGNED

Whiteboard: [firebug-p1][hardblocker] → [firebug-p1][hardblocker][fixed-in-tracemonkey]

Steve Fink [:sfink] [:s:]

Assignee

Comment 21

•

13 years ago

Backed out due to Windows template compilation failure.

http://hg.mozilla.org/tracemonkey/rev/8f09ff778d69
http://hg.mozilla.org/tracemonkey/rev/c52e5d68a747

Whiteboard: [firebug-p1][hardblocker][fixed-in-tracemonkey] → [firebug-p1][hardblocker]

Chris Leary [:cdleary] (not checking bugmail)

Comment 22

•

13 years ago

cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/b7e91bafdcd9
http://hg.mozilla.org/mozilla-central/rev/d54eac606dd7
http://hg.mozilla.org/mozilla-central/rev/c52e5d68a747 (backout)
http://hg.mozilla.org/mozilla-central/rev/8f09ff778d69 (backout)
Note: not marking as fixed because last changeset is a backout.

Steve Fink [:sfink] [:s:]

Assignee

Comment 23

•

13 years ago

The patch to make jsd.off() synchronous also triggers a crash on my test_callhooks mochitest, only when methodjit is enabled.

Seems like something is now getting recompiled when it shouldn't be. Either of the two patches on this bug could be at fault for that.

I really should get around to doing the jit support for gdb. A howling Jägermonkey ate my stack.

timeless

Updated

•

13 years ago

Component: JavaScript Debugging APIs → XPConnect

QA Contact: jsd → xpconnect

Steve Fink [:sfink] [:s:]

Assignee

Comment 24

•

13 years ago

(In reply to comment #18)
> Honza -- I still get a failure in firebug/openEnableDisableReload.js even after
> these patches. I don't really understand the failure mode, though.

(Replying to myself) The test had not been updated for the AsyncOn change. I submitted a change to FBTest which has been applied, and the test now succeeds. I think there are probably a number of other tests with the same problem.

Steve Fink [:sfink] [:s:]

Assignee

Comment 25

•

13 years ago

Attached patch Allow debugging to turn on when dormant contexts are on the XPCJSContextStack — Details — Splinter Review

Sorry for the patch churn, but I have to update this patch to do the main thread detection, matching what I just did in bug 626743.

The other patch here on this bug does not change.

Attachment #506004 - Attachment is obsolete: true

Attachment #509496 - Flags: review?(dmandelin)

David Mandelin [:dmandelin]

Updated

•

13 years ago

Attachment #509496 - Flags: review?(dmandelin) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 26

•

13 years ago

http://hg.mozilla.org/tracemonkey/rev/fa29bac3f858
http://hg.mozilla.org/tracemonkey/rev/b647ff21bfed

Whiteboard: [firebug-p1][hardblocker] → [firebug-p1][hardblocker][fixed-in-tracemonkey]

Asa Dotzler [:asa]

Updated

•

13 years ago

Whiteboard: [firebug-p1][hardblocker][fixed-in-tracemonkey] → [firebug-p1][hardblocker][fixed-in-tracemonkey][has patch]

Chris Leary [:cdleary] (not checking bugmail)

Comment 27

•

13 years ago

cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/b647ff21bfed

Chris Leary [:cdleary] (not checking bugmail)

Updated

•

13 years ago

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

John J. Barton

Reporter

Comment 28

•

13 years ago

Shortly after this landed our testbot began crashing, Bug 632343. Of course I don't know if these are related.

Dave Townsend [:mossop]

Updated

•

13 years ago

Depends on: 632343

Take note of forced GC cancels 13 years ago Steve Fink [:sfink] [:s:] 1.64 KB, patch		Details \| Diff \| Splinter Review
Log of failure to turn on 13 years ago Steve Fink [:sfink] [:s:] 6.54 KB, text/plain		Details
Allow debugging to turn on when dormant contexts are on the XPCJSContextStack 13 years ago Steve Fink [:sfink] [:s:] 1.37 KB, patch	dmandelin : review+	Details \| Diff \| Splinter Review
Part 2: jsd.off() should take effect immediately 13 years ago Steve Fink [:sfink] [:s:] 1.33 KB, patch	dmandelin : review+	Details \| Diff \| Splinter Review
Allow debugging to turn on when dormant contexts are on the XPCJSContextStack 13 years ago Steve Fink [:sfink] [:s:] 1.81 KB, patch	dmandelin : review+	Details \| Diff \| Splinter Review