Closed Bug 1279129 Opened 3 years ago Closed 3 years ago

[jsdbg2] `findScripts` does not always return all scripts

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox50 --- fixed

People

(Reporter: jlong, Assigned: ejpbruel)

Details

Attachments

(2 files, 1 obsolete file)

Sliding breakpoints strike again... I'm trying to add a breakpoint on line 17, 18, or 19 but slides down to line 22: http://jlongster.com/s/upload/a1223781-f6d7-4000-ac6b-9fd2656fd6e7.png

This is on the URL http://todomvc.com/examples/react/#/ in `todoModel.js`.

This only happens once in a blue moon. We've been using this page as a test page for the new debugger and we've hit it a few times. I sat down town and could reliably reproduce it by randomly doing actions on the page and refreshing it a *ton*. About 5-10 min into randomly refreshing it, I can hit a session where I can't set the breakpoint.

The current sliding breakpoint algorithm is predicated on the fact that parent scripts *always* keep their children alive. But that doesn't seem to be the case here. I added logging at the top of `_setBreakpoint` (https://github.com/mozilla/gecko-dev/blob/master/devtools/server/actors/source.js#L711) to get all the scripts on that line and dump them out (the first 100 chars of them). When I can set the breakpoint on line 17, there are 2 scripts, which is correct:

JWL length: 2
JWL () {
	'use strict';

	var Utils = app.Utils;
	// Generic "model" object. You can use whatever
	// fr
JWL (key) {
		this.key = key;
		this.todos = Utils.store(key);
		this.onChanges = [];
	}

The first one is the IIFF, and the second one is the TodoModel constructor.

When I cannot set a breakpoint, the constructor script is gone:

JWL length: 1
JWL () {
	'use strict';

	var Utils = app.Utils;
	// Generic "model" object. You can use whatever
	// fr

I will dig into this more and see if I can get any more info. Is there any reason why parent scripts will not keep their children alive?
I'm actually glad that sliding breakpoints exposed this behavior because parents not keeping their children alive should be a bug. I suspect we may have a bug in the script store, which we've been meaning to remove anyway
I added some more logging that explicitly calls `getChildScripts` on the IIFE that should have the constructor as a child, and I have not been able to reproduce for 15 min. Is there any way there's a weird interaction which is causing a bug in the Debugger API? We use `findScripts` to populate the script store which is where the breakpoint functions get scripts.
Of course, right when I say that, I'm able to reproduce. It looks like `getChildScripts()` is still returning the right number of child scripts. My guess it that the script does still exist, but it's not being populated in the script store correctly for some reason.
Alright, I've reduced this down a little farther. This seems to reproduce it somewhat reliably:

1. Go to http://todomvc.com/examples/react/#/
2. Open devtools, go to `togoModel.js` and try to set a breakpoint on line 18. It should set properly.
3. Close devtools
4. Wait ~5 seconds (maybe interact with the page some)
5. Open devtools again (don't reload page)
6. Try to set a breakpoint in the same place.
7. If breakpoint doesn't slide, repeat 3-6

I was able to reproduce a few times in a row with this method. Closing & re-opening the devtools on the same page seems to aggravate it (able to repro within ~1 min). I was able to reproduce by refreshing the page and opening the devtools in rare cases; it took a *lot* longer for it to appear (~10-15 min of testing).

I added logging to isolate the issue. In the first comment I showed that for some reason the script store isn't giving me the inner script, and thus we slide across it. I added more logging to use `getChildScripts()`, count them, and show the first script in that array (which should be the one we want).

In `_setBreakpoint`, first I logged the scripts that I got back from the script store for line 18:

JWL length: 1
JWL () {
	'use strict';

	var Utils = app.Utils;
	// Generic "model" object. You can use whatever
	// fr

This is wrong. It should have 2 scripts, and the second one should be the inner script. Now I logged the result from `script.getChildScripts()` to see if it even exists as a child of the script from above:

JWL 9 scripts
JWL first child: (key) {
		this.key = key;
		this.todos = Utils.store(key);
		this.onChanges = [];
	}

This is correct. The script from the script store has 9 child scripts, and first child is the inner script that *should* have been returned.

Next, I added logging in the script store in two methods: `addScripts` and `getScriptsBySourceAndLine`. The former is called when the devtools is loaded to get all the scripts on the existing page, and latter is called by `_setBreakpoint` to get the scripts. I simply logged the total number of scripts that exists in the store. The results:

// When the breakpoint is set correctly:

JWL addScripts: 2061
JWL getScriptsBySourceAndLine count: 2061

// Close and re-open devtools and the breakpoint is wrong:

JWL addScripts: 924
JWL getScriptsBySourceAndLine count: 924

The debugger uses `findScripts` to get all the scripts passed to `addScripts`. So `findScripts` is not getting all of the scripts (1137 are missing), or they are somehow GCed even though we can still reference them.

At this point, I'm not sure what's wrong but it's definitely something inside the Debugger API, probably within `findScripts`. In fact, I just closed and re-opened the devtools and the breakpoint *now works*, which means this has nothing to do with garbage collection. It's simply a bug in `findScripts`. It's not finding all the scripts that it should sometimes.
Summary: [jsdbg2] Parent scripts don't always keep their children alive → [jsdbg2] `findScripts` does not always return all scripts
Attached patch 1279129.patchSplinter Review
Here is a mochitest that seems to reliably repro it for me. With this applied, run `/mach mochitest --subsuite devtools --run-until-failure devtools/client/debugger/test/mochitest/browser_dbg_slide.js` to run it. After the toolbox opens a few times you should start seeing a test failure.

With this, you should be able to get it into a bad state and see what is going on. It sounded like Eddy is interested in working on this.

I can try to work on a more minimal test case later on but I don't have time right now.
Marking P1 because this is a critical bug. Even if we turned off breakpoint sliding, it won't be setting breakpoints on the inner scripts so breakpoints won't be hit. It's not a problem with sliding itself. We need these inner scripts so we can set breakpoints on them.
Priority: -- → P1
Also assigning to Eddy because he said he was interested. If not we should find someone else who can fix this soon.
Assignee: nobody → ejpbruel
Started to look into this.
I think I figured out what is going on here:

In the debugger, on page reload, we first remove all globals for the page as debuggee. At that point, in order to save memory, SpiderMonkey may decide to relazify the scripts in the globals' compartment. We then re-add the globals for the page as debuggee (if the reload is actually a navigation, the set of globals for the page will have changed).
    
If we then call findScripts in the debugger, any lazified scripts should automatically be delazified. However, this only happens if the compartment in
which those lazified scripts live has been explicitly marked as containing scripts that need delazification. We do this when a script is lazified, or when
two compartments are merged, but not when scripts are relazified.
    
As a consequence, scripts that are relazified during a reload will not be unlazified, making them effectively invisible to the debugger. To avoid this,
we need to explicitly mark schedule each compartment for delazification when a script in that compartment is delazified.

With the attached patch applied, I can no longer reproduce the bug. James, can you verify that this is indeed the case?

Shoutout to Jan, without whom's help I would have never found the issue so quickly!
Flags: needinfo?(jlong)
Attachment #8767961 - Flags: review?(jdemooij)
Sounds like a concrete reason for the behavior that makes sense with my mental model of how to reproduce it. I can't get the test to fail anymore for me. This definitely fixes it.

Thanks a lot for tracking this down! This is worthy of uplifting IMHO, it's a 1-line patch that seems relatively harmless and fixes a rather serious debugger issue.
Flags: needinfo?(jlong)
Would you mind writing a jit-test for this? It shouldn't be too hard I think. I can also take a stab at it if needed.
(In reply to Jan de Mooij [:jandem] from comment #11)
> Would you mind writing a jit-test for this? It shouldn't be too hard I
> think. I can also take a stab at it if needed.

Sure, but I might need some guidance. Now that James confirmed that the patch fixes the issue, I'll rewrite it with a test tomorrow.
Comment on attachment 8767961 [details] [diff] [review]
Should schedule delazification for Debugger when relazifying scripts.

Review of attachment 8767961 [details] [diff] [review]:
-----------------------------------------------------------------

Cancelling review for now until we have a test.

::: js/src/jsfun.cpp
@@ +1779,5 @@
>          MOZ_ASSERT(isExtended());
>          MOZ_ASSERT(getExtendedSlot(LAZY_FUNCTION_NAME_SLOT).toString()->isAtom());
>      }
> +
> +    script->compartment()->scheduleDelazificationForDebugger();

Nit: this can be comp->scheduleDelazificationForDebugger();
Attachment #8767961 - Flags: review?(jdemooij)
New patch with test added.
Attachment #8767961 - Attachment is obsolete: true
Attachment #8768403 - Flags: review?(jdemooij)
Comment on attachment 8768403 [details] [diff] [review]
Should schedule delazification for Debugger when relazifying scripts.

Review of attachment 8768403 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, thanks for writing the test.

::: js/src/jsfun.cpp
@@ +1779,5 @@
>          MOZ_ASSERT(isExtended());
>          MOZ_ASSERT(getExtendedSlot(LAZY_FUNCTION_NAME_SLOT).toString()->isAtom());
>      }
> +
> +    script->compartment()->scheduleDelazificationForDebugger();

Nit: this can be |comp->scheduleDelazificationForDebugger();|
Attachment #8768403 - Flags: review?(jdemooij) → review+
Would be great to get this checked in :)
Flags: needinfo?(ejpbruel)
I'm on it.
Flags: needinfo?(ejpbruel)
Pushed by ejpbruel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/81cb60fb8fbd
Should schedule delazification for Debugger when relazifying scripts. r=jandem
https://hg.mozilla.org/mozilla-central/rev/81cb60fb8fbd
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in before you can comment on or make changes to this bug.