Closed Bug 785704 Opened 12 years ago Closed 7 years ago

Source text is never received from the active thread when hitting a breakpoint too soon

Categories

(DevTools :: Debugger, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: vporof, Assigned: past)

References

Details

(Whiteboard: [fixed-in-fx-team][leave open])

Attachments

(2 files, 6 obsolete files)

It takes some perseverance, but I managed to get the source editor into a "Loading..." state (basically waiting for the channel's asyncOpen to finish in SS__onLoadSource, or so it seems).

STR:
1. go to http://support.mozilla.org/en-US/kb
2. add a breakpoint on the fourth line in build:6d3703a
3. refresh it like it's 1999

I think black eagle's source fetching patches (bug 772119 iirc) may autofix this.
Depends on: 755661
Priority: -- → P1
Assignee: nobody → vporof
Status: NEW → ASSIGNED
Attached file protocol logs (obsolete) —
Ok, so bug 795368 changed the behavior a bit. Now in the frontend, "Loading..." doesn't appear anymore and it looks like activeThread.source never callbacks when trying to get a source's text (in SS_getText from debugger-controller.js), although the request was made correctly/successfully to the server.

However the breakpoint hits and everything else works as expected (variables are fetched in scope, stackframes displayed etc.).

Nick, wanna take a look at this please?
Flags: needinfo?(nfitzgerald)
Summary: Hitting a breakpoint too soon may cause source to not be displayed in the debugger → Source text is never received from the active thread when hitting a breakpoint too soon
Attachment #719898 - Attachment is patch: false
I can always reproduce this:

1. Open debugger on http://grssam.com/test/test.html
2. Refresh

Source text is not displayed, debugger is paused.

As far as I can tell, the issue is that the thread actor gets paused before it gets a chance to send out the sources.
I believe the problem is that in this case (break during page loading - script parsing) we never get the second tabNavigated event, so DC__onTabNavigated doesn't instruct SourceScripts to fetch the sources.

It isn't related to Nick's change, I see the same behavior in Nightly. I know that we don't get the page title in the first tabNavigated event, but I can't remember if we had a reason to not trigger frontend updates on the first event and ignore the second one instead.
(In reply to Victor Porof [:vp] from comment #3)
> I can always reproduce this:
> 
> 1. Open debugger on http://grssam.com/test/test.html
> 2. Refresh
> 
> Source text is not displayed, debugger is paused.
> 
> As far as I can tell, the issue is that the thread actor gets paused before
> it gets a chance to send out the sources.

I created that page to show that debugger considers two scripts are one, that is js.php?type=1 and js.php?type=2 are both considered as a single script (js.php) . Maybe that confusion is the cause of this ?
(In reply to Girish Sharma [:Optimizer] from comment #5)
> 
> I created that page to show that debugger considers two scripts are one,
> that is js.php?type=1 and js.php?type=2 are both considered as a single
> script (js.php) . Maybe that confusion is the cause of this ?

No, that's a completely different problem, solved by bug 812083.
(In reply to Panos Astithas [:past] from comment #4)
> I believe the problem is that in this case (break during page loading -
> script parsing) we never get the second tabNavigated event, so
> DC__onTabNavigated doesn't instruct SourceScripts to fetch the sources.
> 

Ah, that's a very plausible explanation. It seems that the request is definitely being made to the server to send out the source text, but that never gets fulfilled, and it's certainly possible that this happens because we don't getSources() in time.

> It isn't related to Nick's change, I see the same behavior in Nightly. I
> know that we don't get the page title in the first tabNavigated event, but I
> can't remember if we had a reason to not trigger frontend updates on the
> first event and ignore the second one instead.

Maybe just reordering the way tab navigations are handled in the frontend could do the trick.
Attached patch re-enable tab navigation (obsolete) — Splinter Review
Enabled tab navigation handling for ThreadState, StackFrames and SourceScripts as discussed. At first glance, this seems to do the trick, however, intermittently, the following behavior occurs:

(there are two sources avaialble, js.php?type=1 and js.php?type=2, let's call them A and B for brevity)

1. Page is refreshed, tab navigated event is fired.
2. SourceScripts tries to getSources. Only source A is retrieved as part of the response
3. Server got B as a newSource, but the frontend listener SS__onNewSource isn't fired in time
4. First breakpoint hits in B
5. The frontend tries to update the editor with source B, but the text is never retrieved
Attachment #719898 - Attachment is obsolete: true
Attachment #719943 - Flags: feedback?(past)
(In reply to Panos Astithas [:past] from comment #4)
> 
> It isn't related to Nick's change, I see the same behavior in Nightly. I
> know that we don't get the page title in the first tabNavigated event, but I
> can't remember if we had a reason to not trigger frontend updates on the
> first event and ignore the second one instead.

Hmm, it seems it is somehow related to bug 795368. I can reproduce this with almost any page now, for example:

1. Go to http://todomvc.com/architecture-examples/backbone/ and open debugger
2. Set a breakpoint on line 32 in app.js
3. Refresh
4. Wait for the breakpoint to be hit
5. Refresh again

There's a spew of errors in the Error console, like

Error: can't set breakpoint: script global is not a debuggee
Source File: chrome://global/content/devtools/dbg-script-actors.js
Line: 576

or 

Error: Error: Error getting sources: unknownError
Source File: chrome://browser/content/debugger-controller.js
Line: 1174

or 

Error: can't set breakpoint: script global is not a debuggee
Source File: chrome://global/content/devtools/dbg-script-actors.js
Line: 576
Blocks: 795368
If you make the following changes (which should probably be committed):

diff --git a/browser/devtools/debugger/debugger-controller.js b/browser/devtools/debugger/debugger-controller.js
index 22bab7e..6021893 100644
--- a/browser/devtools/debugger/debugger-controller.js
+++ b/browser/devtools/debugger/debugger-controller.js
@@ -1171,7 +1171,7 @@ SourceScripts.prototype = {
    */
   _onSourcesAdded: function SS__onSourcesAdded(aResponse) {
     if (aResponse.error) {
-      Cu.reportError(new Error("Error getting sources: " + aResponse.error));
+      Cu.reportError(new Error("Error getting sources: " + aResponse.message));
       return;
     }


you get the following error message which is slightly better:

Error: Error: Error getting sources: error occurred while processing 'sources' request: Error: can't set breakpoint: script global is not a debuggee
Source File: chrome://browser/content/debugger-controller.js
Line: 1174
Flags: needinfo?(nfitzgerald)
Attached patch wat (obsolete) — Splinter Review
After some digging around with Nick we found out that _addScript from _discoverScriptsAndSources is behaving weirdly in between tab navigation events (after "start" and before "stop"). This scenario is only possible if the active thread got paused while on a breakpoint during page load, in which case Debugger.Script throws an "Error: can't set breakpoint: script global is not a debuggee". The culprit line seems to be
> s.setBreakpoint(offsets[i], bpActor);
in TA__setBreakpoint. 

Now, the initial guess was "let's put that into a try catch" and things seemed to be working as expected, even without the previous patch (attachment 719943 [details] [diff] [review]) and the scenario described in comment #10 was also fixed.

However, I started writing some tests and the problem persisted: the source wasn't visible after a refresh if a breakpoint was set, with or without the previous patch.

Panos, halp! :)
Attachment #720967 - Flags: feedback?(past)
Ok, so what is happening is when the server detaches due to the refresh, we blow away the threadLifetimePool, which contains all the source actors.

We can't really promote source actors to a foreverLifeTimePool (obviously) so I guess we will need to watch for noSuchActor errors and just try again?
Attached patch Patch v2 (obsolete) — Splinter Review
So apparently there was a regression from bug 795368, that I didn't catch during review: newSource events weren't handled, because the listener was being added to the wrong client. This fixes the case mentioned in comment 10, but isn't enough for the case mentioned in comment 3, or to get the test to pass.

The problem is that the server never loads the requested source in those cases, for some reason. What I see from the additional logging is that the channel listener never gets called. I can't see what is different in this case from others where it works as expected though.
Attachment #719943 - Attachment is obsolete: true
Attachment #720967 - Attachment is obsolete: true
Attachment #719943 - Flags: feedback?(past)
Attachment #720967 - Flags: feedback?(past)
Comment on attachment 721306 [details] [diff] [review]
Patch v2

Review of attachment 721306 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/debugger/server/dbg-script-actors.js
@@ -1192,5 @@
> -        startLine: aScript.startLine,
> -        lineCount: aScript.lineCount,
> -        source: this._getSource(aScript.url).form()
> -      });
> -    }

Woah, are we never sending newScript packets anymore? I thought we discussed this and decided to keep them around?
Also, removing that will cause the backwards compat tests to fail.
(In reply to Nick Fitzgerald [:fitzgen] from comment #15)
> Comment on attachment 721306 [details] [diff] [review]
> Patch v2
> 
> Review of attachment 721306 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/devtools/debugger/server/dbg-script-actors.js
> @@ -1192,5 @@
> > -        startLine: aScript.startLine,
> > -        lineCount: aScript.lineCount,
> > -        source: this._getSource(aScript.url).form()
> > -      });
> > -    }
> 
> Woah, are we never sending newScript packets anymore? I thought we discussed
> this and decided to keep them around?

I mainly did this to be able to read packet logs without going crazy. We did leave it in explicitly for forwards compat (old client -> new server), but to be honest, I'm beginning to think that it was not a very good idea.
I am fine with removing it in the future, but this means we will need to rewrite either testcompatactors.js or the whole tests for backwards compatibility.

I get that we are getting the noSuchActor error because the threadLifetimePool that contained the source actor is getting blown away on refresh, but shouldn't we get that source again via onNewSource? Why don't we display the source again after we get the newSource packet?
Attached patch patch v3 (obsolete) — Splinter Review
In this patch, I build on our other patches and make it so that we don't do the infinite noSuchActor error loop, and make it so the test will actually run and finish, rather than timeout. It still doesn't pass, but it is easier to test against now. What event should I be listening to rather than doing the setTimeout in the test?

> The problem is that the server never loads the requested source in those cases,
> for some reason. What I see from the additional logging is that the channel
> listener never gets called. I can't see what is different in this case from
> others where it works as expected though.

This is because the actor which does the fetching is getting destroyed because it is part of the threadLifetimePool, which is blown away on reload. This is why we were getting noSuchActor errors, and then because the DebuggerView.Sources item attachment wasn't marked as loaded, we try to load it again, etc, etc, but we will always have a reference to a source actor that no longer exists so the loop won't stop.

I don't know why dropping all the sources, and letting our listeners of the newSource event repopulate DebuggerView.Sources isn't working, what with the preferredValue and all that. I'm looking into it. Although this is still a band aid, not really a solution to why bugs like this are cropping up.

I feel that instead of blowing away all of our threadLifetimePool actors immediately when the user hits reload, we should do it when the page is *actually* unloaded right before it is replaced with the refreshed/new one. As far as I know, we don't have any events for this. I think the tab navigation start/stop is the wrong cycle to mirror here. Does that make sense?
Attachment #721306 - Attachment is obsolete: true
I didn't get a chance to be more detailed in my previous comment, which confused things a bit, sorry about that. Let me try to fix it.

The reason we had to return early in DC__onTabNavigated when receiving tabNavigated:start is that at that point the old actors are still in place. This is still necessary.

During navigation the client used to be notified when a DOMWindowCreated occurred, which was the right time to get rid of any existing actor fronts (clients) and request new ones. In order to accommodate the web console we modified tabNavigated to trigger a little earlier than that (tabNavigated:start - when page loading is about to start) and a little later than that (tabNavigated:stop - when page loading has finished). The latter event is a good place to hang a request to fetch sources from, provided that the intermediate newSource events are properly handled, since this is the case for inline scripts.

If the client sends a "source" packet to a source actor obtained after receiving the tabNavigated:start packet, it will most likely get a noSuchActor error, since by the time the request will be processed, the actors will be removed, as you noticed. The source actors received after tabNavigated:stop will not have such problems, that's why with patch v2 I didn't see any such errors.

I tried adding a third tabNavigated:windowCreated event on DOMWindowCreated and hang the "sources" request off that, but it didn't provide any real benefit, besides clearing the script list sooner.

Note also that my previous patch fixed another bug that I discovered in SS__onNewSource that failed to filter out bogus URLs (f.e. self-hosted) after the scripts->sources change.

I think the above should clarify some things I left unexplained last time. Now, in regards to your question about setTimeout in the test, you shouldn't have to wait for anything, which was obvious from the previous form of the test, which was left waiting indefinitely without a source loaded in the editor.

I see that resuming the debugger after the state where the editor displays "Loading..." results in the source being loaded and delivered, so it seems to me that either the loading started before the server spun the nested event loop, or that suppressEventHandling somehow blocks channel operation. The latter seems more in line with the packet logs that I have observed, but I haven't heard of a similar platform change lately and I don't think we could have avoided hitting this case until now.
I'm too stupid for this bug.
Assignee: vporof → nobody
Status: ASSIGNED → NEW
Assignee: nobody → past
Status: NEW → ASSIGNED
Attached patch Part 1: fix for the regression (obsolete) — Splinter Review
The first part is just the the fix for the regression. Breaks browser_dbg_scripts-searching-files_ui.js for some reason.
Attachment #721505 - Attachment is obsolete: true
Attached patch v1.1Splinter Review
Tests pass.
Attachment #723728 - Attachment is obsolete: true
Attachment #723971 - Flags: review+
Blocks: 762160
The previous patch fixed the most critical issue, so I'm lowering the priority of the bug to reflect reality.
Priority: P1 → P2
I think this has been left open long enough.  Feel free to reopen if you feel otherwise.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: