Closed Bug 909121 Opened 11 years ago Closed 11 years ago

Inspector breaks when navigating backwards

Categories

(DevTools :: Inspector, defect)

x86
macOS
defect
Not set
normal

Tracking

(firefox24 unaffected, firefox25+ verified, firefox26 verified, firefox27 verified)

VERIFIED FIXED
Firefox 26
Tracking Status
firefox24 --- unaffected
firefox25 + verified
firefox26 --- verified
firefox27 --- verified

People

(Reporter: fitzgen, Assigned: dcamp)

References

Details

Attachments

(3 files, 2 obsolete files)

Not 100% what the STR is yet, but I was on hacker news, inspecting things with Cmd+Opt+C and going back and forth  between the debugger and inspector with Cmd + [ and Cmd + ]. Accidentally went forward (and then back again) in history. Next thing I know, the tree view is blank and white.

Getting these errors in the console:

baseNode is null

WalkerActor<.querySelector<@resource://gre/modules/commonjs/toolkit/loader.js -> resource://gre/modules/devtools/server/actors/inspector.js:1218
actorProto/</handler@resource://gre/modules/commonjs/toolkit/loader.js -> resource://gre/modules/devtools/server/protocol.js:906
DSC_onPacket@resource://gre/modules/devtools/dbg-server.jsm -> resource://gre/modules/devtools/server/main.js:932
@resource://gre/modules/devtools/dbg-server.jsm -> resource://gre/modules/devtools/server/transport.js:251
@resource://gre/modules/devtools/dbg-server.jsm -> resource://gre/modules/devtools/DevToolsUtils.js:72
Attached patch fix-unload.diff (obsolete) — Splinter Review
Can reproduce this by going Back.

The problem here is that the tabNavigated event comes immediately, but the documentUnload is a mutation that might be delivered a bit later.

Two possible solutions: we could stop caring about 'tabNavigated' in the inspector, and watch for the 'newRoot' mutation instead.  That would work.

This solution just lets the walker front know that it's going to be unloaded, for sure, so it can drop its stuff now.

I actually prefer the first solution, but here's a patch for the second.
Summary: I broke the html tree view somehow → Inspector breaks when navigating backwards
Attached patch fix-unload.diff (obsolete) — Splinter Review
So this bug is caused by the fact that a tab actor's 'navigate' signal is delivered immediately, but the walker front isn't ready for navigation until it has processed mutations, which might come later.

Since the whole inspector is concerned with the walker actor's view of the world, it makes sense to just wait for the walker to catch up before processing a navigation in the inspector.
Attachment #795496 - Attachment is obsolete: true
Attachment #801813 - Flags: review?(jwalker)
Attachment #801813 - Flags: feedback?(bgrinstead)
This affects Firefox 25, and is pretty nasty.
Comment on attachment 801813 [details] [diff] [review]
fix-unload.diff

Review of attachment 801813 [details] [diff] [review]:
-----------------------------------------------------------------

With the patch applied, it seems to work, but if I open devtools, navigate to a new page, and then go back I get a console.trace from line 2009 in inspector.js: console.trace("Got a mutation for an unexpected actor: " + targetID + ", please file a bug on bugzilla.mozilla.org!");

console.trace: 
_actors/inspector.js 2009 exports.WalkerFront<.getMutations</<
_sdk/core/promise.js 118 resolve
_sdk/core/promise.js 43 then
_sdk/core/promise.js 185 resolve
_sdk/core/promise.js 118 resolve
_sdk/core/promise.js 43 then
_sdk/core/promise.js 185 resolve
_sdk/core/promise.js 118 resolve
_sdk/core/promise.js 43 then
_sdk/core/promise.js 185 resolve
_/server/protocol.js 1057 Front<.onPacket
_ools/dbg-client.jsm 657 DC_onPacket/<
_sdk/core/promise.js 118 resolve
_sdk/core/promise.js 43 then
_sdk/core/promise.js 153 then
_ools/dbg-client.jsm 698 DC_onPacket
_server/transport.js 255 LDT_send/<
_ls/DevToolsUtils.js 72 makeInfallible/<

I'd like to try implementing your first possible solution, and have a patch incoming.
Attachment #801813 - Flags: feedback?(bgrinstead) → feedback-
Attached patch 909121.patchSplinter Review
Stop caring about 'tabNavigated' in the inspector, and watch for the 'newRoot' mutation instead.

Is this a reasonable solution?  I could definitely be missing something, but I've tested loading new pages and going back/forward.  Alternatively, we could figure out what is causing console.trace() in the first path.
Attachment #802468 - Flags: feedback?(dcamp)
Comment on attachment 802468 [details] [diff] [review]
909121.patch

Review of attachment 802468 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, that seems reasonable.
Attachment #802468 - Flags: feedback?(dcamp) → review+
I also opened Bug 914744, which I bumped into while debugging this.  It has to do with an error in the highlighter when changing nodes during the time in between starting to load a new page and actually loading the page.  I think it is a separate issue entirely, but it may affect the approach to take here.
Comment on attachment 802468 [details] [diff] [review]
909121.patch

Review of attachment 802468 [details] [diff] [review]:
-----------------------------------------------------------------

I was taking a more thorough look at this, and realized that neither of these patches will be good.  Both fix the back button, but they both break on multiple page changes.  So if I navigate from Page A -> Page B that works, but from Page B -> Page C they both fail.  

Attachment #801813 [details] [diff] fails by completely removing the markup view (much like the unpatched version does when pressing back)

Attachment #802468 [details] [diff] fails by keeping the markup tree of Page B no matter how many pages you revisit.  It seems that the then callback _rootNodeDeferred is not firing after the second page is loaded, which causes new-root to not fire. Probably similar to what is happening with the first patch, and could be a timing issue.
Attachment #802468 - Flags: review+ → review-
Attachment #801813 - Flags: review?(jwalker) → review-
(In reply to Brian Grinstead [:bgrins] from comment #8)
> Comment on attachment 802468 [details] [diff] [review]
> 909121.patch
> 
> Review of attachment 802468 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I was taking a more thorough look at this, and realized that neither of
> these patches will be good.  Both fix the back button, but they both break
> on multiple page changes.  So if I navigate from Page A -> Page B that
> works, but from Page B -> Page C they both fail.  
> 
> Attachment #801813 [details] [diff] [diff] fails by completely removing the markup
> view (much like the unpatched version does when pressing back)
> 
> Attachment #802468 [details] [diff] [diff] fails by keeping the markup tree of Page
> B no matter how many pages you revisit.  It seems that the then callback
> _rootNodeDeferred is not firing after the second page is loaded, which
> causes new-root to not fire. Probably similar to what is happening with the
> first patch, and could be a timing issue.

More information: this navigation problem (open devtools, naviagate from Page A -> Page B -> Page C) also happens to the unpatched source.  It was working on nightly 26.0a1 (2013-09-06).

Neither the newRoot nor the documentUnload mutations are firing after Page B loads.  You see a brief "flicker" on the unpatched one because it is listening for the "navigate" event from the frontend on inspector-panel, whereas with Attachment #802468 [details] [diff] applied it doesn't change at all (as the newRoot isn't happening).  I'm looking into this more, but I may need some more information, as I'm not too familiar with the lifecycle of the WalkerActor and WalkerFront.  Is there a place I should start or anything that stands out as a problem?

Seems like the onFrameUnload is bailing out before the mutatation at 

    if (!documentActor) {
      return;
    }

and the onFrameLoad is skipping the mutation out with rootDoc not falsy here:

    if (!frame && !this.rootDoc) {

    }
Flags: needinfo?(dcamp)
(In reply to Brian Grinstead [:bgrins] from comment #9)
> (In reply to Brian Grinstead [:bgrins] from comment #8)
> > Comment on attachment 802468 [details] [diff] [review]
> > 909121.patch
> > 
> > Review of attachment 802468 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > I was taking a more thorough look at this, and realized that neither of
> > these patches will be good.  Both fix the back button, but they both break
> > on multiple page changes.  So if I navigate from Page A -> Page B that
> > works, but from Page B -> Page C they both fail.  
> > 
> > Attachment #801813 [details] [diff] [diff] [diff] fails by completely removing the markup
> > view (much like the unpatched version does when pressing back)
> > 
> > Attachment #802468 [details] [diff] [diff] [diff] fails by keeping the markup tree of Page
> > B no matter how many pages you revisit.  It seems that the then callback
> > _rootNodeDeferred is not firing after the second page is loaded, which
> > causes new-root to not fire. Probably similar to what is happening with the
> > first patch, and could be a timing issue.
> 
> More information: this navigation problem (open devtools, naviagate from
> Page A -> Page B -> Page C) also happens to the unpatched source.  It was
> working on nightly 26.0a1 (2013-09-06).
> 
> Neither the newRoot nor the documentUnload mutations are firing after Page B
> loads.  You see a brief "flicker" on the unpatched one because it is
> listening for the "navigate" event from the frontend on inspector-panel,
> whereas with Attachment #802468 [details] [diff] [diff] applied it doesn't change
> at all (as the newRoot isn't happening).  I'm looking into this more, but I
> may need some more information, as I'm not too familiar with the lifecycle
> of the WalkerActor and WalkerFront.  Is there a place I should start or
> anything that stands out as a problem?
> 
> Seems like the onFrameUnload is bailing out before the mutatation at 
> 
>     if (!documentActor) {
>       return;
>     }
> 
> and the onFrameLoad is skipping the mutation out with rootDoc not falsy here:
> 
>     if (!frame && !this.rootDoc) {
> 
>     }
 
The error described in Comment #8 is not a problem for all pages.  It only seems to happen when the page fires the onFrameLoad too close to onFrameUnload (I was seeing it when Page B is an "Unable to connect" error like http://www.asdfasdfasdfasdfasfdasdf.com/).  Pretty much once that page loads things go awry.  However, if I wrap the onFrameLoaded callback in a timeout it works.
So, I believe we are bumping into an issue with the nsIWebProgressListener - https://developer.mozilla.org/en-US/docs/XPCOM_Interface_Reference/nsIWebProgressListener.

I've attached a patch with some additional logging to track down the issue.  We are listening for the final STATE_STOP event (called with STATE_IS_WINDOW) to determine when the page has finished loading.  However, on certain pages (at least on pages in which Firefox cannot establish a connection with the server), this event fires before the document is updated with the new URL.

Here are the steps to reproduce:

Open DevTools, navigate to bgrins.github.io.  This is the normal sequence of events, and works during page navigations.  The URLs being logged are gotten from progress.DOMWindow.document.location.toString().

  console.log: onLocationChange:  http://bgrins.github.io/
  console.log: onStatusChange:  http://bgrins.github.io/ Read bgrins.github.io
  console.log: onStateChange: Window STOP  http://bgrins.github.io/ <-- Works fine

Navigate to 0.0.0.1 (Firefox can't establish a connection to the server at 0.0.0.1).

  console.log: onStatusChange:  http://bgrins.github.io/ Connecting to 0.0.0.1.
  console.log: onStateChange: Window STOP  http://bgrins.github.io/ <-- Breaks here, since this is the event we are listening for to update the UI when it loads.  Expecting the URL here to be http://0.0.0.1/.
  console.log: onLocationChange:  http://0.0.0.1/

What I can't figure out is why the final STATE_STOP fires with the previous document location, and why it is firing before onLocationChange.  If I set a breakpoint / watch expression the location or wrap the callbacks in a short timeout, the document is properly updated because of the additional timeout.

Is this a bug with the nsIWebProgressListener, or just a misunderstanding of the interface?  Is there a better way to listen for the completion of the request?
Flags: needinfo?(dcamp) → needinfo?(bzbarsky)
It's entirely possible that the bizarre way in which error pages load confuses the progress listener code...

Could you try getting the url from aRequest instead of DOMWindow.document.location?
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] from comment #12)
> It's entirely possible that the bizarre way in which error pages load
> confuses the progress listener code...
> 
> Could you try getting the url from aRequest instead of
> DOMWindow.document.location?

Thanks for the information Boris,

I am seeing the correct URL in the aRequest.name property as you suggested.  However, for our purposes, the timing (state of the document when the stop event is fired) is still triggering a bug.

Looking into the problem I found this in DocManager.cpp: https://mxr.mozilla.org/mozilla-central/source/accessible/src/base/DocManager.cpp#291:

  // XXX: handle error pages loading separately since they get neither
  // webprogress notifications nor 'pageshow' event.

It seems to be calling the event on DOMContentLoaded if it is an error pages, while for other pages it is waiting for the OnStateChange notification.

The issue this is causing with our code is that not only is the URL wrong, the actual DOM of the document is still the same as the old one when this event fires.  The aWebProgress.DOMWindow.document.body.innerHTML is still set to the innerHTML of the old page when the STOP event fires on the error page.  And since the document.readyState is complete we can't wait for the DOMContentLoaded event.  There may be another way involving timeouts or mutations, waiting for the document state to be consistent with what we expect, but it seems that we should be able to detect the load consistently on all pages, including error pages.
Error page loading is completely weird.  Dealing with them is a major hassle...

I suggest looking for a OnLocationChange, then if it's an error page waiting for DOMContentLoaded or whatnot and if not waiting for OnStateChange.  :(
(In reply to Boris Zbarsky [:bz] from comment #14)
> Error page loading is completely weird.  Dealing with them is a major
> hassle...
> 
> I suggest looking for a OnLocationChange, then if it's an error page waiting
> for DOMContentLoaded or whatnot and if not waiting for OnStateChange.  :(

It's odd, document.readyState seems to already be "complete" during OnLocationChange, so DOMContentLoaded never fires for the error page only (does fire for normal pages).  Seems like this could get a little tricky.  I'm thinking we should go ahead and fix the back button issue here and fix the error page issue in another bug.  I've opened Bug 916407 to try and deal with this case.  I wonder if it would be possible to resolve the issue with error page loading from the platform rather than working around it in the code here.  I'd be happy to work on that, but would need to be mentored.
Wild guess: we should not use DOMContentLoaded, but webProgressListeners.
Attachment #802468 - Flags: review-
Comment on attachment 802468 [details] [diff] [review]
909121.patch

Review of attachment 802468 [details] [diff] [review]:
-----------------------------------------------------------------

Resetting r+ from :dcamp.  Going to handle additional issue with error pages on Bug 916407
Attachment #802468 - Flags: review+
(In reply to Paul Rouget [:paul] from comment #16)
> Wild guess: we should not use DOMContentLoaded, but webProgressListeners.

We are using webProgressListeners, but their behavior is different than normal for error pages (Comment 11,  Comment 13, Comment 14).  Using DOMContentLoaded would be a potential workaround for those particular pages, but even that does not seem to work since document.readyState is complete even during OnLocationChange.
Attachment #801813 - Attachment is obsolete: true
https://hg.mozilla.org/integration/fx-team/rev/69b30be06df3
Assignee: nobody → dcamp
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/69b30be06df3
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 26
(In reply to Dave Camp (:dcamp) from comment #3)
> This affects Firefox 25, and is pretty nasty.

Would only track if this was a recent regression or recently exacerbated. Is that the case?

Otherwise, we'd just accept an uplift (but no need to track).
Fixing error page loads in the platform would be great.  I'm not sure there's a good mentor, though: no one really knows that code that well.  Except maybe biesi.
(In reply to Alex Keybl [:akeybl] from comment #21)
> (In reply to Dave Camp (:dcamp) from comment #3)
> > This affects Firefox 25, and is pretty nasty.
> 
> Would only track if this was a recent regression or recently exacerbated. Is
> that the case?
> 
> Otherwise, we'd just accept an uplift (but no need to track).

This regressed in Firefox 25 with the remote inspector.
Comment on attachment 802468 [details] [diff] [review]
909121.patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 805526, but bug 879520 will manifest this in 25
User impact if declined: Back button will leave the developer tools useless
Testing completed (on m-c, etc.): On m-c
Risk to taking this patch (and alternatives if risky): Risk is limited to inspector problems during navigation.
String or IDL/UUID changes made by this patch: None.
Attachment #802468 - Flags: approval-mozilla-beta?
Attachment #802468 - Flags: approval-mozilla-aurora?
Attachment #802468 - Flags: approval-mozilla-beta?
Attachment #802468 - Flags: approval-mozilla-beta+
Attachment #802468 - Flags: approval-mozilla-aurora?
Attachment #802468 - Flags: approval-mozilla-aurora+
Needs a branch-specific patch for the Fx25 uplift.
Flags: needinfo?(dcamp)
I can get one today.
Attached patch beta patchSplinter Review
Flags: needinfo?(dcamp)
Keywords: verifyme
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:25.0) Gecko/20100101 Firefox/25.0
Build ID: 20130923194050

I can confirm that on Firefox 25 beta 2 the Inspector does not break when navigating backwards but I'm still getting a JS error when following the directions given in the Description and in Comment 1:
The error is:

TypeError: can't access dead object: ElementStyle_addElementRules@resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/styleinspector/rule-view.js:146
ElementStyle_populate@resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/styleinspector/rule-view.js:129
CssRuleView_nodeChanged@resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/styleinspector/rule-view.js:961
RVT_refresh@resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/styleinspector/style-inspector.js:125
EventEmitter_emit@resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/shared/event-emitter.js:107
@resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/inspector/inspector-panel.js:692
 @ resource://gre/modules/commonjs/toolkit/loader.js -> resource:///modules/devtools/shared/event-emitter.js:112

Also on the latest Nightly I'm still seeing the error mentioned in the Description:

baseNode is null

"WalkerActor<.querySelector<@resource://gre/modules/commonjs/toolkit/loader.js -> resource://gre/modules/devtools/server/actors/inspector.js:1329actorProto/</handler@resource://gre/modules/commonjs/toolkit/loader.js -> resource://gre/modules/devtools/server/protocol.js:906DSC_onPacket@resource://gre/modules/devtools/dbg-server.jsm -> resource://gre/modules/devtools/server/main.js:1018@resource://gre/modules/devtools/dbg-server.jsm -> resource://gre/modules/devtools/server/main.js -> resource://gre/modules/devtools/server/transport.js:255@resource://gre/modules/devtools/dbg-server.jsm -> resource://gre/modules/devtools/server/main.js -> resource://gre/modules/devtools/DevToolsUtils.js:72"

Any thoughts?
Flags: needinfo?(dcamp)
I believe that the "dead object" error (and maybe baseNode is null) is resolved by Bug 917448, but it probably has not made its way into Nightly yet.
Flags: needinfo?(dcamp)
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:25.0) Gecko/20100101 Firefox/25.0
Build ID: 20131001024718

Considering that on Firefox 25 beta 4 - the Inspector does not break when navigating backwards and that Bug 917448 (based on Comment 31) takes care of the errors mentioned in Comment 30, I'm setting the status for Firefox 25 to verified.
verified with Nightly build 20131024030204 and the latest Aurora Build
Status: RESOLVED → VERIFIED
Keywords: verifyme
See Also: → 983386
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: