Closed Bug 1914863 Opened 3 months ago Closed 3 months ago

NodePicker stops working after canceled navigation from about:newtab

Categories

(DevTools :: Inspector, defect, P3)

defect

Tracking

(firefox132 fixed)

RESOLVED FIXED
132 Branch
Tracking Status
firefox132 --- fixed

People

(Reporter: jdescottes, Assigned: jdescottes)

Details

Attachments

(1 file)

Not a recent regression, already happens for me in the release channel.

Need to refine STRs, but the following seem to work relatively consistently for me:

  • open about:newtab
  • open devtools > inspector
  • click on any link in the page, preferably to a slow loading page
  • during the navigation, navigate back
  • try to use the node picker

The highlighter will not be displayed. It can take a few tries to get it. Try to use gestures or a keyboard shortcut in order to navigate back relatively fast (don't need to be super fast either, but if the navigation is over or sufficiently advanced it won't trigger the bug)

Some notes about the STRs:

  • you can also reproduce by navigating from newtab to another page with the url bar, but I found it easier to repro with the link
  • I could not reproduce when navigating between two content pages
  • I could not reproduce when navigating from another about page (tried about:debugging, about:config, about:memory ...)

In the logs we can see:

console.error: "Error while calling actor 'inspector's method 'getHighlighterByType'" "Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIDOMWindowUtils.getRootBounds]"
console.error: ({})
JavaScript error: resource://devtools/client/fronts/inspector.js, line 167: Error: The target doesn't support creating highlighters by types or BoxModelHighlighter is unknown

I suspect we are failing on the NS_ENSURE_STATE at the beginning of https://searchfox.org/mozilla-central/rev/3f193a8bb0759058c2f9a9d7761ad73bc6ffd4b1/dom/base/nsDOMWindowUtils.cpp#2157-2183

NS_IMETHODIMP
nsDOMWindowUtils::GetRootBounds(DOMRect** aResult) {
  Document* doc = GetDocument();
  NS_ENSURE_STATE(doc);

  nsRect bounds(0, 0, 0, 0);
  PresShell* presShell = doc->GetPresShell();
  if (presShell) {
    ScrollContainerFrame* sf = presShell->GetRootScrollContainerFrame();
    if (sf) {
      bounds = sf->GetScrollRange();
      bounds.SetWidth(bounds.Width() + sf->GetScrollPortRect().Width());
      bounds.SetHeight(bounds.Height() + sf->GetScrollPortRect().Height());
    } else if (presShell->GetRootFrame()) {
      bounds = presShell->GetRootFrame()->GetRect();
    }
  }

  nsCOMPtr<nsPIDOMWindowOuter> window = do_QueryReferent(mWindow);
  RefPtr<DOMRect> rect = new DOMRect(window);
  rect->SetRect(nsPresContext::AppUnitsToFloatCSSPixels(bounds.x),
                nsPresContext::AppUnitsToFloatCSSPixels(bounds.y),
                nsPresContext::AppUnitsToFloatCSSPixels(bounds.Width()),
                nsPresContext::AppUnitsToFloatCSSPixels(bounds.Height()));
  rect.forget(aResult);
  return NS_OK;
}

The issue actually comes from a windowUtils cache handled in devtools, using a WeakMap keyed by window objects:

https://searchfox.org/mozilla-central/rev/490a1df802d8872f996f8ef4093d54e3c854c8f9/devtools/shared/layout/utils.js#37-51

/**
 * Returns the `DOMWindowUtils` for the window given.
 *
 * @param {DOMWindow} win
 * @returns {DOMWindowUtils}
 */
const utilsCache = new WeakMap();
function utilsFor(win) {
  // XXXbz Given that we now have a direct getter for the DOMWindowUtils, is
  // this weakmap cache path any faster than just calling the getter?
  if (!utilsCache.has(win)) {
    utilsCache.set(win, win.windowUtils);
  }
  return utilsCache.get(win);
}

We can try to check the performance impact of removing this, although we have many other call sites for windowUtils in devtools and none use such caching.

Doing quick tests locally, the getter seems 5 times slower than the weak map cache (including the hit test). That's with a dumb microbenchmark calling 100 000 times each. On my machine the figures are extremely low for both still: around 6 ms for weakmap cache, around 30 ms for win.windowUtils getter - again, for 100 000 calls.

To recap we have an about:newtab window which starts navigating to site A, but then we hit previous() and go back to about:newtab. At this point I guess this existing window is pulled from bfcache, but its windowUtils has been cleared/updated? And the previous windowUtils from our cache is no longer valid. I think windowUtils is owned by the outer window, but given https://firefox-source-docs.mozilla.org/dom/navigation/nav_replace.html I'm not sure how we could get the same window object, but a different outerWindow? Or maybe the outerWindow is still the same and only windowUtils has been cleared?

I'm not sure we should be concerned about the performance regression here, we could just get rid of the caching to solve the issue.
Emilio: do you think there is something worth checking on the platform side here, or should DevTools just stop caching windowUtils per window objects?

Flags: needinfo?(emilio)

Yeah unless it regresses real benchmarks seems worth just removing the cache.

Curious, do you have a profile of that micro-benchmark? It might be worth comparing and seeing if there's a bug to be filed in the js engine. But my guess is that some set operations are getting jitted while the getter will always jump to native code (but it's not super-likely we get a lot of benefit from jitting utilsFor? Hard to say...).

nsGlobalWindowOuter::CleanUp clears the pointer when navigated away, and if you bfcache the page back it would create a new one for the same window, right?

Flags: needinfo?(emilio)

nsGlobalWindowOuter::CleanUp clears the pointer when navigated away, and if you bfcache the page back it would create a new one for the same window, right?

That's what I was suspecting. Then the current behavior makes sense and it's going to be challenging to maintain such a cache.

I'll do a profile for my benchmark, although in real situations we are very unlikely to stress the utilsFor API (at least from DevTools).

Severity: -- → S3
Priority: -- → P3

Curious, do you have a profile of that micro-benchmark?

Profile for the regular getter: https://share.firefox.dev/3z3eaee
Profile for the cached getter: https://share.firefox.dev/4cOnu3p

This cache might keep destroyed windowUtils helpers in some scenarios involving navigations with a process change.
The regular getter is slightly slower, but the performance hit should not be noticeable.

Assignee: nobody → jdescottes
Status: NEW → ASSIGNED
Pushed by jdescottes@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bcb631924f5a [devtools] Remove windowUtils cache in layout/utils.js r=devtools-reviewers,nchevobbe

Hi Emilio, just a quick needinfo to share the profiles I recorded for my benchmarks:

(In reply to Julian Descottes [:jdescottes] from comment #5)

Curious, do you have a profile of that micro-benchmark?

Profile for the regular getter: https://share.firefox.dev/3z3eaee
Profile for the cached getter: https://share.firefox.dev/4cOnu3p

We can see that the cached version is fully jitted, while the getter is not. So I guess we don't need to file a bug for platform here?

Flags: needinfo?(emilio)

It might be worth to file a bug, I think that's more overhead than I'd expect to access an xray property... But yeah the comparison with jitted code is not so useful.

Flags: needinfo?(emilio)
Status: ASSIGNED → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → 132 Branch
QA Whiteboard: [qa-132b-p2]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: