Closed Bug 1396539 Opened 2 years ago Closed 2 years ago

DAMP incorrectly records inspector open timing

Categories

(DevTools :: General, enhancement)

enhancement
Not set

Tracking

(firefox57 fixed)

RESOLVED FIXED
Firefox 57
Tracking Status
firefox57 --- fixed

People

(Reporter: ochameau, Assigned: ochameau)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

For now, "OPEN" records are measuring the time it takes to resolve gDevTools.showToolbox for each tool.
Which mostly relates to each tool Panel.open method returned promise.
In the case of the inspector, its open method doesn't correctly resolve.
It resolved before it finishes loading things.
I saw two main pieces missing:
* Markup-view load: showNode is still processing *after* showToolbox resolves
* Inspector.setupToolbar is also still processing

It mostly relates to this code:
http://searchfox.org/mozilla-central/source/testing/talos/talos/tests/devtools/addon/content/damp.js#69-73
which is very asynchronous, and we don't wait for it to finish before proceeding.
Here is DAMP results, which reflect exactly what was wrong:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=07af3090520d8784564b4cc4ad8c1e2ee051ca55&newProject=try&newRevision=e46b579106904056fc3362b65d1fdc1bb474fc8d&originalSignature=edaec66500db21d37602c99daa61ac983f21a6ac&newSignature=edaec66500db21d37602c99daa61ac983f21a6ac&framework=1

All inspector.open probes increase while reload and also close (only for simple test) decrease.

This patch should:
 * remove some fuzziness in DAMP
   The markup view and inspector code was either not measured at all, or included into reload and close steps.
 * fix some random test failure
   Waiting for showToolbox, or 'inspector-selected' event wasn't enough to wait for inspector load.
   There could be some async step from markup-view that was still computing while firefox was shutting down firefox
   or another test started running.

Inspector.open may still not be completely perfect. Some other code somwhere may still be running *after* Inspector.open resolves, but these two things should already improve things quite a lot.
Comment on attachment 8904238 [details]
Bug 1396539 - Wait correctly for inspector load before resolving toolbox-ready.

https://reviewboard.mozilla.org/r/176016/#review181038

We've had a lenghtly discussion about this on IRC. Overall this looks good for tests and for DAMP. Indeed, those are 2 environments where we definitely want to make sure we wait for everything to be ready before continuing (in tests, we want to make sure server-side requests are done before ending, and in DAMP, we want to measure the *entire* init step).
For production code though, we need to make sure this does not delay user-facing things. Indeed, the inspector might be displayed and partly interactive even though some more requests are ongoing. These requests might be for non-vital features, such as saving the unique selector for the current node.

Your patch looks good to me, right now we don't seem to be waiting for more than we need.

As discussed, there's one user-facing scenario that might suffer a bit from this: when clicking on a DOM node in the console output to select it in the inspector (and if the inspector was not open yet). We will now have to wait for the rule-view to be done initializing before switching to the inspector. 
Is this a problem? I'm not sure. And if it's too slow, we should optimize the inspector, not remove this step. Indeed, the rule-view is the most important part of the inspector, so waiting for it to be ready seems logical.

::: devtools/client/inspector/inspector.js:287
(Diff revision 1)
> +      let onMarkupViewUpdated = this.once("inspector-updated");
> -          this.selection.setNodeFront(defaultSelection, "inspector-open");
> +      this.selection.setNodeFront(defaultSelection, "inspector-open");
> -          this.markup.expandNode(this.selection.nodeFront);
> +      await onMarkupViewUpdated;

As discussed, "inspector-updated" is in fact delayed by the init of the rule-view (and possibly other sidebar panels). So waiting for this event in fact also waits for the rule-view, not just the markup-view.

If the goal here is to make sure open only resolves when everything is ready, then we do need to wait for the rule-view. The markup-view, in a way, is "just" a navigation tool for the inspector, while the rule-view is where most of the work happens.
It can also take a long time to load when there are many rules, so better measure it too (like we do here).
Attachment #8904238 - Flags: review?(pbrosset) → review+
Assignee: nobody → poirot.alex
Yes, it may delay some action, but hopefully bug 1378418 is going to make everything faster.
If this additional delay reveal any issue I imagine we should come up with more than one loading state.

Also regarding the webconsole DOM links to inspector, one more effecient way would be to change the selected node first before opening the inspector. It would prevent selecting default node before selecting the console node.
Note for (talos) sheriffs, this patch is going to report a regression on DAMP and especially *.inspector.* tests.
But this was because previous timings were wrong and weren't covering all inspector load steps.
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/666cff43b2c3
Wait correctly for inspector load before resolving toolbox-ready. r=pbro
Backed out for failing devtools' devtools/client/animationinspector/test/browser_animation_pseudo_elements.js in Linux stylo builds:

https://hg.mozilla.org/integration/autoland/rev/1bf2e9fab3ac2a60f0f5e446e7fe29c71230dadd

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=666cff43b2c3eea70c7751752546cdf3740c9177&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=128473054&repo=autoland

TEST-UNEXPECTED-FAIL | devtools/client/animationinspector/test/browser_animation_pseudo_elements.js | Uncaught exception - at chrome://mochitests/content/browser/devtools/client/animationinspector/test/browser_animation_pseudo_elements.js:31 - TypeError: bodyContainer.elt.children[1].childNodes[index] is undefined
Flags: needinfo?(poirot.alex)
The issue here is that Servo introduces an exception here:
http://searchfox.org/mozilla-central/rev/f2a1911ad310bf8651f342d719e4f4ca0a7b9bfb/devtools/server/actors/styles.js#1578
  let {offset: textOffset, text} =
      getTextAtLineColumn(initialText, line, column);
  let lexer = DOMUtils.getCSSLexer(text);

  // Search forward for the opening brace.
  while (true) {
    let token = lexer.nextToken();
    if (!token) {
      throw new Error("couldn't find start of the rule"); // <== THROWS HERE

Here is the values of important variables here:
initialText=
  html, body {
    margin: 0;
    height: 100%;
    width: 100%;
    overflow: hidden;
    display: flex;
    justify-content: center;
    align-items: flex-end;
  }
  body {
    animation: color 2s linear infinite;
    background: #333;
  }
  @keyframes color {
    to {
      filter: hue-rotate(360deg);
    }
  }
  body::before,
  body::after {
    content: "";
    flex-grow: 1;
    height: 100%;
    animation: grow 1s linear infinite alternate;
  }
  body::before {
    background: hsl(120, 80%, 80%);
  }
  body::after {
    background: hsl(240, 80%, 80%);
    animation-delay: -.5s;
  }
  @keyframes grow {
    0% {height: 100%; animation-timing-function: ease-in-out;}
    10% {height: 80%; animation-timing-function: ease-in-out;}
    20% {height: 60%; animation-timing-function: ease-in-out;}
    30% {height: 70%; animation-timing-function: ease-in-out;}
    40% {height: 50%; animation-timing-function: ease-in-out;}
    50% {height: 30%; animation-timing-function: ease-in-out;}
    60% {height: 80%; animation-timing-function: ease-in-out;}
    70% {height: 90%; animation-timing-function: ease-in-out;}
    80% {height: 70%; animation-timing-function: ease-in-out;}
    90% {height: 60%; animation-timing-function: ease-in-out;}
    100% {height: 100%; animation-timing-function: ease-in-out;}
  }
line= 0
column= 0

This exception isn't related to this patch, it already happens on tip.
It ends up making PageStyleActor.getApplied request to fail from here:
http://searchfox.org/mozilla-central/rev/f2a1911ad310bf8651f342d719e4f4ca0a7b9bfb/devtools/client/inspector/rules/models/element-style.js#91
  populate: function () {
    let populated = this.pageStyle.getApplied(this.element, {
Itself make the following promise to be rejected:
http://searchfox.org/mozilla-central/rev/f2a1911ad310bf8651f342d719e4f4ca0a7b9bfb/devtools/client/inspector/rules/rules.js#1661-1662
      this.view.selectElement(this.inspector.selection.nodeFront)
        .then(done, done);

Now, having this rejecting is not an issue as we pass `done` as resolve and reject callbacks.
But the `await onAllPanelsUpdated;` changes the timings of actions and ends up breaking pseudo element display.

At the end I believe there is nothing wrong here, but this undercover a stylo issue.


jryans: any idea what could be wrong on Stylo side? Is this a known bug?
Flags: needinfo?(poirot.alex) → needinfo?(jryans)
Following change fixes stylo error, but I would like to know more about what is going on on Stylo:
https://reviewboard.mozilla.org/r/176016/diff/2-3/
(In reply to Alexandre Poirot [:ochameau] from comment #9)
> jryans: any idea what could be wrong on Stylo side? Is this a known bug?

Yes, it's a known issue: keyframe rules are missing line / column info with Stylo (bug 1394994).  A fix should be landing today.

What's the behavior with your try / catch in place?  Does it mean the rule view works again, but for now the keyframes are just missing?  (If you want to land the try / catch, I just want to make sure it won't hide other problems in the future.)
Flags: needinfo?(jryans)
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #12)
> (In reply to Alexandre Poirot [:ochameau] from comment #9)
> > jryans: any idea what could be wrong on Stylo side? Is this a known bug?
> 
> Yes, it's a known issue: keyframe rules are missing line / column info with
> Stylo (bug 1394994).  A fix should be landing today.

I can wait for that to land and not introduce this try/catch.

> What's the behavior with your try / catch in place?  Does it mean the rule
> view works again, but for now the keyframes are just missing?  (If you want
> to land the try / catch, I just want to make sure it won't hide other
> problems in the future.)

But, it looks like the try/catch fixes more things than it hides.
If I only add the try/catch (i.e. ignore all the rest of this patch) against mozilla-central without bug 1394994.
If I open doc_pseudo_elements.html from the failing test and then I select ::before pseudo element:
* with the try/catch, the rule view contains a bunch of rules, and it seems that it contains everything, keyframes and other rules.
* without, on stylo, the rule view is empty.

We see that on stylo all the rules of ::before do not contain line information, there is only "inline" being displayed.
(In reply to Alexandre Poirot [:ochameau] from comment #13)
> But, it looks like the try/catch fixes more things than it hides.
> If I only add the try/catch (i.e. ignore all the rest of this patch) against
> mozilla-central without bug 1394994.
> If I open doc_pseudo_elements.html from the failing test and then I select
> ::before pseudo element:
> * with the try/catch, the rule view contains a bunch of rules, and it seems
> that it contains everything, keyframes and other rules.
> * without, on stylo, the rule view is empty.
> 
> We see that on stylo all the rules of ::before do not contain line
> information, there is only "inline" being displayed.

Okay, that's about what I expected between the two...  This particular bug seems specific to bringing up Stylo (rules are present but bad line / column info), but I suppose there could be other kinds of failures in the get rule text path.  One "nice" thing about the current failure state is that it's more visually obvious that something is wrong when the whole rule view just fails, so might lead more quickly to bug reports.

So, I don't have really have a strong feeling for or against landing the try / catch.
I don't have strong call either. I'll check to see if bug 1394994 fixes everything.
Ok, so I looked at this again and finally understood why the two pseudo elements weren't displayed.
There is various things being weak and completely independant!

1/ the first most obvious issue, really related to bug 1394994 is PageStyleActor.getApplied request that fails because of the exception due to Stylo having an issue with keyframes:
  http://searchfox.org/mozilla-central/source/devtools/server/actors/styles.js#454
    for (let rule of result.rules) {
      // See the comment in |form| to understand this.
      yield rule.getAuthoredCssText(); // <== this throws and make the whole request fail
    }
Now, speaking about recovering from errors, I think this is weak.
Any time we have *one* exception for *one* rule, we fail fetching *all* rules. The rule view is completely empty!
I don't know if we have common guideline on how to handle errors and make best effort to not be completely broken?
That may be "nice to have" to recover better from random exception, but this isn't a big deal as bug 1394994 is most likely going to fix the exception. It isn't as easy as just putting a try/catch as it would easily hide issue. If we want to do something here we would need a smarter error catching which would make things work in production but still introduce failures when running tests on automation.

2/ PageStyleActor.getApplied is only used in the Rule View (ElementStyle), but the two ::before and ::after pseudo elements aren't displayed with this patch in the Markup View!?
This is related to this change in my patch:
      let onAllPanelsUpdated = this.once("inspector-updated");
      this.selection.setNodeFront(defaultSelection, "inspector-open");
      await onAllPanelsUpdated;
      await this.markup.expandNode(this.selection.nodeFront);
We end up calling expandNode *after* the rule view is initialized. And it changes the very precise time when we call markup.expandNode. The test fail not because of the rule view being empty (otherwise it would have been disabled on stylo already). But it fails because the markup view do not display the pseudo elements.

I tracked this down to MarkupView.updateChildren and in particular this two chunks:
http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/markup.js#1648-1652
        // If children are dirty, we got a change notification for this node
        // while the request was in progress, we need to do it again.
        if (container.childrenDirty) { // <== with and without my patch, childrenDirty is true
          return this._updateChildren(container, {expand: centered});
        }

http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/markup.js#1626-1631
    // If we're not expanded (or asked to update anyway), we're done for
    // now.  Note that this will leave the childrenDirty flag set, so when
    // expanded we'll refresh the child list.
    if (!(container.expanded || expand)) {
      return promise.resolve(container); // <=== we get here, we don't update children, so we miss the pseudo elements
    }

Without this patch, this code runs sooner, via another codepath.
`expand` as well as `centered` end up being false, so the second chunk doesn't prevent displaying the childrens.
But with this patch, we slightly move code order and `expand` is true.
That's because this is now being called from:
  await this.markup.expandNode(this.selection.nodeFront);
from my patch, in inspector.js::deferredOpen

And centered is still false.

I see two things probably wrong in the markup view:
* Shouldn't _checkSelectionVisible return true if `container == selection.nodeFront`?
http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/markup.js#1527-1546
because that the case in updateChildren function. container is the currently selected node, shouldn't we force expanding it?
So do this:
  _checkSelectionVisible: function (container) {
    let centered = null;
    let node = this.inspector.selection.nodeFront;
+   if (node == container.node) return node;
    while (node) {
      if (node.parentNode() === container.node) {
        centered = node;
        break;
      }
      node = node.parentNode();
    }

    return centered;
  },

* Shouldn't updateChildren respect the given `options.expand`?
        if (container.childrenDirty) {
-          return this._updateChildren(container, {expand: centered});
+          return this._updateChildren(container, {expand: centered || expand});
        }

This change to updateChildren fixes this test, without having to depend on bug 1394994, which at the end if quite unrelated to this failure...

If you think these two fixes to markup view are valid, or at least the second one, I'll proceed with just that and not add the try/catch in styles.js actor.
Flags: needinfo?(pbrosset)
(In reply to Alexandre Poirot [:ochameau] from comment #16)
> * Shouldn't _checkSelectionVisible return true if `container == selection.nodeFront` ?

While it looks like what this function should do, given its name, if I do that, I end up with this exception:
Node cannot be inserted at the specified point in the hierarchy
http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/markup.js#1674
        container.children.appendChild(fragment);
In updateChildren method.
(In reply to Alexandre Poirot [:ochameau] from comment #16)
> 1/ the first most obvious issue, really related to bug 1394994 is
> PageStyleActor.getApplied request that fails because of the exception due to
> Stylo having an issue with keyframes:
>  
> http://searchfox.org/mozilla-central/source/devtools/server/actors/styles.
> js#454
>     for (let rule of result.rules) {
>       // See the comment in |form| to understand this.
>       yield rule.getAuthoredCssText(); // <== this throws and make the whole
> request fail
>     }
> Now, speaking about recovering from errors, I think this is weak.
> Any time we have *one* exception for *one* rule, we fail fetching *all*
> rules. The rule view is completely empty!
> I don't know if we have common guideline on how to handle errors and make
> best effort to not be completely broken?
I think we don't have guidelines. But I think if we did they would be different from case to case.
In the case of the rule-view, I don't see the value of showing some rules and not others.
Ok maybe I do in one case: the keyframe rules. Those could potentially be missing while the other (non-keyframe) rules are shown.
But other than that, showing some rules and not others is more confusing that an entirely blank screen. Rules are ordered by specificity, they override each other, etc. so if just one was missing, it would be really weird to understand how the currently shown CSS applies, and we wouldn't know that one rule was missing.
I'd rather have a blank screen (with a sad face, or big JS exception in the browser console like today).
Again, for keyframe rules, maybe these could be missing, but that's really a nice to have.
> That may be "nice to have" to recover better from random exception, but this
> isn't a big deal as bug 1394994 is most likely going to fix the exception.
Good! I see the bug is resolved now.
> It isn't as easy as just putting a try/catch as it would easily hide issue.
> If we want to do something here we would need a smarter error catching which
> would make things work in production but still introduce failures when
> running tests on automation.
Right, so let's leave the try/catch out for now.

> 2/ PageStyleActor.getApplied is only used in the Rule View (ElementStyle),
> but the two ::before and ::after pseudo elements aren't displayed with this
> patch in the Markup View!?
getApplied is also used in the box-model view in fact.
> This is related to this change in my patch:
>       let onAllPanelsUpdated = this.once("inspector-updated");
>       this.selection.setNodeFront(defaultSelection, "inspector-open");
>       await onAllPanelsUpdated;
>       await this.markup.expandNode(this.selection.nodeFront);
> We end up calling expandNode *after* the rule view is initialized. And it
> changes the very precise time when we call markup.expandNode. The test fail
> not because of the rule view being empty (otherwise it would have been
> disabled on stylo already). But it fails because the markup view do not
> display the pseudo elements.
Right, the call to getBodyChildNodeFront in browser_animation_pseudo_elements.js is the one that fails. This indeed walks though the markup-view to get a particular node from the tree, and then get the NodeFront from it.
> I tracked this down to MarkupView.updateChildren and in particular this two
> chunks:
> http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/
> markup.js#1648-1652
>         // If children are dirty, we got a change notification for this node
>         // while the request was in progress, we need to do it again.
>         if (container.childrenDirty) { // <== with and without my patch,
> childrenDirty is true
>           return this._updateChildren(container, {expand: centered});
>         }
> 
> http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/
> markup.js#1626-1631
>     // If we're not expanded (or asked to update anyway), we're done for
>     // now.  Note that this will leave the childrenDirty flag set, so when
>     // expanded we'll refresh the child list.
>     if (!(container.expanded || expand)) {
>       return promise.resolve(container); // <=== we get here, we don't
> update children, so we miss the pseudo elements
>     }
> 
> Without this patch, this code runs sooner, via another codepath.
> `expand` as well as `centered` end up being false, so the second chunk
> doesn't prevent displaying the childrens.
> But with this patch, we slightly move code order and `expand` is true.
> That's because this is now being called from:
>   await this.markup.expandNode(this.selection.nodeFront);
> from my patch, in inspector.js::deferredOpen
> 
> And centered is still false.
This code is really hairy and has not been really changed since it was first implemented a long time ago (changes to the function you mention below date back to June 2013!)
I'm sad to say that I don't think anyone really ever had to dig deep enough to really have a good hold on it. So this is sort of uncharted territory here. Every time we've had to make changes, we relied on automated and manual tests to make sure things worked rather than really making sense of all the ways this code can run.
I know that this and the rule-view are 2 of the hardest code to follow when it comes to init/updates, because async stuff happens all over the place.

> I see two things probably wrong in the markup view:
> * Shouldn't _checkSelectionVisible return true if `container ==
> selection.nodeFront`?
> http://searchfox.org/mozilla-central/source/devtools/client/inspector/markup/
> markup.js#1527-1546
> because that the case in updateChildren function. container is the currently
> selected node, shouldn't we force expanding it?
> So do this:
>   _checkSelectionVisible: function (container) {
>     let centered = null;
>     let node = this.inspector.selection.nodeFront;
> +   if (node == container.node) return node;
>     while (node) {
>       if (node.parentNode() === container.node) {
>         centered = node;
>         break;
>       }
>       node = node.parentNode();
>     }
> 
>     return centered;
>   },
So, the centered thing is here for when there are many children in a node. In this case, the markup-view does not display them all, but only a sub-list with a button to allow showing all.
But to do this right, it needs to know whether it shows the top of the list and then cuts off, or the bottom of the list, hiding the first ones, or only a section in the middle.
That depends on whether a node is selected.
Imagine you select a node inside a parent, and then this parent gets thousands of nodes added to it all over the place. 
The markup-view is going to check which node is selected, so that this one (at least) stays visible, but the new added nodes (before and after) get partly hidden because there are just too many.
That's what the _checkSelectionVisible does if I remember correctly.
That's why it walks the DOM up from the current selection until the given container.

Having said that, returning node if container is the currently selection sounds ok to me. I don't think this should cause any sort of trouble. This is a situation that can happen, and your solution in that case makes sense to me. Why not just this though:

  _checkSelectionVisible: function (container) {
    let centered = null;
    let node = this.inspector.selection.nodeFront;
    while (node) {
      if (node === container.node) {
        centered = node;
        break;
      }
      node = node.parentNode();
    }

    return centered;
  },

> * Shouldn't updateChildren respect the given `options.expand`?
>         if (container.childrenDirty) {
> -          return this._updateChildren(container, {expand: centered});
> +          return this._updateChildren(container, {expand: centered ||
> expand});
>         }
Yes, this looks right to me.
Flags: needinfo?(pbrosset)
Green try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=791b58a754c3023aa6a8ab490cd3d2f496b18bac

Note for (talos) sheriffs, this patch is going to report a regression on DAMP and especially *.inspector.* tests.
But this was because previous timings were wrong and weren't covering all inspector load steps.
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/14c9d45b6175
Wait correctly for inspector load before resolving toolbox-ready. r=pbro
Blocks: damp
https://hg.mozilla.org/mozilla-central/rev/14c9d45b6175
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 57
See Also: → 1399779
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.