Merge reflow and flicker tests in browser/base/content/test/performance

RESOLVED FIXED in Firefox 61

Status

()

enhancement
P1
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: florian, Assigned: florian)

Tracking

Trunk
Firefox 61
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox60 wontfix, firefox61 fixed)

Details

Attachments

(7 attachments, 2 obsolete attachments)

Assignee

Description

a year ago
The browser/base/content/test/performance folder currently contains tests that are named like this:
browser_<userinteraction>_<testtype>.js where the user interaction can be something like "windowopen" "tab_open" and test type can be something like "reflow", "flicker", "images".
This makes it painful to add a new kind of perf test and cover all the existing interactions we care about, and also makes it painful to add a new interaction to cover, as that requires creating new test files for each relevant kind of test, which involves a lot of copy/pasting despite the already shared helper code in head.js

I'm proposing that instead we make test files only contain the code triggering the interactions and the whitelists, and move everything else to head.js.
Assignee

Comment 1

a year ago
This dirtyFrameFn function passed as a parameter gets in the way when trying to make withReflowObserver more generic, and I would like to make the reflow observer code await on a promise that could also be provided to other test code, rather than directly await on the async function that triggers the interactions.
Attachment #8956529 - Flags: review?(mconley)
Assignee

Comment 2

a year ago
Only requesting feedback for now because this may still need to change a bit when working on adapting more tests, but this seems otherwise ready for review.

The main ideas here are that:
- we want individual tests to call a generic helper function (withPerfObserver) rather than a function about reflows
- for each kind of test, the data collection code needs to be separate from the data analysis code. This is so that we can setup all listeners, then run the interaction, then remove all listeners, and finally produce the test's output. This is also needed if we want startup tests to benefit from this refactoring, even though they have to use startupRecorder for their data collection.
Attachment #8956531 - Flags: feedback?(mconley)
Assignee

Comment 3

a year ago
This patch:
- moves all the flicker test logic that wasn't shared yet to head.js
- applies this to the windowopen tests.

If this gets f+, I'll try to apply similar changes to all the other tests we have in the folder.
Attachment #8956532 - Flags: feedback?(mconley)
Comment on attachment 8956529 [details] [diff] [review]
part1 - Stop passing dirtyFrameFn as a parameter

Seems fine, thanks.
Attachment #8956529 - Flags: review?(mconley) → review+
Comment on attachment 8956531 [details] [diff] [review]
part 2 - introduce withPerfObserver and split reflow code

Review of attachment 8956531 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, this is a great design. Thanks!
Attachment #8956531 - Flags: feedback?(mconley) → feedback+
Comment on attachment 8956532 [details] [diff] [review]
part 3 - Merge browser_windowopen_{reflow,flicker}.js tests

Review of attachment 8956532 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, I like the general idea here. Thanks!

::: browser/base/content/test/performance/browser_windowopen.js
@@ +66,5 @@
> +                              "chrome,all,dialog=no,remote,suppressanimation",
> +                              "about:home");
> +
> +  let ignoreTinyPaint = true;
> +  let expectations = {

Nit: I feel like we need to try to make the expectations super readable. Is it super necessary to have the expectation object have these function properties, or can we stabilize it to some static set of expectations like we do for reflows?
Attachment #8956532 - Flags: feedback?(mconley) → feedback+
Priority: -- → P1
Assignee

Comment 7

a year ago
I'm narrowing the focus of this bug so that the changes have a chance to land soon. In this bug I'll cover only merging reflow and flicker tests, and will exclude the startup tests. Other pieces of this reorganization can be handled in follow-up bugs.
Summary: Reorganize browser/base/content/test/performance tests → Merge reflow and flicker tests in browser/base/content/test/performance
Assignee

Comment 8

a year ago
I think the changes since attachment 8956531 [details] [diff] [review] that got f+ are minimal.
Attachment #8959610 - Flags: review?(mconley)
Assignee

Updated

a year ago
Attachment #8956531 - Attachment is obsolete: true
Assignee

Updated

a year ago
Attachment #8956532 - Attachment is obsolete: true
Assignee

Comment 10

a year ago
For each test that now includes flicker coverage, this list the areas that are known to be modified between frames. They are split into a list of expected changed areas (rects filtered out by the filter method) and a list of exceptions for known bugs.
Attachment #8959613 - Flags: review?(mconley)
Assignee

Comment 11

a year ago
The existing flicker tests were capturing all frames and comparing them at the end. For long running tests, this may cause OOM failures. This patch changes the behavior to do a trivial comparison with the previous captured frame before storing a newly captured frame.
Attachment #8959614 - Flags: review?(mconley)
Assignee

Comment 12

a year ago
The current reflow tests dirty the frame tree after each event received in the window. Dirtying on MozAfterPaint forces each refresh driver tick to trigger a paint. Given that we capture a screenshot after each MozAfterPaint event, this behavior caused us to run out of memory quickly for long running tests. This was also very slow (even without flicker test).
Attachment #8959619 - Flags: review?(mconley)
Assignee

Comment 13

a year ago
This removes the _reflows suffix from the names of tests that now cover both reflows and flicker.
Attachment #8959620 - Flags: review?(mconley)
Comment on attachment 8959610 [details] [diff] [review]
part 2 - Introduce withPerfObserver and split reflows code, v2

Review of attachment 8959610 [details] [diff] [review]:
-----------------------------------------------------------------

Glad to see this stuff coming in! Thanks, florian.

::: browser/base/content/test/performance/browser_appmenu_reflows.js
@@ +54,5 @@
>      let popupShown =
>        BrowserTestUtils.waitForEvent(PanelUI.panel, "popupshown");
>      await PanelUI.show();
>      await popupShown;
> +  }, {expectedReflows: EXPECTED_APPMENU_OPEN_REFLOWS});

Nit: I _think_ the ESLint style we're starting to converge on prefers to have one-liner objects like this formatted like:

{ prop: value }

with spaces on either side. I'll leave it up to you if you want to do that now for this patch, or save it for the future once ESLint gets turned on in this directory.

::: browser/base/content/test/performance/browser_tabopen_reflows.js
@@ +34,5 @@
>      let switchDone = BrowserTestUtils.waitForEvent(window, "TabSwitchDone");
>      BrowserOpenTab();
>      await BrowserTestUtils.waitForEvent(gBrowser.selectedTab, "transitionend",
>          false, e => e.propertyName === "max-width");
> +      await switchDone;

Busted indentation

::: browser/base/content/test/performance/browser_tabstrip_overflow_underflow_reflows.js
@@ +47,5 @@
>      await switchDone;
>      await BrowserTestUtils.waitForCondition(() => {
>        return gBrowser.tabContainer.arrowScrollbox.hasAttribute("scrolledtoend");
>      });
> +  }, {expectedReflows: EXPECTED_OVERFLOW_REFLOWS}, window);

I guess we don't need to pass window here?

@@ +61,5 @@
>      await switchDone;
>      await BrowserTestUtils.waitForCondition(() => {
>        return gBrowser.tabContainer.arrowScrollbox.hasAttribute("scrolledtoend");
>      });
> +  }, {expectedReflows: []}, window);

or here?

@@ +68,4 @@
>      let switchDone = BrowserTestUtils.waitForEvent(window, "TabSwitchDone");
>      await BrowserTestUtils.removeTab(gBrowser.selectedTab, { animate: true });
>      await switchDone;
> +  }, {expectedReflows: []}, window);

or here?

@@ +87,5 @@
>      await BrowserTestUtils.switchTab(gBrowser, firstTab);
>      await BrowserTestUtils.waitForCondition(() => {
>        return gBrowser.tabContainer.arrowScrollbox.hasAttribute("scrolledtostart");
>      });
> +  }, {expectedReflows: []}, window);

or here...

@@ +112,5 @@
>        let switchDone = BrowserTestUtils.waitForEvent(window, "TabSwitchDone");
>        await BrowserTestUtils.removeTab(lastTab, { animate: true });
>        await switchDone;
>        await BrowserTestUtils.waitForCondition(() => !lastTab.isConnected);
> +    }, {expectedReflows: EXPECTED_UNDERFLOW_REFLOWS}, window);

or here. :)

::: browser/base/content/test/performance/head.js
@@ +436,5 @@
>  
>    info(canvas.toDataURL());
>  }
> +
> +async function withPerfObserver(testFn, exceptions = {}, win = window) {

I feel like this function, since it's likely to be the main entry point for a lot of these tests, deserves some documentation describing how to interact with it, and what it expects as input.
Attachment #8959610 - Flags: review?(mconley) → review+
Comment on attachment 8959612 [details] [diff] [review]
part 3 - Merge browser_windowopen_{reflow,flicker}.js tests

Review of attachment 8959612 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/test/performance/browser_windowopen.js
@@ +54,1 @@
>  add_task(async function() {

It's not visible in this diff viewer, but I think there's a comment block above this add_task saying what this test does. It needs to be updated now.

@@ +63,5 @@
> +                              "chrome,all,dialog=no,remote,suppressanimation",
> +                              "about:home");
> +
> +  let ignoreTinyPaint = true;
> +  let inRange = (val, min, max) => min <= val && val <= max;

Seems a bit odd to define this out here when it's only being used by the one exception down below. Do you expect to use it again? If not, maybe we should just define it inside the one exception...

@@ +70,5 @@
> +    frames: {
> +      filter(rects, frame, previousFrame) {
> +        if (ignoreTinyPaint &&
> +            previousFrame.width == 1 && previousFrame.height == 1) {
> +          todo(false, "shouldn't initially paint a 1x1px window");

Associated bug #?

::: browser/base/content/test/performance/head.js
@@ +478,5 @@
>  
>    info(canvas.toDataURL());
>  }
>  
> +function reportUnexpectedFlicker(frames, exceptions) {

Would be nice to document this function while you're here.

@@ +495,5 @@
> +
> +    for (let rect of rects) {
> +      rects = rects.filter(rect => {
> +        let rectText = `${rect.toSource()}, window width: ${frame.width}`;
> +        for (let e of (exceptions.exceptions || [])) {

exceptions.exceptions seems a little odd.
Attachment #8959612 - Flags: review?(mconley) → review+
Comment on attachment 8959613 [details] [diff] [review]
part 4 - Set the expectations for new flicker tests

Review of attachment 8959613 [details] [diff] [review]:
-----------------------------------------------------------------

r-ing for now until I hear more about why we're skipping the reflow in the URL bar now. I just want to hear the reasoning so we can put it into this bug for reference.

::: browser/base/content/test/performance/browser_tabclose_grow_reflows.js
@@ +60,5 @@
> +          // So we accept up to the width of n-1 tabs.
> +          r.w <= (gBrowser.tabs.length - 1) * Math.ceil(tabStripRect.width / gBrowser.tabs.length)
> +        ))
> +      }
> +     });

The indentation here looks a little off, on this final });...

::: browser/base/content/test/performance/browser_tabclose_reflows.js
@@ +54,5 @@
> +           r.x2 <= newTabButtonRect.right) ||
> +          // The '+' icon moves with an animation. At the end of the animation
> +          // the former and new positions can touch each other causing the rect
> +          // to have twice the icon's width.
> +          (r.h == 14 && r.w <= 2 * 14 + kMaxEmptyPixels) ||

Some of these hard-coded values seem really brittle. Can we not dynamically compute them to avoid breakage if we decide to make our icons a little smaller someday?

::: browser/base/content/test/performance/browser_tabopen_reflows.js
@@ +26,5 @@
>    await ensureNoPreloadedBrowser();
>  
> +  // Prepare the window to avoid flicker and reflow that's unrelated to our
> +  // tab opening operation.
> +  await ensureFocusedUrlbar();

Is it unrelated? A lot of the times we end up focusing the URL bar when opening a new tab...

@@ +50,5 @@
> +          r.x1 >= tabStripRect.left && r.x2 <= tabStripRect.right && (
> +          // The first tab should get deselected at the same time as the next
> +          // tab starts appearing, so we should have one rect that includes the
> +          // first tab but is wider.
> +          (inRange(r.w, firstTabRect.width, firstTabRect.width * 2) &&

I worry a little bit about the readability and the seeming brittleness of these filters. Lots of magic numbers, and it's all quite dense. Do you think this will be difficult for future developers to reason about or hack on months or years in the future?

::: browser/base/content/test/performance/browser_tabopen_squeeze_reflows.js
@@ +30,5 @@
>    const TAB_COUNT_FOR_SQUEEZE = computeMaxTabCount() - 1;
>  
>    await createTabs(TAB_COUNT_FOR_SQUEEZE);
>  
> +  await ensureFocusedUrlbar();

Same concern as before - I think we're skipping a legitimate reflow here - focusing the URL bar happens all of the time when opening tabs.

@@ +57,5 @@
> +        )),
> +        exceptions: [
> +          {name: "the urlbar placeolder moves up and down by a few pixels",
> +           condition: r =>
> +             r.x1 >= textBoxRect.left && r.x2 <= textBoxRect.right &&

A lot of these filters and exceptions seem pretty common - especially the tabstrip ones. I wonder if it makes sense to put them into head.js somehow?

::: browser/base/content/test/performance/browser_tabstrip_overflow_underflow_reflows.js
@@ +34,5 @@
>    const TAB_COUNT_FOR_OVERFLOW = computeMaxTabCount();
>  
>    await createTabs(TAB_COUNT_FOR_OVERFLOW);
>  
> +  await ensureFocusedUrlbar();

Same concern as elsewhere.

::: browser/base/content/test/performance/browser_windowclose_reflows.js
@@ +52,5 @@
> +  }, {expectedReflows: EXPECTED_REFLOWS, frames: {
> +    filter: rects => rects.filter(r => !(
> +      // The dropmarker is visible when the window opens and sometimes hasn't
> +      // finished its transition to opacity: 0 by the time waitForFocus resolves.
> +      (r.x1 >= dropmarkerRect.left - 1 && r.x2 <= dropmarkerRect.right + 1 &&

Seems like we might want a "is rect inside this other rect" general function to simplify some of this stuff.

::: browser/base/content/test/performance/head.js
@@ +383,5 @@
>    };
>    win.addEventListener("MozAfterPaint", afterPaintListener);
>  
> +  // If the test is using an existing window, capture a frame immediately.
> +  if (win.document.readyState == "complete")

I think most one-liners are being braced in the rest of this file. Let's stay consistent if possible.
Attachment #8959613 - Flags: review?(mconley) → review-
Attachment #8959614 - Flags: review?(mconley) → review+
Comment on attachment 8959619 [details] [diff] [review]
part 6 - Avoid dirtying frames after MozAfterPaint events

Review of attachment 8959619 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/test/performance/head.js
@@ +63,5 @@
>                      .QueryInterface(Ci.nsIDocShell);
>    docShell.addWeakReflowObserver(observer);
>  
> +  let dirtyFrameFn = event => {
> +    if (event.type != "MozAfterPaint")

Please brace the one-liner.
Attachment #8959619 - Flags: review?(mconley) → review+
Attachment #8959620 - Flags: review?(mconley) → review+
Assignee

Comment 19

a year ago
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #17)

> r-ing for now until I hear more about why we're skipping the reflow in the
> URL bar now. I just want to hear the reasoning so we can put it into this
> bug for reference.

The reason is that if I add an exception to the flicker test that's of the size of the whole urlbar, then we lose coverage for any flicker that may happen there, and some could happen in eg. the page action icons.

Also, it seems the reflow in the .focus() call is due to platform code and there's nothing the front-end code can do about it. Moving the focus to the urlbar is a visible UX decision. The reflow tests will still catch reflow stacks if the focus moves back and forth. So IMO we are not losing any real reflow test coverage here, and moving the focus to the urlbar is an operation that isn't entirely related to interacting with tabs.
Assignee

Comment 20

a year ago
(In reply to Mike Conley (:mconley) from comment #16)

> @@ +63,5 @@
> > +                              "chrome,all,dialog=no,remote,suppressanimation",
> > +                              "about:home");
> > +
> > +  let ignoreTinyPaint = true;
> > +  let inRange = (val, min, max) => min <= val && val <= max;
> 
> Seems a bit odd to define this out here when it's only being used by the one
> exception down below. Do you expect to use it again? If not, maybe we should
> just define it inside the one exception...

It was part of the existing windowopen_flicker test. I was wondering if we should move "inRange" to head.js. I hesitated because I'm not super excited by exposing globally a function with such a short name... but it would simplify the exceptions in quite a few places.

> @@ +70,5 @@
> > +    frames: {
> > +      filter(rects, frame, previousFrame) {
> > +        if (ignoreTinyPaint &&
> > +            previousFrame.width == 1 && previousFrame.height == 1) {
> > +          todo(false, "shouldn't initially paint a 1x1px window");
> 
> Associated bug #?

Bug 1439875 is changing this, so it's no longer relevant.


(In reply to Mike Conley (:mconley) from comment #17)

> ::: browser/base/content/test/performance/browser_tabclose_reflows.js
> @@ +54,5 @@
> > +           r.x2 <= newTabButtonRect.right) ||
> > +          // The '+' icon moves with an animation. At the end of the animation
> > +          // the former and new positions can touch each other causing the rect
> > +          // to have twice the icon's width.
> > +          (r.h == 14 && r.w <= 2 * 14 + kMaxEmptyPixels) ||
> 
> Some of these hard-coded values seem really brittle. Can we not dynamically
> compute them to avoid breakage if we decide to make our icons a little
> smaller someday?

How would you reliably compute this? I can't find an obvious way. It's neither the size of the button, nor the size of the image file (which could include transparent pixels on the side).

If I really had to compute it, I would open an about:blank tab, then display the icon, and compare screenshots of the two. Seems overengineered.

> @@ +50,5 @@
> > +          r.x1 >= tabStripRect.left && r.x2 <= tabStripRect.right && (
> > +          // The first tab should get deselected at the same time as the next
> > +          // tab starts appearing, so we should have one rect that includes the
> > +          // first tab but is wider.
> > +          (inRange(r.w, firstTabRect.width, firstTabRect.width * 2) &&
> 
> I worry a little bit about the readability and the seeming brittleness of
> these filters. Lots of magic numbers, and it's all quite dense. Do you think
> this will be difficult for future developers to reason about or hack on
> months or years in the future?

It's probably going to be difficult to maintain as-is. I would expect this to keep evolving over the next couple months; like reflow tests did. It's difficult to think now about what would make these exception lists easy to express clearly. It'll become obvious once we know where the pain points are.

> @@ +57,5 @@
> > +        )),
> > +        exceptions: [
> > +          {name: "the urlbar placeolder moves up and down by a few pixels",
> > +           condition: r =>
> > +             r.x1 >= textBoxRect.left && r.x2 <= textBoxRect.right &&
> 
> A lot of these filters and exceptions seem pretty common - especially the
> tabstrip ones. I wonder if it makes sense to put them into head.js somehow?
> 
> ::: browser/base/content/test/performance/browser_windowclose_reflows.js
> @@ +52,5 @@
> > +  }, {expectedReflows: EXPECTED_REFLOWS, frames: {
> > +    filter: rects => rects.filter(r => !(
> > +      // The dropmarker is visible when the window opens and sometimes hasn't
> > +      // finished its transition to opacity: 0 by the time waitForFocus resolves.
> > +      (r.x1 >= dropmarkerRect.left - 1 && r.x2 <= dropmarkerRect.right + 1 &&
> 
> Seems like we might want a "is rect inside this other rect" general function
> to simplify some of this stuff.

So in addition to moving 'inRange' to head.js, I would consider an 'inRect' helper that would check if a changed rect is within the DOMRect obtained through getClientRect on a DOM node. We could also have an inTabstrip helper.
Assignee

Comment 21

a year ago
Comment on attachment 8959613 [details] [diff] [review]
part 4 - Set the expectations for new flicker tests

Here is an updated try push that includes all your comments that were easy to address (indent, coding style and adding documenting comments) https://treeherder.mozilla.org/#/jobs?repo=try&revision=4010433d4b7d0bf3e82654a9ec43d522df215734

I think we agreed on IRC that the larger changes that would make the exception lists more maintainable can be done in follow-ups, and I explained in comment 19 why I'm focusing the urlbar.
Attachment #8959613 - Flags: review- → review?(mconley)
Attachment #8959613 - Flags: review?(mconley) → review+

Comment 23

a year ago
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2504c7e392f5
Stop providing the dirtyFrameFn function as a parameter to test functions, r=mconley.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e74846136e01
Introduce a generic withPerfObserver function and separate the data collection from the reflow analysis code for the reflow observers, r=mconley.
https://hg.mozilla.org/integration/mozilla-inbound/rev/9519b3e311da
Merge browser_windowopen_{reflow,flicker}.js tests into a single browser_windowopen.js test, r=mconley.
https://hg.mozilla.org/integration/mozilla-inbound/rev/7fb53d0e4edc
set the expectations for new flicker tests, r=mconley.
https://hg.mozilla.org/integration/mozilla-inbound/rev/1750054916de
only store captured frames when they are actually different, r=mconley.
https://hg.mozilla.org/integration/mozilla-inbound/rev/34f6b3c3f070
avoid dirtying frames after MozAfterPaint events, to ensure we don't repaint all the time and capture 60 screenshots per second when nothing changes on screen, r=mconley.
https://hg.mozilla.org/integration/mozilla-inbound/rev/bea32e65f18e
remove the _reflows suffix from tests that now cover both reflows and flicker, r=mconley.

Comment 24

a year ago
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b0f1f640001c
remove the _reflows suffix in a few more places to fix bustage, rs=bustage-fix.
Assignee

Updated

a year ago
Depends on: 1448241
Assignee

Updated

a year ago
Depends on: 1448245
You need to log in before you can comment on or make changes to this bug.