Closed Bug 1568501 Opened 5 years ago Closed 5 years ago

talos regression on perf-reftest-singletons/bidi-resolution-1.html or (on ARM64 Windows) Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler" (from perf-reftest-singleton suite timing out)

Categories

(Core :: Layout: Text and Fonts, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla70
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox68 --- unaffected
firefox69 --- unaffected
firefox70 --- fixed

People

(Reporter: malexandru, Assigned: dbaron)

References

(Regression)

Details

(Keywords: perf, regression)

Attachments

(2 files)

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=258042228&repo=mozilla-central&lineNumber=960

05:32:37 INFO - PID 1414 | Cycle 1(14): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/nth-index-2.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html)
05:32:38 INFO - PID 1414 | Waiting for idle-callback
05:32:39 INFO - PID 1414 | Cycle 1(15): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/nth-index-2.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html)
05:33:10 INFO - PID 1414 | __WARNTimeout (1/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__WARN
05:33:24 INFO - PID 1414 | Cycle 1(0): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:33:24 INFO - PID 1414 | Waiting for idle-callback
05:34:07 INFO - PID 1414 | Cycle 1(0): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:34:38 INFO - PID 1414 | __WARNTimeout (2/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__WARN
05:34:54 INFO - PID 1414 | Cycle 1(1): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:34:55 INFO - PID 1414 | Waiting for idle-callback
05:35:40 INFO - PID 1414 | Cycle 1(1): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:36:11 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:36:11 INFO - PID 1414 | __FAILTimeout (3/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:36:11 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:36:27 INFO - PID 1414 | Cycle 1(3): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:36:27 INFO - PID 1414 | Waiting for idle-callback
05:37:12 INFO - PID 1414 | Cycle 1(3): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:37:42 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:37:42 INFO - PID 1414 | __FAILTimeout (4/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:37:42 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:37:59 INFO - PID 1414 | Cycle 1(5): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:38:29 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:38:29 INFO - PID 1414 | __FAILTimeout (5/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:38:29 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:38:45 INFO - PID 1414 | Cycle 1(6): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:38:45 INFO - PID 1414 | Waiting for idle-callback
05:39:15 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:39:15 INFO - PID 1414 | __FAILTimeout (6/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:39:15 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:39:32 INFO - PID 1414 | Cycle 1(7): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:39:32 INFO - PID 1414 | Waiting for idle-callback
05:40:03 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:40:03 INFO - PID 1414 | __FAILTimeout (7/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:40:03 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:40:17 INFO - PID 1414 | Cycle 1(8): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:40:17 INFO - PID 1414 | Waiting for idle-callback
05:40:48 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:40:48 INFO - PID 1414 | __FAILTimeout (8/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:40:48 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:41:05 INFO - PID 1414 | Cycle 1(9): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:41:35 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL
05:41:35 INFO - PID 1414 | __FAILTimeout (9/3) exceeded on http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html__FAIL
05:41:35 INFO - PID 1414 | console.error: "You should not call finishTest without having first initted the Profiler"
05:41:50 INFO - PID 1414 | Cycle 1(10): loaded http://127.0.0.1:50086/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://127.0.0.1:50086/tests/perf-reftest-singletons/id-getter-1.html)
05:42:21 INFO - PID 1414 | __FAILTimeout in perf-reftest-singletons__FAIL

Summary: console.error: "You should not call finishTest without having first initted the Profiler" → Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler"

This seems like some sort of a hang in perf-reftest-singletons/bidi-resolution-1.html. Seems like a plausible regression from this patch -- though could easily be related to some existing problem in the code.

Any idea how to get access to an ARM64 windows machine (preferably one with a development environment already set up) to debug this? Who owns ARM64 Windows and would know about this?

Component: Gecko Profiler → Layout: Text and Fonts
OS: Unspecified → Windows 10
Hardware: Unspecified → ARM64
Summary: Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler" → [aarch64 Windows] Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler", because perf-reftest-singletons/bidi-resolution-1.html is timing out

Then again, it looks like this test got a lot slower on other platforms:
https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1646989,1,1&series=mozilla-central,1649907,1,1&series=mozilla-central,1649913,1,1
so presumably this is just a misreporting of a performance regression.

So this bug should also cover https://treeherder.mozilla.org/perf.html#/alerts?id=22088 -- except I can't mark it as such since I'm not a sheriff.

(In reply to David Baron :dbaron: 🏴󠁵󠁳󠁣󠁡󠁿 ⌚UTC-7 from comment #2)

Any idea how to get access to an ARM64 windows machine (preferably one with a development environment already set up) to debug this? Who owns ARM64 Windows and would know about this?

Nathan, maybe? Though if it regressed on other platforms too chances are you can just investigate the regression on your regular machine, and hopefully the fix should be cross-platform again.

Yeah, it's really cross-platform; it's just that it's only orange on ARM64, so this bug got filed for that.

Summary: [aarch64 Windows] Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler", because perf-reftest-singletons/bidi-resolution-1.html is timing out → [aarch64 Windows] Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler" / talos regression (or suite timeout) on perf-reftest-singletons/bidi-resolution-1.html
Flags: needinfo?(dbaron)
Summary: [aarch64 Windows] Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler" / talos regression (or suite timeout) on perf-reftest-singletons/bidi-resolution-1.html → talos regression on perf-reftest-singletons/bidi-resolution-1.html or (on ARM64 Windows) Perma [tier2] console.error: "You should not call finishTest without having first initted the Profiler" (from perf-reftest-singleton suite timing out)
Assignee: nobody → dbaron
Status: NEW → ASSIGNED

This testcase deals with a single line that's very large. I think it's showing two hotspots where we're searching lines from the beginning rather than the end. I think one of them is this one which was something I thought about doing differently when working on bug 1566945; I don't see the other one based on looking at the profile so I'll have to drop into gdb to figure it out.

Priority: -- → P3

OK, the second problem is that we're nulling out mPrevFrame between every pair of calls to AdvanceToFrame. I'm actually not sure how much of that nulling out is necessary given the two-iterator setup, so I should look and see what can be eliminated.

Here's a try run with what I think should fix both problems.

The talos run on autoland just before my push had:

[#15] bidi-resolution-1.html  Cycles:15  Average:399.23  Median:399.74  stddev:3.41 (0.9%)  stddev-sans-first:3.24
Values: 394.3  399.7  395.1  396.5  400.6  395.5  399.3  401.3  401.4  398.1  400.2  405.4  404.3  395.1  401.6

A run of the same job on try with my fixes has:

[#15] bidi-resolution-1.html  Cycles:15  Average:389.25  Median:388.14  stddev:3.79 (1.0%)  stddev-sans-first:3.92
Values: 388.1  394.0  389.5  389.9  387.3  386.9  399.3  385.4  390.9  386.6  385.6  392.4  386.6  390.0  386.1

So maybe even a little faster after all the patches? I think that's plausible.

Keywords: perf, regression
OS: Windows 10 → All
Priority: P3 → P2
Hardware: ARM64 → All

(In reply to David Baron :dbaron: 🏴󠁵󠁳󠁣󠁡󠁿 ⌚UTC-7 from comment #2)

This seems like some sort of a hang in perf-reftest-singletons/bidi-resolution-1.html. Seems like a plausible regression from this patch -- though could easily be related to some existing problem in the code.

Any idea how to get access to an ARM64 windows machine (preferably one with a development environment already set up) to debug this? Who owns ARM64 Windows and would know about this?

The best way we have to get an ARM64 windows machine right now is "order one and expense it". You then have to set up a cross-compilation environment from your Windows desktop (which implies copying over packages, etc. to the ARM64 machine); we don't support compiling on the machines themselves. Visual Studio does install onto the ARM64 laptops, so once you've copied things over, you should be able to debug more-or-less like you would on any "normal" Windows machine.

But it sounds like the problem has been resolved in the interim anyway...?

Pushed by dbaron@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8b24c854efca
Invalidate FastLineIterator::mPrevFrame only when lines actually split. r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/be7e1bce40d8
Search from the end of the line rather than the start, like this code did before bug 1566945. r=dholbert

Here's one talos graph showing that the regression is fixed.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
Blocks: 1562138

Confirming the performance fix:

== Change summary for alert #22160 (as of Fri, 26 Jul 2019 13:38:38 GMT) ==

Improvements:

100% perf_reftest bidi-resolution-1.html linux64-shippable opt e10s stylo 15,558.26 -> 12.21
98% perf_reftest_singletons bidi-resolution-1.html linux64-shippable-qr opt e10s stylo 17,668.92 -> 432.96
98% perf_reftest_singletons bidi-resolution-1.html linux64-shippable opt e10s stylo 15,946.01 -> 392.86
43% perf_reftest linux64-shippable opt e10s stylo 2.74 -> 1.56
14% perf_reftest_singletons linux64-shippable-qr opt e10s stylo 56.08 -> 48.13
14% perf_reftest_singletons linux64-shippable opt e10s stylo 53.39 -> 45.83

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=22160

Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: