Closed Bug 1623850 Opened 4 years ago Closed 3 years ago

3.09 - 20.33% raptor-tp6 (linux64-shippable, linux64-shippable-qr, macosx1014-64-shippable) regression on push 0bd7b6fc23db7c5a9536dce865e742e7d6f7f7e8 (Sat March 14 2020)

Categories

(Core :: DOM: Security, defect, P2)

defect

Tracking


RESOLVED WONTFIX
Tracking Status
firefox-esr68 --- unaffected
firefox74 --- unaffected
firefox75 --- unaffected
firefox76 - disabled

People

(Reporter: Bebe, Assigned: n.goeggi)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression, Whiteboard: [domsecurity-active])

Attachments

(1 obsolete file)

Raptor has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=0bd7b6fc23db7c5a9536dce865e742e7d6f7f7e8

As the author of one of the patches included in that push, your help is needed to address this regression.

Regressions:

20% raptor-tp6-binast-instagram-firefox fcp macosx1014-64-shippable opt 243.33 -> 292.79
6% raptor-tp6-linkedin-firefox-cold loadtime linux64-shippable opt 2,437.12 -> 2,573.92
4% raptor-tp6-linkedin-firefox-cold loadtime linux64-shippable-qr opt 2,600.42 -> 2,713.33
3% raptor-tp6-yandex-firefox-cold loadtime linux64-shippable opt 1,690.54 -> 1,747.00
3% raptor-tp6-yandex-firefox-cold loadtime linux64-shippable-qr opt 1,738.17 -> 1,791.92
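
For reference, the percentages above are simply the relative change between the old and new values shown for each test. A minimal sketch of that calculation, using the first entry from the list (Perfherder's actual aggregation over replicates is not reproduced here):

    # Relative delta between the before/after values reported above.
    before, after = 243.33, 292.79  # raptor-tp6-binast-instagram-firefox fcp
    delta_pct = (after - before) / before * 100
    print(f"{delta_pct:.2f}% regression")  # ~20.33%, matching the headline number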

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=25413

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a Treeherder page showing the Raptor jobs in a pushlog format.

To learn more about the regressing test(s) or reproducing them, please see: https://wiki.mozilla.org/TestEngineering/Performance/Raptor

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/TestEngineering/Performance/Talos/RegressionBugsHandling

Component: Performance → JavaScript Engine: JIT
Flags: needinfo?(tcampbell)
Product: Testing → Core
Version: Version 3 → unspecified

The Spidermonkey change is in an extremely uncommon code path and is unlikely to be responsible.

I notice that the BinAST regression is just noise in the graph and bounces back and forth. The other loadtime graphs seem to track Bug 1508292 better.

Christoph, does any part of Bug 1508292 seem like it might impact general loadtime metrics?

(Otherwise this bug might just be WORKSFORME since those graphs are so unstable)

Component: JavaScript Engine: JIT → DOM: Security
Flags: needinfo?(tcampbell) → needinfo?(ckerschb)

(In reply to Ted Campbell [:tcampbell] from comment #1)

Christoph, does any part of Bug 1508292 seem like it might impact general loadtime metrics?

Most probably yes. Bug 1508292 adds Sec-Fetch-* request headers to every load. For now, Sec-Fetch-* headers are enabled in Nightly only and we are working on improving the overall performance of the algorithm within Bug 1623053. I don't know if we can mark this bug as a duplicate of Bug 1623053 or not; for now I'll just add the 'See Also'.
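
For context, the Fetch Metadata spec derives these headers from the relationship between the requesting context and the target URL, plus the request's mode and destination. A rough sketch of the Sec-Fetch-Site classification, in Python for illustration only (this is not the Gecko implementation, and the registrable-domain check is deliberately naive):

    from urllib.parse import urlparse

    def registrable_domain(host):
        # Simplification: real implementations consult the Public Suffix List.
        return ".".join(host.split(".")[-2:])

    def sec_fetch_site(request_origin, target_url):
        # "none": the request was not triggered by any site (URL bar, bookmark, etc.).
        if request_origin is None:
            return "none"
        req, tgt = urlparse(request_origin), urlparse(target_url)
        if (req.scheme, req.hostname, req.port) == (tgt.scheme, tgt.hostname, tgt.port):
            return "same-origin"
        if registrable_domain(req.hostname) == registrable_domain(tgt.hostname):
            return "same-site"
        return "cross-site"

    # Example: headers a cross-site image load might carry.
    headers = {
        "Sec-Fetch-Site": sec_fetch_site("https://example.org", "https://cdn.other-site.com/x.png"),
        "Sec-Fetch-Mode": "no-cors",
        "Sec-Fetch-Dest": "image",
    }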

Flags: needinfo?(ckerschb)
See Also: → 1623053

Florin, I know this bug highlights performance impacts on a different range of tests; however, it's closely related to Bug 1623053. Do you think we can mark it as a duplicate?

Further, just to make sure we are on the same page: we are thinking of WONTFIXing Bug 1623053 in the end (it's not decided yet, we are still investigating).

I know it's a tradeoff between shipping an additional security feature vs. performance; would the illustrated performance downgrade be acceptable if we want to support Sec-Fetch-* request headers, or would the perf hit have too much impact on our product?

Flags: needinfo?(fstrugariu)

Christoph, I was open to WONTFIX for Bug 1623053 because that regression was at the lower limit of the regression threshold. In the current regression the magnitudes are higher, which requires a fix.

(In reply to Alexandru Ionescu :alexandrui (needinfo me) from comment #4)

Christoph, I was open to WONTFIX for Bug 1623053 because that regression was at the lower limit of the regression threshold. In the current regression the magnitudes are higher, which requires a fix.

Fair enough - that's what I wanted to know.

Is the performance impact also a problem within Nightly? If so, we could disable it in Nightly as well (if really needed).

We can leave this open, as we are only targeting Nightly and there is no need to disable this flag.

:ckerschb, what version of Firefox are we targeting for release? Can you make sure we have the correct flag active?

Flags: needinfo?(fstrugariu)

Assigning to myself to take another look.

(In reply to Florin Strugariu [:Bebe] (needinfo me) from comment #6)

:ckerschb, what version of Firefox are we targeting for release? Can you make sure we have the correct flag active?

It's really undecided at this point when we are going to ship this, because there are some spec issues around Sec-Fetch-User which need to be addressed within Bug 1621987 first.

Assignee: nobody → ckerschb
Status: NEW → ASSIGNED
Priority: -- → P2
Whiteboard: [domsecurity-active]

Just to clarify, I believe the BinAST 20% number is just weird noise when I look at the graph. The other load numbers do look closer to real issues, though.

Sec-Fetch-* is Nightly-only for now.
If it's still Nightly-only when 76 hits Beta, we should set 76 as unaffected.

Alexandru, thanks for your info on the same/similar performance issue within Bug 1623053#c16.

I assume the same statement holds true for this bug as well, right? I would imagine so but wanted to make sure. Thank you!

Flags: needinfo?(aionescu)

(In reply to Christoph Kerschbaumer [:ckerschb] from comment #10)

Alexandru, thanks for your info on the same/similar performance issue within Bug 1623053#c16.

I assume the same statement holds true for this bug as well, right? I would imagine so but wanted to make sure. Thank you!

That bug had only one item, with about a 2% regression, and I see that it has the same culprit. This one has multiple items with high magnitudes, which we can't ignore, unfortunately.

Flags: needinfo?(aionescu)

(In reply to Alexandru Ionescu (needinfo me) :alexandrui from comment #11)

That bug had only one item, with about a 2% regression, and I see that it has the same culprit. This one has multiple items with high magnitudes, which we can't ignore, unfortunately.

Ok, thanks for the update. I'll investigate this problem again using the patches from Bug 1623053 as well. Thanks for now!

Severity: normal → S3

Florin, I did three try runs the other day to investigate whether my performance optimization patches for the Sec-Fetch feature are worth landing:
(a) Sec-Fetch Disabled
(b) Sec-Fetch Enabled
(c) Sec-Fetch Enabled with optimization patch applied

COMPARISON:
(a) Sec-Fetch Disabled vs. (c) Sec-Fetch Enabled with optimization patch applied

I am slightly confused about the results, because if I look at the performance comparison of (a) vs. (c) (see the comparison above) I don't see the 20% slowdown for raptor-tp6-binast-instagram-firefox mentioned in comment 0. If I select the raptor test suite I see e.g. a 72% improvement for raptor-tp6-amazon-firefox-cold with a confidence level of medium; however, it seems unlikely that Sec-Fetch headers would introduce a perf boost. On the other end of the spectrum I see raptor-tp6-youtube-firefox with a delta of 598%, which seems really bad to me but was not reported in comment 0.

Summing it up, I was wondering if you could help me interpret the data from those perf runs. I guess the biggest question I have is: which tests need to be under what threshold for us to consider enabling Sec-Fetch headers?
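
A back-of-the-envelope way to sanity-check such deltas against run-to-run noise (a hypothetical helper, not Perfherder's actual confidence calculation):

    from statistics import mean, stdev

    def compare(base_runs, new_runs):
        # Relative delta between two sets of replicate measurements, plus a
        # crude "is this just noise?" flag when the difference is small
        # compared to the spread of either set.
        base, new = mean(base_runs), mean(new_runs)
        delta_pct = (new - base) / base * 100
        noise = max(stdev(base_runs), stdev(new_runs))
        likely_noise = abs(new - base) < 2 * noise  # arbitrary cut-off for illustration
        return delta_pct, likely_noise

    # Made-up replicate values, only to show the shape of the check.
    print(compare([2440, 2450, 2430], [2570, 2580, 2560]))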

Flags: needinfo?(fstrugariu)

Also retriggered the tests so we can have better results

Thanks for all the info - I'll look into this again!

Flags: needinfo?(ckerschb)
Blocks: 1695911
Assignee: ckerschb → nobody
Status: ASSIGNED → NEW
Assignee: nobody → ngogge
Status: NEW → ASSIGNED

Hey Florin, I have some updated try runs for this:

  1. sec-fetch disabled
  2. sec-fetch enabled
  3. sec-fetch enabled with optimization patch applied

Comparison of 1. and 2., comparison of 2. and 3.

I can't seem to find the results for the tests that are causing the problems (e.g. raptor-tp6-binast-instagram-firefox).
Could you help me make some sense of these runs and numbers?

Flags: needinfo?(fstrugariu)

FYI, the raptor-tp6-binast-instagram-firefox test has been removed, as has its underlying BinAST parsing code.

(In reply to Ted Campbell [:tcampbell] from comment #19)

FYI, the raptor-tp6-binast-instagram-firefox test has been removed, as has its underlying BinAST parsing code.

Thanks for the info - what does that mean for the perf regression discussed in this bug? From comment 0 I take it that raptor-tp6-binast-instagram-firefox caused the biggest perf regression, of 20%. I am generally somewhat confused that adding an HTTP header could cause such a big performance penalty, but let's see if Florin can help us move this bug forward.

Also, all raptor tests were migrated to browsertime.
:davehunt, any suggestion on how we should go forward with this?

Flags: needinfo?(fstrugariu) → needinfo?(dave.hunt)

As a suggestion, we could just check the browsertime results and see if we get any performance improvements.

To clarify, the large raptor-tp6-binast-instagram-firefox regression was almost certainly an unlucky amplification of a smaller perf change. For example, certain page resources loading in a different order have made big differences in these page-load numbers in the past.

(In reply to Florin Strugariu [:Bebe] (needinfo me) from comment #21)

Also, all raptor tests were migrated to browsertime.
:davehunt, any suggestion on how we should go forward with this?

I agree with your comment 8; we can use the browsertime tests (we currently still have linkedin and yandex) to see if we still have a performance impact.

Flags: needinfo?(dave.hunt)

So, given the last few comments and the perf numbers in comment 18, can we mark this bug as INVALID or WONTFIX in the end?

Flags: needinfo?(fstrugariu)

Yep, I think we can drop this, as we don't have data to compare against.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(fstrugariu)
Resolution: --- → WONTFIX
Attachment #9150070 - Attachment is obsolete: true
Has Regression Range: --- → yes