Closed Bug 1623850 Opened 4 years ago Closed 3 years ago

3.09 - 20.33% raptor-tp6 (linux64-shippable, linux64-shippable-qr, macosx1014-64-shippable) regression on push 0bd7b6fc23db7c5a9536dce865e742e7d6f7f7e8 (Sat March 14 2020)

Categories

(Core :: DOM: Security, defect, P2)

defect

Tracking


RESOLVED WONTFIX
Tracking Status
firefox-esr68 --- unaffected
firefox74 --- unaffected
firefox75 --- unaffected
firefox76 - disabled

People

(Reporter: Bebe, Assigned: n.goeggi)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression, Whiteboard: [domsecurity-active])

Attachments

(1 obsolete file)

Raptor has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=0bd7b6fc23db7c5a9536dce865e742e7d6f7f7e8

As the author of one of the patches included in that push, your help is needed to address this regression.

Regressions:

20% raptor-tp6-binast-instagram-firefox fcp macosx1014-64-shippable opt 243.33 -> 292.79
6% raptor-tp6-linkedin-firefox-cold loadtime linux64-shippable opt 2,437.12 -> 2,573.92
4% raptor-tp6-linkedin-firefox-cold loadtime linux64-shippable-qr opt 2,600.42 -> 2,713.33
3% raptor-tp6-yandex-firefox-cold loadtime linux64-shippable opt 1,690.54 -> 1,747.00
3% raptor-tp6-yandex-firefox-cold loadtime linux64-shippable-qr opt 1,738.17 -> 1,791.92
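
For reference, the percentages above are simply the relative change between the old and new values shown for each test. A minimal sketch of that calculation, using the first entry from the list (Perfherder's actual aggregation over replicates is not reproduced here):

    # Relative delta between the before/after values reported above.
    before, after = 243.33, 292.79  # raptor-tp6-binast-instagram-firefox fcp
    delta_pct = (after - before) / before * 100
    print(f"{delta_pct:.2f}% regression")  # ~20.33%, matching the headline number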

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=25413

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a Treeherder page showing the Raptor jobs in a pushlog format.

To learn more about the regressing test(s) or reproducing them, please see: https://wiki.mozilla.org/TestEngineering/Performance/Raptor

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/TestEngineering/Performance/Talos/RegressionBugsHandling

Component: Performance → JavaScript Engine: JIT
Flags: needinfo?(tcampbell)
Product: Testing → Core
Version: Version 3 → unspecified

The Spidermonkey change is in an extremely uncommon code path and is unlikely to be responsible.

I notice that the BinAST regression is just noise in the graph and bounces back and forth. The other loadtime graphs seem to track Bug 1508292 better.

Christoph, does any part of Bug 1508292 seem like it might impact general loadtime metrics?

(Otherwise this bug might just be WORKSFORME since those graphs are so unstable)

Component: JavaScript Engine: JIT → DOM: Security
Flags: needinfo?(tcampbell) → needinfo?(ckerschb)

(In reply to Ted Campbell [:tcampbell] from comment #1)

Christoph, does any part of Bug 1508292 seem like it might impact general loadtime metrics?

Most probably yes. Bug 1508292 adds Sec-Fetch-* request headers to every load. For now, Sec-Fetch-* headers are enabled in Nightly only and we are working on improving the overall performance of the algorithm within Bug 1623053. I don't know if we can mark this bug as a duplicate of Bug 1623053 or not; for now I'll just add the 'See Also'.
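
For context, the Fetch Metadata spec derives these headers from the relationship between the requesting context and the target URL, plus the request's mode and destination. A rough sketch of the Sec-Fetch-Site classification, in Python for illustration only (this is not the Gecko implementation, and the registrable-domain check is deliberately naive):

    from urllib.parse import urlparse

    def registrable_domain(host):
        # Simplification: real implementations consult the Public Suffix List.
        return ".".join(host.split(".")[-2:])

    def sec_fetch_site(request_origin, target_url):
        # "none": the request was not triggered by any site (URL bar, bookmark, etc.).
        if request_origin is None:
            return "none"
        req, tgt = urlparse(request_origin), urlparse(target_url)
        if (req.scheme, req.hostname, req.port) == (tgt.scheme, tgt.hostname, tgt.port):
            return "same-origin"
        if registrable_domain(req.hostname) == registrable_domain(tgt.hostname):
            return "same-site"
        return "cross-site"

    # Example: headers a cross-site image load might carry.
    headers = {
        "Sec-Fetch-Site": sec_fetch_site("https://example.org", "https://cdn.other-site.com/x.png"),
        "Sec-Fetch-Mode": "no-cors",
        "Sec-Fetch-Dest": "image",
    }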

Flags: needinfo?(ckerschb)
See Also: → 1623053

Florin, I know this bug highlights performance impacts on a different range of tests; however, it's closely related to Bug 1623053. Do you think we can mark it as a duplicate?

Further, just to make sure we are on the same page: we are thinking of WONTFIXing Bug 1623053 in the end (it's not decided yet, we are still investigating).

I know it's a tradeoff between shipping an additional security feature vs. performance; would the illustrated performance downgrade be acceptable if we want to support Sec-Fetch-* request headers, or would the perf hit have too much impact on our product?

Flags: needinfo?(fstrugariu)

Christoph, I was open to WONTFIX for Bug 1623053 because that regression was at the lower limit of the regression threshold. In the current regression the magnitudes are higher, which requires a fix.

(In reply to Alexandru Ionescu :alexandrui (needinfo me) from comment #4)

Christoph, I was open to WONTFIX for Bug 1623053 because that regression was at the lower limit of the regression threshold. In the current regression the magnitudes are higher, which requires a fix.

Fair enough - that's what I wanted to know.

Is the performance impact also a problem within Nightly? If so, we could disable it in Nightly as well (if really needed).

We can leave this open, as we are only targeting Nightly and there is no need to disable this flag.

:ckerschb, what version of Firefox are we targeting for release? Can you make sure we have the correct flag active?

Flags: needinfo?(fstrugariu)

Assigning to myself to take another look.

(In reply to Florin Strugariu [:Bebe] (needinfo me) from comment #6)

:ckerschb, what version of Firefox are we targeting for release? Can you make sure we have the correct flag active?

It's really undecided at this point when we are going to ship this, because there are some spec issues around Sec-Fetch-User which need to be addressed within Bug 1621987 first.

Assignee: nobody → ckerschb
Status: NEW → ASSIGNED
Priority: -- → P2
Whiteboard: [domsecurity-active]

Just to clarify, I believe the BinAST 20% number is just weird noise when I look at the graph. The other load numbers do look closer to real issues, though.

Sec-Fetch-* is Nightly-only for now.
If it's still Nightly-only when 76 hits Beta, we should set 76 as unaffected.

Alexandru, thanks for your info on the same/similar performance issue within Bug 1623053#c16.

I assume the same statement holds true for this bug as well, right? I would imagine so but wanted to make sure. Thank you!

Flags: needinfo?(aionescu)

(In reply to Christoph Kerschbaumer [:ckerschb] from comment #10)

Alexandru, thanks for your info on the same/similar performance issue within Bug 1623053#c16.

I assume the same statement holds true for this bug as well, right? I would imagine so but wanted to make sure. Thank you!

That bug had only one item, with about a 2% regression, and I see that it has the same culprit. This one has multiple items with high magnitudes, which we can't ignore, unfortunately.

Flags: needinfo?(aionescu)

(In reply to Alexandru Ionescu (needinfo me) :alexandrui from comment #11)

That bug had only one item, with about a 2% regression, and I see that it has the same culprit. This one has multiple items with high magnitudes, which we can't ignore, unfortunately.

Ok, thanks for the update. I'll investigate this problem again using the patches from Bug 1623053 as well. Thanks for now!

Severity: normal → S3

Florin, I did three try runs the other day to investigate whether my performance optimization patches for the Sec-Fetch feature are worth landing:
(a) Sec-Fetch Disabled
(b) Sec-Fetch Enabled
(c) Sec-Fetch Enabled with optimization patch applied

COMPARISON:
(a) Sec-Fetch Disabled vs. (c) Sec-Fetch Enabled with optimization patch applied

I am slightly confused about the results, because if I look at the performance comparison of (a) vs. (c) (see the comparison above) I don't see the 20% slowdown for raptor-tp6-binast-instagram-firefox mentioned in comment 0. If I select the raptor test suite I see e.g. a 72% improvement for raptor-tp6-amazon-firefox-cold with a confidence level of medium; however, it seems unlikely that Sec-Fetch headers would introduce a perf boost. On the other end of the spectrum I see raptor-tp6-youtube-firefox with a delta of 598%, which seems really bad to me but was not reported in comment 0.

Summing it up, I was wondering if you could help me interpret the data from those perf runs. I guess the biggest question I have is: which tests need to be under what threshold for us to consider enabling Sec-Fetch headers?
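
A back-of-the-envelope way to sanity-check such deltas against run-to-run noise (a hypothetical helper, not Perfherder's actual confidence calculation):

    from statistics import mean, stdev

    def compare(base_runs, new_runs):
        # Relative delta between two sets of replicate measurements, plus a
        # crude "is this just noise?" flag when the difference is small
        # compared to the spread of either set.
        base, new = mean(base_runs), mean(new_runs)
        delta_pct = (new - base) / base * 100
        noise = max(stdev(base_runs), stdev(new_runs))
        likely_noise = abs(new - base) < 2 * noise  # arbitrary cut-off for illustration
        return delta_pct, likely_noise

    # Made-up replicate values, only to show the shape of the check.
    print(compare([2440, 2450, 2430], [2570, 2580, 2560]))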

Flags: needinfo?(fstrugariu)

Also retriggered the tests so we can have better results

Thanks for all the info - I'll look into this again!

Flags: needinfo?(ckerschb)
Blocks: 1695911
Assignee: ckerschb → nobody
Status: ASSIGNED → NEW
Assignee: nobody → ngogge
Status: NEW → ASSIGNED

Hey Florin, I have some updated try runs for this:

  1. sec-fetch disabled
  2. sec-fetch enabled
  3. sec-fetch enabled with optimization patch applied

Comparison of 1. and 2., comparison of 2. and 3.

I can't seem to find the results for the tests that are causing the problems (e.g. raptor-tp6-binast-instagram-firefox).
Could you help me make some sense of these runs and numbers?

Flags: needinfo?(fstrugariu)

FYI, the raptor-tp6-binast-instagram-firefox test has been removed, as has its underlying BinAST parsing code.

(In reply to Ted Campbell [:tcampbell] from comment #19)

FYI, the raptor-tp6-binast-instagram-firefox test has been removed, as has its underlying BinAST parsing code.

Thanks for the info - what does that mean for the perf regression discussed in this bug? From comment 0 I take it that raptor-tp6-binast-instagram-firefox caused the biggest perf regression, of 20%. I am generally somewhat confused that adding an HTTP header could cause such a big performance penalty, but let's see if Florin can help us move this bug forward.

Also, all raptor tests were migrated to browsertime.
:davehunt, any suggestion on how we should go forward with this?

Flags: needinfo?(fstrugariu) → needinfo?(dave.hunt)

As a suggestion, we could just check the browsertime results and see if we get any performance improvements.

To clarify, the large raptor-tp6-binast-instagram-firefox regression was almost certainly an unlucky amplification of a smaller perf change. For example, certain page resources loading in a different order have made big differences in these page-load numbers in the past.

(In reply to Florin Strugariu [:Bebe] (needinfo me) from comment #21)

Also, all raptor tests were migrated to browsertime.
:davehunt, any suggestion on how we should go forward with this?

I agree with your comment 8; we can use the browsertime tests (we currently still have linkedin and yandex) to see if we still have a performance impact.

Flags: needinfo?(dave.hunt)

So, given the last few comments and the perf numbers in comment 18, can we mark this bug as INVALID or WONTFIX in the end?

Flags: needinfo?(fstrugariu)

Yep, I think we can drop this, as we don't have data to compare against.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(fstrugariu)
Resolution: --- → WONTFIX
Attachment #9150070 - Attachment is obsolete: true
Has Regression Range: --- → yes