Open Bug 1616236 Opened 6 years ago Updated 2 months ago

"tp5n time_to_session_store_window_restored_ms opt e10s stylo" is too variable to be useful in a single 5-run

Tracking

(Not tracked)

Status:

NEW

People

(Reporter: standard8, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxp][vision])

Mark Banner (:standard8)

Reporter

Description

6 years ago

I've been doing various try runs with the aim of hunting down issues with a patch I've been trying to land.

I did test builds with 5 rebuilds a couple of weeks ago:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=49b9c64323badacb3e5aa4a12ff51c61f02321c3&newProject=try&newRevision=05bb331815307dbf0221bc22ecb5e1005744c0b6&framework=1#table-header-561038570

This shows a 14.41% regression (medium confidence) on windows7-32-shippable.

Today, I did another set of builds, using exactly the same m-c base and applied patches:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=87fd18a4071212bc00840c13378d82c816df974c&newProject=try&newRevision=30e6909acffed393b647baebc728890a447c5b8f&framework=1#table-header-561038570

This shows a 7.39% improvement (medium confidence) on windows7-32-shippable.

On windows10-64, the first comparison showed an 8.22% change (low confidence), which went down to 0.06% (low confidence) in the second run.

The variability here really implies that either 5 runs isn't enough, or there's something else affecting the test machines that makes this inconsistent.
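As a rough illustration of the first possibility, the sketch below compares two sets of 5 runs drawn from the same distribution, so any reported change is pure noise. The 1000 ms mean, the 10% run-to-run noise, and the normal distribution are all assumptions chosen for illustration, not measurements from these try pushes.

```python
import random
import statistics


def apparent_change_pct(rng, runs=5, mean_ms=1000.0, cv=0.10):
    """Percent difference between the means of two samples that come
    from the SAME distribution (i.e. there is no real change)."""
    sd = mean_ms * cv  # assumed noise level, not a measured value
    base = [rng.gauss(mean_ms, sd) for _ in range(runs)]
    new = [rng.gauss(mean_ms, sd) for _ in range(runs)]
    base_mean = statistics.mean(base)
    return (statistics.mean(new) - base_mean) / base_mean * 100.0


def simulate(comparisons=2000, runs=5, seed=0):
    """Repeat the no-change comparison many times with a fixed seed."""
    rng = random.Random(seed)
    return [apparent_change_pct(rng, runs=runs) for _ in range(comparisons)]


if __name__ == "__main__":
    changes = simulate()
    print(f"min {min(changes):+.1f}%  max {max(changes):+.1f}%")
    share = sum(abs(c) >= 7.0 for c in changes) / len(changes)
    print(f"share of comparisons with |change| >= 7%: {share:.0%}")
```

Under these assumptions, apparent changes of either sign show up even though nothing differs between the two sides, which is consistent with a 5-run comparison flipping from a regression to an improvement.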

(No longer employed by Mozilla) Aaron Klotz

Comment 1

6 years ago

time_to_session_store_window_restored_ms is useful when run locally, but not so useful in automation because it's not on dedicated hardware. We should probably not fire alerts based on that metric.

Greg Mierzwinski [:sparky]

Comment 2

6 years ago

:standard8, 5 runs is definitely not enough here. In this case, you should probably run at least 30 trials for the test on windows7-32. The improvement or regression in your change is probably very small if you are running into this issue. The variability of this metric is at least 10%.
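A back-of-the-envelope sketch of why the trial count matters: if runs are independent and both sides of the comparison have the same coefficient of variation, the standard error of the relative difference between the two means is roughly cv * sqrt(2 / n). The function name is hypothetical, and the 10% figure is simply the lower bound on variability mentioned above, used here as an assumption.

```python
import math


def relative_se_of_difference(cv, runs):
    """Approximate standard error of (mean_new - mean_base) / mean,
    as a fraction, for `runs` independent runs on each side.

    Assumes both sides share the same coefficient of variation `cv`.
    """
    return cv * math.sqrt(2.0 / runs)


for runs in (5, 30):
    se = relative_se_of_difference(cv=0.10, runs=runs)
    print(f"{runs:>2} runs per side: +/- {se * 100:.1f}% (one standard error)")
```

With these inputs the noise floor is about 6.3% for 5 runs and about 2.6% for 30 runs, so a small real change is much easier to separate from noise at the higher trial count.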

Looking at the one with a 14% regression, there's an outlier in the data that is throwing it off.

That said, even without the outlier it's still a regression. I'm adding this issue to the fxperftest triage discussion topics because those results are contradictory, and I wonder if we have anything in the works to help with this.
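To illustrate how a single outlier among five runs can distort a comparison, here is a minimal sketch comparing a mean-based and a median-based percent change. The timings are made up for illustration and are not the values from the try pushes, and the sketch does not attempt to reproduce Perfherder's own calculation.

```python
import statistics

# Hypothetical timings in ms; NOT taken from the try pushes in this bug.
base_runs = [1000, 1010, 990, 1005, 995]
new_runs = [1002, 1008, 996, 1001, 1600]  # the last value plays the outlier


def pct_change(base, new, summary):
    """Percent change of summary(new) relative to summary(base)."""
    b, n = summary(base), summary(new)
    return (n - b) / b * 100.0


print(f"mean:   {pct_change(base_runs, new_runs, statistics.mean):+.2f}%")
print(f"median: {pct_change(base_runs, new_runs, statistics.median):+.2f}%")
```

With this made-up data the mean-based change is about +12.14% while the median-based change is about +0.20%, so one slow run accounts for nearly all of the apparent regression.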

I want to mention that I did notice a regression in the perfherder data starting around Feb. 6th. The metric's value increased and the variability also increased (it became more bi-modal) with a change that occurred around that point: https://treeherder.mozilla.org/perf.html#/graphs?highlightAlerts=1&highlightedRevisions=49b9c64323ba&highlightedRevisions=05bb33181530&selected=1922259,1042059648&series=try,1915518,1,1&series=mozilla-central,1941169,1,1&series=autoland,1922259,1,1&timerange=31536000&zoom=1580262151795,1582217411881,537.8701858441731,1530.8610954716917

Priority: -- → P3

Whiteboard: [perftest:triage]

Mark Banner (:standard8)

Reporter

Comment 3

6 years ago

Thanks for the information. Unfortunately, this combined with bug 1614805 makes the results hard to analyse, but currently I think my patches have no major overall issues. I've added these two bugs to the xperf section on the wiki so that other people hitting issues here can hopefully find them more easily.

Greg Mierzwinski [:sparky]

Comment 4

6 years ago

We will look into this more as part of the sheriffable/non-sheriffable work being done in bug 1573129.

Depends on: 1573129

Greg Mierzwinski [:sparky]

Updated

6 years ago

Whiteboard: [perftest:triage]

Henrik Skupin [:whimboo][⌚️UTC+2]

Updated

6 years ago

Blocks: 1573129

No longer depends on: 1573129

BMO Automation

Updated

3 years ago

Severity: normal → S3

Alex Finder

Updated

1 year ago

Whiteboard: [fxp]

Jira Integration Bot

Updated

1 year ago

See Also: → https://mozilla-hub.atlassian.net/browse/FXP-4142

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

5 months ago

Priority: P3 → P4

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

4 months ago

Whiteboard: [fxp] → [fxp][vision]

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

4 months ago

Severity: S3 → S4

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

2 months ago

Priority: P4 → P5


Categories

(Testing :: Talos, defect, P5)

