Open Bug 1536090 Opened 6 years ago Updated 2 years ago

Reduce noise by increasing the browser settle time

Categories

(Testing :: Raptor, enhancement, P3)

Tracking

(Not tracked)

People

(Reporter: davehunt, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(1 file)

In bug 1525017 :acreskey found that increasing the time for the browser to settle from 30 to 90 seconds made a significant improvement to the noise in the results. This bug is for identifying a suitable settle time for Raptor tests that balances noise improvements with test run times. Let's experiment with various settle times to see how the noise is impacted on each platform. If we can reduce noise significantly then we may also be able to reduce the number of page cycles, which could help to keep the job run times down.

A related benefit: developers often have to trigger repeat jobs in order to raise the confidence level on performance differences.
With reduced noise fewer repeat jobs would be required.

I'm capturing samples in order to determine what components are active after the current 30-second delay in Raptor. (So far I see telemetry being submitted, BHMgr Processor, a large GC major, and a few smaller tasks.) Ideally we could prevent these as well.

Assignee: nobody → marian.raiciof

I did a round of try pushes with different browser settle times.
When they have finished I will be able to compare the results in Perfherder.

In the meantime I have started measuring the metrics on my local setup and I will
put the data in this document:
https://docs.google.com/spreadsheets/d/1BKmUphvrCWoDuz0Tih5b83znOpLkqPS0vXeAVLq3_RE/edit?usp=sharing

Treeherder URLs:

Browser settle time = 35000
./mach try fuzzy -q tp6m -e

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0fcf5cca33e6e5e131162ea66338534c2fa1d390

=====================================================================

Browser settle time = 35000
./mach try fuzzy -q tp6 -e

https://treeherder.mozilla.org/#/jobs?repo=try&revision=282d41a29c45148974225e7a2318bee248e87e62

=====================================================================

Browser settle time = 40000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0a4f22bb13b99708296c7c35d450f88a3606d8e4

=====================================================================

Browser settle time = 50000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=344c322c312b42f27aace9ba799e49ca416940fe

=====================================================================

Browser settle time = 60000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bd7167ecc9747f74c25444ce175aa1a96562a79b

=====================================================================

Browser settle time = 70000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6577fc994924dab6e17559c64a90c819b4d99523

=====================================================================

Browser settle time = 80000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2dfd534ed6a9af11b7a15879c0dffee1c26d9544

=====================================================================

Browser settle time = 90000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7b732e7b74b8feaff15a76533451a657bda3d515

=====================================================================

Browser settle time = 100000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a6eb59375c524bd46a946ca5838656c9123a9dae

PLATFORMS

=====================================================================

Browser settle time = 100000
./mach try chooser --full

  • select all tests for tp6 and tp6m

android, android-aarch, win 32, win 64, linux, linux 64, mac 64 : normal and nightly

https://treeherder.mozilla.org/#/jobs?repo=try&revision=16a46e31cbff26bda9f42d62a272c1c706b437bf

=====================================================================

Browser settle time = 90000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0196b4cb97c5fb4df1e84311471a7859baca8342

=====================================================================

Browser settle time = 80000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c86e3c9f2bc143f53e3735cfc6d22a1177d90cfd

=====================================================================

Browser settle time = 70000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a87db66bea0fd1f541951a7742ae14d0a3ff8c0f

=====================================================================

Browser settle time = 60000
./mach try chooser --full

  • select all tests for tp6 and tp6m

https://treeherder.mozilla.org/#/jobs?repo=try&revision=91c77a44121dc521449a9f002832594d0d269f90

Here are a set of measurements for a few tests running on local raptor:
https://docs.google.com/spreadsheets/d/1BKmUphvrCWoDuz0Tih5b83znOpLkqPS0vXeAVLq3_RE/edit#gid=0

Marian, can you tell me:
for each of the experiments in Comment 3 (e.g. Browser Settle time = 30000), how many times was the job run?

Also, I'm curious, is there a reason why only the mobile tp6m tests were run?

Flags: needinfo?(marian.raiciof)

Hi Andrew,

For each test I ran the job once (to be more specific, 15 pagecycles per job).
For the second question: no, there is no reason. Let me know if I should start measuring the desktop tests.

Thanks!

Flags: needinfo?(marian.raiciof)

Marian, thanks, that's interesting - I didn't realize that the tp6m tests were 15 pagecycles and not 25.

Because we're working with results that are very noisy, I highly recommend running the jobs numerous times to increase n.

I would also suggest testing on desktop as well because there are different background tasks running when the desktop browser starts up vs on android (kinto download of intermediate certs is one that's fresh in my mind).

I personally find it's also much faster to get results using desktop on the try server.

acreskey: settle times for 25 page-cycles are now available on the spreadsheet, could you take a look?

Flags: needinfo?(acreskey)

marian: could you perform the same experiment on desktop?

Flags: needinfo?(marian.raiciof)

Those desktop experiments should be interesting, we can compare them with the Perfherder Compare feature.
e.g.
160s settle vs 30s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=aed1e9993ae6383460c58061bef6b91e3e6a63e2&newProject=try&newRevision=83981a0af0b33970ff26943d92830fa65e750d76&framework=10

Marian, thank you for increasing the pagecycle count.
I'm not sure I explained myself well but I think it's also very important that each test be run multiple times. And when I say multiple I mean a lot, e.g. 20+, until the values start to converge.
Maybe this was done in the spreadsheet? Not clear to me.
If you run one of those jobs again I think you will see that the results will be wildly different from the previous iteration.
The settle time is not the only variable affecting the noise (unfortunately), so we need to collect a large number of results to effectively reduce the standard error.
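To illustrate why the number of runs matters so much, here is a minimal Python sketch of the standard error of the mean (sample standard deviation divided by the square root of n). The sample values and the helper names `standard_error` and `runs_needed` are made up for illustration; they are not taken from the spreadsheet or from Raptor.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample stdev divided by sqrt(n)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


def runs_needed(stdev, target_se):
    """Smallest n for which stdev / sqrt(n) <= target_se."""
    return math.ceil((stdev / target_se) ** 2)


# Hypothetical per-job loadtime values in ms (illustrative, not from this bug).
few_jobs = [900.0, 1100.0, 950.0, 1050.0]
many_jobs = few_jobs * 5  # same spread, but 20 jobs instead of 4

print("4 jobs :", round(standard_error(few_jobs), 1))
print("20 jobs:", round(standard_error(many_jobs), 1))
print("jobs needed for stdev=100, target SE=25:", runs_needed(100.0, 25.0))
```

With the spread held roughly constant, the standard error only shrinks with the square root of the job count, which is why a single job per settle time cannot separate the settle-time effect from the other sources of noise.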

Flags: needinfo?(acreskey)

Marian: I believe you should be able to retrigger the jobs in Treeherder to build up results and show the relative noise. I think there's also a way to trigger rebuilds via the command line when pushing to try. jmaher: could you confirm or direct marian to the relevant docs?

Flags: needinfo?(jmaher)

./mach try fuzzy -q '...' --rebuild X

where X is the number of rebuilds you need.

You can also retrigger the jobs in the Treeherder UI.

Flags: needinfo?(jmaher)

I have retriggered all the jobs for the platform "Windows 10 x64 Shippable opt"
on the URLs from comment 9:
https://bugzilla.mozilla.org/show_bug.cgi?id=1536090#c9

I finished the cold-measurements.

https://docs.google.com/spreadsheets/d/1BKmUphvrCWoDuz0Tih5b83znOpLkqPS0vXeAVLq3_RE/edit#gid=0

In the "loadtime" column I have highlighted the two smallest values for each test.
Darker green marks the smallest value, lighter green the next smallest.

I cannot pick one browser settle time that fits all the tests, because the results are not conclusive.

acreskey: What's your opinion on the results?

Flags: needinfo?(acreskey)

I have updated the spreadsheet with colors for each standard deviation column and loadtime field.

Green - smallest value
Pink / Magenta: next smallest value (but bigger than the green one)

https://docs.google.com/spreadsheets/d/1BKmUphvrCWoDuz0Tih5b83znOpLkqPS0vXeAVLq3_RE/edit#gid=0

Hi, Marian. Sorry for the delay.

To be clear, when looking at the results in the spreadsheet from Comment 15,
for instance

Browser Settle Time:
30000

Test: raptor-tp6m-5
raptor-tp6m-amazon-search-geckoview	2344.5	1088	227.2476985	1298.5	239.066603	1198	235.7633848	4448.5	1677.659432	9401.5	2479.269198

Are those the results from a single run of the given test?

Flags: needinfo?(acreskey) → needinfo?(marian.raiciof)

To get more datapoints for this I kicked off two tests, one where the raptor post_startup_delay was increased from 30s to 90s and another where it was reduced to 15s.

I'm comparing against an android baseline revision that I made last week:

Increased raptor settle time from 30seconds to 90seconds
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c059749d487e8668fb006b1f247de5f34edc5897&newProject=try&newRevision=7d4ad5440b37b53f8901ba65eae638649657542d&framework=10

Reduced raptor settle time from 30seconds to 15seconds
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c059749d487e8668fb006b1f247de5f34edc5897&newProject=try&newRevision=c68660a907b65fa44aecef508c860fea3271e9d3&framework=10

So there are still jobs to complete.
I'm only looking at results where there are 20 retries, because otherwise I feel the single job results are too noisy.

So far I'm not seeing anything conclusive here. Strangely the performance of amazon.com increases by 10% and 8% on each test (over 20 retries...)

Hi Andrew,

Yes, a single run of the main test with post_startup_delay set to 30000.
There were 15 page cycles for each subtest.
On the following tabs of the spreadsheet I measured the same tests but with 25 pagecycles.

Flags: needinfo?(marian.raiciof)

Marian,

As a quick test, can you redo one of the tests and share the results?
e.g.

30000 

Test: raptor-tp6m-1

For example, the one with the 25 pagecycles.

My concern is that the results will be very different from what was recorded on the spreadsheet.
It's not uncommon to see the raptor medians vary by 20% one run to another.

You can see a good example of how the std dev for these metrics will vary run to run from Rob's results here.

I think this would be valuable to see here, otherwise it's hard to draw any conclusions.

Flags: needinfo?(marian.raiciof)

Hi Andrew,

Here are the results:

Test:
raptor-tp6m-1 : 25 pagecycles, browser settle time : 30000

raptor-tp6m-amazon-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
792.22, 716.5, 170.9966469287324, 825.5, 173.31381753289634, 747.0, 173.81497325471196, 891.5, 356.40194445947225

raptor-tp6m-facebook-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
996.5, 1051.5, 97.66148833776714, 681.5, 72.88793015526535, 657.5, 69.01605305803966, 2092.0, 943.2073365438508

raptor-tp6m-google-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
165.89, 166.0, 29.481724872355677, 190.5, 33.25690513457972, 153.5, 30.64133326648542, 156.0, 27.521796368738702

raptor-tp6m-youtube-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
427.96, 606.5, 95.78076377929405, 485.0, 61.79067046506081, 159.5, 30.540748498368202, 713.5, 287.6285345946112

raptor-tp6m-amazon-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
806.6, 730.0, 259.2328111222012, 803.0, 273.1071583234981, 756.5, 263.7955263270113, 954.5, 1741.0053312399125

raptor-tp6m-facebook-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
1004.67, 1074.0, 651.7242014706901, 681.5, 656.4975895956721, 656.0, 650.9279607951463, 2121.0, 1219.565698416294

raptor-tp6m-google-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
157.02, 160.5, 15.763652927643022, 182.0, 21.756891295416718, 144.0, 19.27710777212108, 144.5, 18.20609668549904

raptor-tp6m-youtube-geckoview

geomean, dcf, dcf-stdev, fcp, fcp-stdev, fnbpaint, fnbpaint-stdev, loadtime, loadtime-stdev
417.4, 565.5, 337.33996742631666, 495.0, 313.3756832311335, 163.0, 17.665521565295293, 664.0, 377.3860803251604
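As a rough way to read the two runs side by side, the following Python sketch compares the loadtime and loadtime-stdev columns per page. The numbers are copied from the results above (stdev rounded to one decimal); the helper names `percent_change` and `compare_runs` are only illustrative and are not part of Raptor or Perfherder.

```python
# (loadtime, loadtime-stdev) in ms, copied from the two runs above,
# with the stdev rounded to one decimal place.
run_1 = {
    "amazon": (891.5, 356.4),
    "facebook": (2092.0, 943.2),
    "google": (156.0, 27.5),
    "youtube": (713.5, 287.6),
}
run_2 = {
    "amazon": (954.5, 1741.0),
    "facebook": (2121.0, 1219.6),
    "google": (144.5, 18.2),
    "youtube": (664.0, 377.4),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def compare_runs(first, second):
    """Per page: loadtime change in percent and ratio of the two stdevs."""
    rows = {}
    for page in first:
        value_1, stdev_1 = first[page]
        value_2, stdev_2 = second[page]
        rows[page] = {
            "loadtime_change_pct": percent_change(value_1, value_2),
            "stdev_ratio": stdev_2 / stdev_1,
        }
    return rows


for page, row in compare_runs(run_1, run_2).items():
    print(f"{page:9s} loadtime {row['loadtime_change_pct']:+6.1f}%  "
          f"stdev x{row['stdev_ratio']:.2f}")
```

Even with an identical configuration, the reported loadtime moves by several percent in either direction, and the amazon loadtime-stdev differs by nearly a factor of five between the two runs.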

Flags: needinfo?(marian.raiciof)

Thanks Marian.

I think that with the large variations between those two runs you would agree that we will need to collect data from many runs in order to have confidence in our conclusions.

I'm comparing your initial patches against each other:

30s settle (left) to 40s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=9edc0c659faef8c38346db9b5e5101dc964d3f34&framework=10
Looking at the standard deviation for windows10-64-shippable metrics (the only one with more than 1 retry), I don't see any consistent improvements.

30s settle (left) to 50s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=5d4d5757a917930402eaef67e6930b6038d1b712&framework=10
In this case the std dev looks to be worse in general...

30s settle (left) to 60s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=570d2d2402ccfc8e61b689002b14e670ba8ed236&framework=10
This one is interesting because it's showing a high-confidence improvement of 18.84% on raptor-tp6-reddit-firefox opt
Otherwise not particularly helpful for noise.

30s settle (left) to 70s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=b112e20b46ba8b2b250a9e105128e43fd7a9dfdb&framework=10
Still seeing the improvement on raptor-tp6-reddit-firefox opt (it's actually a ~42% improvement of loadtime)
And in general I would say there's a small improvement in std dev.

30s settle (left) to 80s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=83da21b9a0eb9881e6bed00be17fa7d69f8aadaa&framework=10
Similar to 60s, 70s

30s settle (left) to 90s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=05de7e37202f39d4ad150c1efb1b88d785052dcd&framework=10
This one adds a high-confidence ~5% performance improvement on raptor-tp6-instagram-firefox opt

30s settle (left) to 100s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=9ef51c3d0f987587ca0ee9b44207dba775d7339f&framework=10
Improvement on raptor-tp6-instagram-firefox opt still present.

30s settle (left) to 120s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=0b94dd25bf4206a11bb524386039b53efe747a64&framework=10
Similar

30s settle (left) to 140s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=cc4088c1bd6f2b926e5003e6c25562817fc0b440&framework=10
Similar

30s settle (left) to 160s settle
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=83981a0af0b33970ff26943d92830fa65e750d76&newProject=try&newRevision=aed1e9993ae6383460c58061bef6b91e3e6a63e2&framework=10
Similar

So overall, at least on windows10-64-shippable, I see almost no evidence that increasing the browser settle time reduces noise in any significant way.

It might be useful to add jobs to the reference laptop tests (windows10-64-ux) as we know that the hardware is strained and the results could be different there.

What I find most interesting is that there is a consistent loadtime improvement of 40-45% on raptor-tp6-reddit-firefox opt once the settle time has increased, and a smaller improvement of perhaps 3-5% on instagram.
This I will log and profile to find out what's happening.

Reddit loadtime significantly improved by browser settle time: Bug 1549594

:marauder could you schedule jobs testing 30s, 60s and 90s settle times on windows10-64-ux?

Flags: needinfo?(marian.raiciof)

Hi Dave, Andrew,

Push to try for windows10-64-ux :

Desktop websites running with post_startup_delay set to 90s
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd1b2b027592f817d4f07f0ddff00f5fcd776a03

Desktop websites running with post_startup_delay set to 60s
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cdd475d25db60ff05925c13d5cee7c2b04916355

Desktop websites running with post_startup_delay set to 30s
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1ed5f45529e9619cc4fc9a3b74f22b5bd3b97184

Flags: needinfo?(marian.raiciof)

Hi Marian, for reasons unknown to me the tp6 jobs on those pushes all failed with exception?

Flags: needinfo?(marian.raiciof)

Hi Andrew,

I pushed to try again. Let's see how these go:

Platform: windows10-64-ux

Desktop websites running with post_startup_delay set to 30s (default value)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6fb140a572c11c7261d9732345c8096fb192ba15

Desktop websites running with post_startup_delay set to 60s
https://treeherder.mozilla.org/#/jobs?repo=try&revision=55b37916ae7439380e2681b5e45b06060d171501

(2nd push to try for 60s but with the post_startup_delay modified from cmdline.py)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb4e5d4d2da1ba48993ff699648db62d453efb6c

Desktop websites running with post_startup_delay set to 90s (from cmdline.py)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=62e2e2a13aacd49c2ae2c50e1ad754cded21fd1c

Flags: needinfo?(marian.raiciof)
Flags: needinfo?(acreskey)

Hmm... those retry jobs completed as "exception" ... soft freeze?

Marian, I don't know why, but your try jobs from Comment 27 have all failed. Can you trigger repeat runs on them?

Flags: needinfo?(acreskey) → needinfo?(marian.raiciof)

Hi Andrew,

I have retriggered a few jobs:

  • 10 retriggers for tp6-5 on first URL,
  • 5 for each test on the 2nd url
  • 3 for each test on the 3rd url
  • 2 for each test on the 4th url.

I used small numbers because these tests fail very often.
Maybe there are only a few machines running windows10-64-ux and they get blocked quickly.

Flags: needinfo?(marian.raiciof)
Flags: needinfo?(acreskey)

I added 5 to each set so we can get a better view.

I'm looking at the comparisons from comment 28.
Is the Noise Metric valid when the number of jobs for each revision is different? I'm not sure about that.

But, test for test, I see a general drop in noise as the settle time is increased to 60 seconds.
Not clear that it's improved at all going from 60 to 90 seconds.

I created this bug a while back. Since then there are other possible solutions that I've learnt about:
• Increase the browser settle time, as in the bug title
• Use a conditioned profile: A new profile is made, waits for a long settle time, e.g. 2 minutes, and then is copied so that it can be used as the basis for each test. We've had a lot of success with this route in local browsertime testing. It also reduces the amount of time it takes to run tests since the browser only has to "settle" once.
• Profile the early tp6 pageloads on the reference laptop and find more root causes of early noise. One problem is that these cannot always be solved -- e.g. the GC that tends to run during the first loads.
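The conditioned-profile idea in the second bullet can be sketched roughly as follows in Python. This is only an illustration of "settle once, then copy per test": the function names, the `warm_up` callback and the directory layout are hypothetical and do not reflect how Raptor or the conditioned-profile tooling is actually implemented.

```python
import shutil
import tempfile
import time
from pathlib import Path


def build_conditioned_profile(base_dir, warm_up, settle_seconds):
    """Create a profile once, let it settle, and return its path.

    warm_up is a hypothetical callback that would launch the browser
    against the profile directory; here it only needs to populate it.
    """
    profile = Path(base_dir) / "conditioned"
    profile.mkdir(parents=True)
    warm_up(profile)
    time.sleep(settle_seconds)  # pay the long settle cost a single time
    return profile


def profile_for_test(conditioned, work_dir, test_name):
    """Give each test its own copy so runs cannot contaminate each other."""
    target = Path(work_dir) / test_name
    shutil.copytree(conditioned, target)
    return target


def fake_warm_up(profile):
    # Stand-in for a real browser startup: writes one marker file.
    (profile / "prefs.js").write_text("// settled profile\n")


with tempfile.TemporaryDirectory() as tmp:
    # settle_seconds=0 keeps the demo fast; the text above suggests e.g. 2 minutes.
    conditioned = build_conditioned_profile(tmp, fake_warm_up, settle_seconds=0)
    copies = [profile_for_test(conditioned, tmp, name) for name in ("tp6-1", "tp6-2")]
    print([sorted(p.name for p in c.iterdir()) for c in copies])
```

The point of the design is that the expensive wait happens once, in `build_conditioned_profile`, while each test only pays for a directory copy.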

From these try experiments I would say that increasing the browser settle time would help noise on the UX hardware, but I think it would have to be part of an overall strategy to improve the runtime/test configuration for these devices.

From looking at the two 60-second settle comparisons (which should be the same effective code), we see that many tests differ in reported geomeans by 6+%, even over 7+ runs.
60s settle (left) to 60s settle via cmdline.py
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=55b37916ae7439380e2681b5e45b06060d171501&newProject=try&newRevision=cb4e5d4d2da1ba48993ff699648db62d453efb6c&framework=10

Flags: needinfo?(acreskey)
Priority: P1 → P2
Status: NEW → ASSIGNED

Let's implement the conditioned profiles (bug 1537944) first, and then revisit the settle time so increasing it doesn't exponentially increase the run time of these tests.

Depends on: 1537944
See Also: → 1561324

Marian, are you still working on this bug? If not please unassign yourself and reset the priority.

Flags: needinfo?(marian.raiciof)
Priority: P2 → P1

I finished the investigation on this a while ago.
As I remember, the plan was to retest these measurements once conditioned profiles had landed.

I will do those try pushes and generate the compare links to see how it looks.

Thanks!

Flags: needinfo?(marian.raiciof)

(In reply to Marian Raiciof [:marauder] from comment #37)

> I finished the investigation on this a while ago.
> As I remember, the plan was to retest these measurements once conditioned profiles had landed.
>
> I will do those try pushes and generate the compare links to see how it looks.

The change here would now be to the conditioned profiles themselves. Currently the "settled" profile waits 30 seconds. We would want to experiment with variations on this, but we should wait for bug 1602657 to land.

Depends on: 1602657
Flags: needinfo?(marian.raiciof)

The base is a try push with the default value: post_startup_delay = 1000ms

The latest results:

Base vs 15s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=148d5ba73399365b147df43edfd0ca438abd73c3&framework=10

Base vs 30s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=9fd16dd811a84669eca321526a681fad740181dc&framework=10

Base vs 40s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=72f710024e557baa8bf906c5bf8cc559b71a0be6&framework=10

Base vs 50s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=5216609dc73ff2d9aa158b7ce1185e16351a302c&framework=10

Base vs 60s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=5ad27b464714c926271e206b8696915f518c13d2&framework=10

Base vs 70s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=12f34c4d94623c90ca0db01a256ef90e692ff4f6&framework=10

Base vs 80s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=1d24194840d6279b3022ece54ce9fdfc0ac1ee25&framework=10

Base vs 90s
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=eda488bfcd601a9b1ee058630ccccedec121cc83&newProject=try&newRevision=bc5d2f3dace208620136c2c61acb6e535dab689d&framework=10

Flags: needinfo?(marian.raiciof)

Marian: Could you retrigger base and new so we have at least 5 runs to improve the confidence of the comparison? At the moment there doesn't appear to be any clear signal.

Flags: needinfo?(marian.raiciof)
Version: Version 3 → unspecified

For any work here we might want to wait until bug 1626604 has been fixed. As we know conditioned profile usage is broken right now.

Depends on: 1626604

Mass-removing myself from cc; search for 12b9dfe4-ece3-40dc-8d23-60e179f64ac1 or any reasonable part thereof, to mass-delete these notifications (and sorry!)

Assignee: marian.raiciof → nobody
Status: ASSIGNED → NEW
Priority: P1 → P2

Looks like we should wait for the stabilization of conditioned profiles.

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:marauder, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(marian.raiciof)

Tarek, what do you think about Henrik's comment https://bugzilla.mozilla.org/show_bug.cgi?id=1536090#c47
Is the patch good to land, or should we wait?
Thanks!

Flags: needinfo?(marian.raiciof) → needinfo?(tarek)

Yeah it does not hurt to wait a bit I guess?

Flags: needinfo?(tarek)
No longer blocks: 1607511
Severity: normal → S3

@tarek is this patch still valid? If not, maybe we can remove it from the review queue.

Flags: needinfo?(tarek)
Whiteboard: [perftest:triage]

Blocking on getting more reproducibility for conditioned profiles (or hardening them): https://bugzilla.mozilla.org/show_bug.cgi?id=1626604

Priority: P2 → P3
Whiteboard: [perftest:triage]

Clear a needinfo that is pending on an inactive user.

Inactive users most likely will not respond; if the missing information is essential and cannot be collected another way, the bug maybe should be closed as INCOMPLETE.

For more information, please visit auto_nag documentation.

Flags: needinfo?(tarek)

Adding the triage flag to discuss if we can investigate this some more now given that we create conditioned profiles on the fly during the test.

Whiteboard: [perftest:triage]
Assignee: nobody → marian.raiciof
Status: NEW → ASSIGNED

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: marian.raiciof → nobody
Status: ASSIGNED → NEW
Whiteboard: [perftest:triage]