Closed Bug 1150656 Opened 9 years ago Closed 9 years ago

[meta] Performance regressions with Reader Mode / Reading List

Categories

(Toolkit :: Reader Mode, defect, P1)

Tracking

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

Now that reader mode is on aurora and beta, it is obvious that some performance regressions remain as a side effect. Most of the regressions seen originally have been fixed or greatly reduced - let's track what is left.

win8 tp5o beta - 4.77%:
* http://graphs.mozilla.org/graph.html#tests=[[255,53,31],[255,64,31],[255,52,31],[255,63,31]]&sel=1422970123925.2354,1428001653337,163.9344262295082,200&displayrange=90&datatype=geo
* notice that aurora had a regression when we landed on the 26th, which then appeared to go away when Firefox 39 was uplifted to aurora
Blocks: 1138995
And bug 1147487, which showed a ~20% regression on tsvgr_opacity on Linux 32 PGO, and which we're not yet sure is fixed.
Summary: Performance regressions with Reader Mode → [meta] Performance regressions with Reader Mode
Priority: -- → P1
After speaking with Joel on IRC, it became clear that it's hard to track reader-related performance impact on mozilla-central, since other recent changes added noise.

The regressions listed on this bug are mostly on beta/aurora, which provide cleaner isolation of the reader-related impact; however, they're harder to track and assess since we need to wait for the fixes to be uplifted before we can get new numbers.

Therefore, I've done several try pushes to isolate reader-related changes as best I can, based on m-c changeset 237834:ab0490972e1e, with each patch incremental over the previous one:

1. "base": both icon/parsing and reading list disabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=76b999af7321

2. "icon": only icon/parsing is enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dff56e21e1e

3(*). "both": both icon/parsing and reading list enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=59b75c0e8e97

4. "bug-1147487": applies the fix from bug 1147487 (didn't land yet):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9372c6053b54


(*) this is essentially the same as the original pre-base m-c, but pushed as a new changeset so that its results don't mix with PGO results from the m-c changeset.

The try pushes test "opt" builds without PGO on all desktop platforms, since PGO could affect the results in ways we can't control anyway. This should give us enough info to understand what issues exist.

Also note that this reflects m-c but not beta/aurora, and we'll need to make sure all the fixes are uplifted as well.
- tp5o regression (fixed): bug 1139678
- AWSY 8MiB regression (fixed): bug 1142183
Depends on: 1142183, 1139678
Here are results from compare-talos, with 5 runs per test on each platform. Notes:
- These are not e10s results.
- These are not PGO results.
- I'm disregarding OS X 10.10 and 10.8 results since they seem very noisy and often inconsistent with results on other platforms. This is probably worth some investigation on the talos side.

1. "icon" vs "base": https://compare-talos.paas.mozilla.org/?oldBranch=Try&oldRevs=76b999af7321&newBranch=Try&newRev=4dff56e21e1e&submit=true

> dromaeo dom    : 4% improvement(?!) on linux 64
> session restore: 4% regression on windows 7
> tp5o           : 2% regression on linux 32
> tsvgr_opacity  : 2% regression on linux 64


2. "both" vs "base": https://compare-talos.paas.mozilla.org/?oldBranch=Try&oldRevs=76b999af7321&newBranch=Try&newRev=59b75c0e8e97&submit=true

> session restore: 2-5% regressions on most platforms.
> tp5o           : 2-3% regression on linux (all) and windows (all)
> tp5o_scroll    : 2% regression on windows 7
> tpaint         : 3% regression on linux 64


3. "bug-1147487" vs "both": https://compare-talos.paas.mozilla.org/?oldBranch=Try&oldRevs=59b75c0e8e97&newBranch=Try&newRev=9372c6053b54&submit=true

> session restore: 3% regression on windows 8 64


I'm not sure I'm able to make full sense of these results. I did verify with jaws and Gijs that the patches disable and enable features correctly before I pushed.

My summary of the results (feel free to correct me):
- Enabling only the icon/parsing has minor impact on page load.
- Adding the reading list clearly regresses page load and session restore.
- The fix at bug 1147487 doesn't seem to improve things.

jaws, Gijs, could you please verify again that "base" ( https://treeherder.mozilla.org/#/jobs?repo=try&revision=76b999af7321 ) should indeed reasonably disable all reader-mode-related code and impact?

If "base" does disable the code correctly, then my conclusion is that the main impact comes from the reading list rather than from the page parsing/icon.

jaws, Gijs, do these results make sense to you? Are they consistent with other info you may have?
Flags: needinfo?(jaws)
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Avi Halachmi (:avih) from comment #10)
> Here are results from compare-talos, with 5 runs per test on each platform.
> Notes:
> - These are not e10s results.
> - These are not PGO results.
> - I'm disregarding OS X 10.10 and 10.8 results since they seem very noisy
> and often inconsistent with results on other platforms. This is probably
> worth some investigation on the talos side.
> 
> 1. "icon" vs "base":
> https://compare-talos.paas.mozilla.org/
> ?oldBranch=Try&oldRevs=76b999af7321&newBranch=Try&newRev=4dff56e21e1e&submit=
> true
> 
> > dromaeo dom    : 4% improvement(?!) on linux 64
This says it's a 4.63% regression, not improvement, AFAICT (am I looking at the wrong result?), but the noise level here is so ginormous (here the base rev has a variation that is 3 times as high as the difference) that I wouldn't trust it.


> > session restore: 4% regression on windows 7

Assuming you mean win 6.2.9200 x86_64, that's windows 8 x64, and the variation in the "icon" changeset's results is such (3.5 times as big as the difference between the two changesets) that I wouldn't trust that number.

> > tp5o           : 2% regression on linux 32
This is well within noise levels, seems like just that to me.

> > tsvgr_opacity  : 2% regression on linux 64
This is above the noise, but only just. Interestingly, the other linux and all the windows changesets register an improvement.

Maybe we can do a PGO trypush against fx-team tip (which now has the fix mentioned below) because those numbers might be more stable?

 
> 2. "both" vs "base":
> https://compare-talos.paas.mozilla.org/
> ?oldBranch=Try&oldRevs=76b999af7321&newBranch=Try&newRev=59b75c0e8e97&submit=
> true
> 
> > session restore: 2-5% regressions on most platforms.
> > tp5o           : 2-3% regression on linux (all) and windows (all)
> > tp5o_scroll    : 2% regression on windows 7
> > tpaint         : 3% regression on linux 64
> 
> 
> 3. "bug-1147487" vs "both":
> https://compare-talos.paas.mozilla.org/
> ?oldBranch=Try&oldRevs=59b75c0e8e97&newBranch=Try&newRev=9372c6053b54&submit=
> true
> 
> > session restore: 3% regression on windows 8 64

Huh. That doesn't really make sense; the patch works the same way everywhere, and some of the other platforms improved. See also the try push for the bug itself, http://compare-talos.mattn.ca/?oldRevs=ac2685723d47&newRev=5a2158b4efcc&server=graphs.mozilla.org&submit=true which shows the expected 30% improvements on Linux32 PGO. I mean, the patch results in net less work done, and clearly significantly improves linux32 PGO, so I'm inclined to leave it in, but it's disappointing it doesn't help with any of the other stuff.


> I'm not sure I'm able to make full sense of these results. I did verify with
> jaws and Gijs that the patches disable and enable features correctly before
> I pushed.
> 
> My summary of the results (feel free to correct me):
> - Enabling only the icon/parsing has minor impact on page load.

I would say that with the 30% linux PGO tsvgr_opacity thing addressed, there is little else we can do for that part, yeah.

> - Adding the reading list clearly regresses page load and session restore.

We could and maybe should investigate this more, though the magnitude of the regressions means I personally don't think it should block shipping. That is debatable, of course.

> - The fix at bug 1147487 doesn't seem to improve things.

See above. I think it did improve *something*, though I would have liked it to improve matters more, I guess.

> jaws, Gijs, could you please verify again that "base" (
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=76b999af7321 )
> should indeed reasonably disable all reader-mode-related code and impact?
> 
> If "base" does disable the code correctly, then my conclusion is that the
> main impact comes from the reading list rather than from the page
> parsing/icon.
> 
> jaws, Gijs, do these results make sense to you? Are they consistent with
> other info you may have?


A little bird told me bug 1131415 might be involved, but equally, I would have thought that the profiles here don't even have reading list set up to sync with anything, so I don't know why that'd have an impact. I'm less involved with and knowledgeable about the "list" part of things, so I can't really say much else on that part of it.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs Kruitbosch from comment #11)
> > > dromaeo dom    : 4% improvement(?!) on linux 64
> This says it's a 4.63% regression, not improvement, AFAICT (am I looking at
> the wrong result?), but the noise level here is so ginormous (here the base
> rev has a variation that is 3 times as high as the difference) that I
> wouldn't trust it.

I'm seeing +4.68%, but dromaeo tests are higher-is-better, so supposedly it's an improvement.


> > > session restore: 4% regression on windows 7

> Assuming you mean win 6.2.9200 x86_64, that's windows 8 x64

Correct. I accidentally misread the platform.

> and the
> variation in the "icon" changeset's results is such (3.5 times as big as the
> difference between the two changesets) that I wouldn't trust that number.

What I see is that the average went up ~4%, and the noise level increased by a lot compared to the very low variation of base. So in my book that's a regression.

> > > tp5o           : 2% regression on linux 32
> This is well within noise levels, seems like just that to me.

Marginally, but the regression is still clear to me, especially if you look at the "details" link and see that the vast majority of results have regressed: https://compare-talos.paas.mozilla.org/breakdown.htm?test_name=tp5o&oldRevs=76b999af7321&newRev=4dff56e21e1e&oldBranch=Try-Non-PGO&newBranch=Try-Non-PGO&os_name=linux&os_version=Ubuntu%2012.04&processor=x86


> > > tsvgr_opacity  : 2% regression on linux 64
> This is above the noise, but only just. Interestingly, the other linux and
> all the windows changesets register an improvement.

As I summarized, it seems to me that regressions exist, but they're relatively minor.

> Maybe we can do a PGO trypush against fx-team tip (which now has the fix
> mentioned below) because those numbers might be more stable?

I don't know why they would be more stable, but if you think it would help, sounds great. I wanted to get some base overview numbers on the board because it seemed we didn't have concrete overview numbers. Now that we have those, if you think other try pushes could help us further, go ahead.


> We could and maybe should investigate this more, though the magnitude of the
> regressions means I personally don't think it should block shipping. That is
> debatable, of course.

Yeah, we don't use/have thresholds for blocking - it's a judgment call which needs to take many factors into account, with the performance impact numbers being one of those factors.

We both agree that the impact is relatively clear and also not negligible, so I guess at the very least it's worth some further investigation.
(In reply to Avi Halachmi (:avih) from comment #12)
> (In reply to :Gijs Kruitbosch from comment #11)
> > > > dromaeo dom    : 4% improvement(?!) on linux 64
> > This says it's a 4.63% regression, not improvement, AFAICT (am I looking at
> > the wrong result?), but the noise level here is so ginormous (here the base
> > rev has a variation that is 3 times as high as the difference) that I
> > wouldn't trust it.
> 
> I'm seeing +4.68%, but dromaeo tests are higher-is-better, so supposedly
> it's an improvement.

Ah, I was not aware this was higher-is-better. That seems... weird, I guess? Not sure what would cause that improvement...

> > > > session restore: 4% regression on windows 7
> 
> > Assuming you mean win 6.2.9200 x86_64, that's windows 8 x64
> 
> Correct. I accidentally misread the platform.
> 
> > and the
> > variation in the "icon" changeset's results is such (3.5 times as big as the
> > difference between the two changesets) that I wouldn't trust that number.
> 
> What I see is that the average went up ~4%, and the noise level increased by
> a lot compared to the very low variation of base. So in my book that's a
> regression.

You are assuming that the sample size is adequate to establish the increase in noise (which would require a larger size than establishing a change in mean/median value), rather than the test being noisy all the time and this merely being "unlucky". http://graphs.mozilla.org/graph.html#tests=[[313,132,31]]&sel=none&displayrange=30&datatype=running seems to indicate that the Windows results are pretty noisy anyway (certainly significantly more noisy than the 6ms variation in the base cset).

What I guess makes this regression more likely to be real is that linux64 also regressed, albeit "only" 1%. I'll see if I can do some more trypushes later today.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs Kruitbosch from comment #13)
> > I'm seeing +4.68%, but dromaeo tests are higher-is-better, so supposedly
> > it's an improvement.
> 
> Ah, I was not aware this was higher-is-better. That seems... weird, I guess?
> Not sure what would cause that improvement...

Yeah, I also think it's a bit weird.

> You are assuming that the sample size is adequate to establish the increase
> in noise (which would require a larger size than establishing a change in
> mean/median value), rather than the test being noisy all the time and this
> merely being "unlucky".
> http://graphs.mozilla.org/graph.html#tests=[[313,132,
> 31]]&sel=none&displayrange=30&datatype=running seems to indicate that the
> Windows results are pretty noisy anyway (certainly significantly more noisy
> than the 6ms variation in the base cset).
> 
> What I guess makes this regression more likely to be real is that linux64
> also regressed, albeit "only" 1%. I'll see if I can do some more trypushes
> later today.

Well, the graph server shows results over different builds, so it's possible that the variance is between builds, while the try push shows the variance over 5 test runs of an identical build - so the try data isn't weightless.

But it's also correct that it's open to interpretation, and that more data could help.
Flags: needinfo?(jaws)
Summary: [meta] Performance regressions with Reader Mode → [meta] Performance regressions with Reader Mode / Reading List
Summary of a meeting just now on vidyo (Margaret, Gijs, Jared, Joel and myself):

- We're not in a terrible position. Regressions are visible but not huge.

- This bug will be used as meta for both reading mode and reading list perf impact.

- Parsing/icon has already gone through meaningful perf improvements, which probably explains why it now shows mostly minor regressions (the m-c changeset I pushed in comment 8 is from yesterday).

- Reading list has gone through some improvements already, but there's more stuff to handle (e.g. bug 1131415, bug 1152327).

- Most of the work on reading list/parsing is intended to be uplifted all the way to beta (38), and frequently.

- It's not yet clear how easy it would be to track the changes on m-c, but it's likely to be easier on aurora/beta once the patches get uplifted.

- fx-team will try to use try pushes to independently assess the perf impact/improvement of their patches, so that it's isolated from noise on m-c; this could help us understand issues better if/once they also show up on official trees.

- Most of the regressions are similar between pgo/non-pgo, but we do eventually ship pgo, and it is important to not ignore the pgo regressions.

That being said, it's probably easier to start with the more common regressions, which can be reproduced more easily than odd PGO regressions; hopefully improving the common regressions will also improve PGO performance.

Overall, things don't look too bad, and work is underway to handle these regressions.
Linux slaves have been busted so trypushing didn't seem useful until that's fixed... but also, I've forgotten what I was thinking to push.

I've been trying to repro a sessionrestore impact for reader mode on both my mac and windows machines locally and failing to see any meaningful difference. Also, passing --tpcycles 30 to talos doesn't seem to do anything, I just keep getting 10 cycles (which differs by 2-3ms from one run to the next, and disabling reader mode stays within those same bounds, so I can't see any real impact).
Flags: needinfo?(gijskruitbosch+bugs)
It must have been my misunderstanding, but I thought Jared owned the reading list issues. I just talked to him and he clarified that he doesn't own them or work on them (or on the issues mentioned in this bug).

Gavin, could you please nominate someone to own these issues (which currently appear to be mostly reading-list related) and decide what to do about them?
Flags: needinfo?(gavin.sharp)
(In reply to Avi Halachmi (:avih) from comment #17)
> It must have been my misunderstanding, but I thought Jared owned the reading
> list issues. I just talked to him and he clarified that he doesn't own them
> or work on them (or on the issues mentioned in this bug).

For clarification, I have been working on reading list, but not any parts that will meaningfully impact startup.

As Drew and Mark worked on the backend for reading list and reading list sync, I think it is proper that they investigate this.
Flags: needinfo?(mhammond)
Flags: needinfo?(gavin.sharp)
Flags: needinfo?(adw)
Depends on: 1147487
(In reply to Avi Halachmi (:avih) from comment #15)
> - Reading list has gone through some improvements already, but there's more
> stuff to handle (e.g. bug 1131415

I'd be *extremely* surprised if this is related at all.

> bug 1152327).

My money is on this, and it should land on central very soon (it's already on fx-team). Avi, can you see how that landing impacts results?
Flags: needinfo?(mhammond)
Flags: needinfo?(avihpit)
Flags: needinfo?(adw)
(In reply to Mark Hammond [:markh] from comment #19)
> Avi, can you see how that landing impacts results?

(In reply to Avi Halachmi (:avih) from comment #15)
> Summary of a meeting just now on vidyo (Margaret, Gijs, Jared, Joel and
> myself):
> ...
> - fx-team will try to use try pushes to independently assess the perf
> impact/improvement of their patches, so that it's isolated from noise on
> m-c; this could help us understand issues better if/once they also show up
> on official trees.

Please start by doing so yourself, and I'll help you if you get stuck.
Flags: needinfo?(avihpit)
(In reply to :Gijs Kruitbosch from comment #16)
> Linux slaves have been busted so trypushing didn't seem useful until that's
> fixed... but also, I've forgotten what I was thinking to push.
> 
> I've been trying to repro a sessionrestore impact for reader mode on both my
> mac and windows machines locally and failing to see any meaningful
> difference. Also, passing --tpcycles 30 to talos doesn't seem to do
> anything, I just keep getting 10 cycles (which differs by 2-3ms from one run
> to the next, and disabling reader mode stays within those same bounds, so I
> can't see any real impact).

Avi, can you help figure out what's going on here and/or to repro this locally?
Flags: needinfo?(avihpit)
session restore is not a tp test (agree that it is confusing), so instead of --tpcycles, you want --cycles.

Let me know if that helps.  As for the difference in data you see, I am not sure why that is the case :(  Maybe this is hardware dependent?
Flags: needinfo?(avihpit)
Talos email says that bug 1152327 may have contributed towards the following improvement:
Improvement: Fx-Team-Non-PGO - Session Restore no Auto Restore Test - Ubuntu HW 12.04 - 5.71% decrease
(In reply to Joel Maher (:jmaher) from comment #22)
> session restore is not a tp test (agree that it is confusing), so instead of
> --tpcycles, you want --cycles.
> 
> Let me know if that helps.  As for the difference in data you see, I am not
> sure why that is the case :(  Maybe this is hardware dependent?

I will look at this again later today or tomorrow. Note that what's mostly confusing is that, IIRC, the actual test output point that says "10" is prefixed "tpcycles", not "cycles". I'll look at this again and file followup bugs against talos where appropriate.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs Kruitbosch from comment #24)
> (In reply to Joel Maher (:jmaher) from comment #22)
> > session restore is not a tp test (agree that it is confusing), so instead of
> > --tpcycles, you want --cycles.
> > 
> > Let me know if that helps.  As for the difference in data you see, I am not
> > sure why that is the case :(  Maybe this is hardware dependent?
> 
> I will look at this again later today or tomorrow. Note that what's mostly
> confusing is that, IIRC, the actual test output point that says "10" is
> prefixed "tpcycles", not "cycles". I'll look at this again and file followup
> bugs against talos where appropriate.

I can confirm this does work to change the number of iterations, but then the output looks like e.g.:

INFO : TALOSDATA: [{"test_machine": {"platform": "x86_64", "osversion": "6.2.9200", "os": "win", "name": "qm-pxp01"}, "testrun": {"date": 1429179460, "suite": "sessionrestore", "options": {"responsiveness": false, "tpmozafterpaint": false, "tpchrome": true, "tppagecycles": 1, "tpcycles": 10, "tprender": false, "shutdown": false, "rss": false}}, "results": {"sessionrestore": [1526.0, 1440.0, 1426.0, 1450.0, 1425.0, 1446.0, 1437.0, 1449.0, 1427.0, 1440.0, 1453.0, 1455.0, 1430.0, 1435.0, 1432.0, 1451.0, 1433.0, 1439.0, 1443.0, 1442.0, 1441.0, 1437.0, 1441.0, 1448.0, 1447.0, 1435.0, 1439.0, 1440.0, 1438.0, 1439.0]}, "test_build": {"version": "40.0a1", "revision": "2f8060bb3f94", "id": "20150416095104", "branch": "", "name": "Firefox"}}]

Note in particular:

"tppagecycles": 1, "tpcycles": 10,

and 30 output values (because I passed --cycles 30).

I filed bug 1155121 for this.

Leaving the needinfo because I need to run the tests some more to see whether this actually lets me reproduce anything.

Note that it'd be helpful if the output included the mean/median values instead of leaving the calculation of those up to me... :-)
So I ran this again locally, and got this output:

Disabled (x50):

- 1440.96 (min: 1411, max: 1490 (excl. 1531 initial run, mean is: 1439.1224489795918))
Raw: [1531.0, 1443.0, 1428.0, 1430.0, 1436.0, 1451.0, 1475.0, 1490.0, 1431.0, 1452.0, 1439.0, 1435.0, 1431.0, 1448.0, 1427.0, 1433.0, 1439.0, 1432.0, 1433.0, 1439.0, 1437.0, 1448.0, 1429.0, 1439.0, 1446.0, 1441.0, 1436.0, 1451.0, 1435.0, 1449.0, 1431.0, 1438.0, 1463.0, 1441.0, 1451.0, 1438.0, 1445.0, 1445.0, 1422.0, 1440.0, 1433.0, 1440.0, 1414.0, 1436.0, 1428.0, 1441.0, 1411.0, 1432.0, 1427.0, 1438.0]

- 1436.24 (min: 1418, max: 1473 (excl. 1556 initial run, mean is: 1433.795918367347))
Raw: [1556.0, 1440.0, 1438.0, 1429.0, 1442.0, 1434.0, 1426.0, 1440.0, 1425.0, 1425.0, 1421.0, 1432.0, 1434.0, 1435.0, 1433.0, 1443.0, 1427.0, 1418.0, 1427.0, 1428.0, 1435.0, 1424.0, 1431.0, 1421.0, 1431.0, 1425.0, 1423.0, 1433.0, 1435.0, 1423.0, 1424.0, 1439.0, 1447.0, 1457.0, 1449.0, 1452.0, 1473.0, 1424.0, 1425.0, 1437.0, 1452.0, 1445.0, 1422.0, 1421.0, 1440.0, 1442.0, 1433.0, 1418.0, 1434.0, 1444.0]

- 1435.82 (min: 1413, max: 1532.0 (excl. 1572 initial run, mean is: 1433.0408163265306))
Raw: [1572.0, 1441.0, 1438.0, 1443.0, 1431.0, 1423.0, 1436.0, 1419.0, 1434.0, 1419.0, 1426.0, 1425.0, 1429.0, 1435.0, 1420.0, 1413.0, 1429.0, 1434.0, 1424.0, 1419.0, 1439.0, 1430.0, 1424.0, 1431.0, 1425.0, 1417.0, 1433.0, 1437.0, 1434.0, 1430.0, 1447.0, 1449.0, 1434.0, 1424.0, 1438.0, 1445.0, 1438.0, 1419.0, 1442.0, 1444.0, 1422.0, 1532.0, 1441.0, 1428.0, 1434.0, 1427.0, 1444.0, 1432.0, 1419.0, 1422.0]


Avg: 1437.67 (without initial outlier: 1435.32)


Enabled (x50):

- 1436.66 (min: 1416, max: 1479 (excl. 1534 initial run, mean is: 1434.6734693877552))
Raw: [1534.0, 1422.0, 1425.0, 1429.0, 1423.0, 1442.0, 1445.0, 1429.0, 1428.0, 1439.0, 1441.0, 1416.0, 1425.0, 1428.0, 1440.0, 1416.0, 1424.0, 1427.0, 1432.0, 1437.0, 1434.0, 1438.0, 1439.0, 1439.0, 1428.0, 1432.0, 1441.0, 1452.0, 1430.0, 1442.0, 1434.0, 1440.0, 1432.0, 1439.0, 1440.0, 1420.0, 1438.0, 1439.0, 1450.0, 1428.0, 1438.0, 1479.0, 1444.0, 1424.0, 1434.0, 1426.0, 1450.0, 1428.0, 1432.0, 1441.0]

- 1442.0 (min: 1419, max: 1544 (excl. 1550 initial run, mean is: 1439.795918367347))
Raw: [1550.0, 1434.0, 1419.0, 1457.0, 1439.0, 1441.0, 1433.0, 1440.0, 1439.0, 1429.0, 1429.0, 1435.0, 1427.0, 1436.0, 1432.0, 1439.0, 1438.0, 1439.0, 1424.0, 1441.0, 1433.0, 1544.0, 1427.0, 1429.0, 1444.0, 1440.0, 1426.0, 1439.0, 1453.0, 1439.0, 1434.0, 1433.0, 1455.0, 1436.0, 1426.0, 1445.0, 1441.0, 1439.0, 1438.0, 1442.0, 1431.0, 1444.0, 1442.0, 1454.0, 1442.0, 1437.0, 1444.0, 1452.0, 1424.0, 1446.0]

- 1444.66 (min: 1421, max: 1507 (excl. 1553 initial run, mean is: 1442.4489795918366))
Raw: [1553.0, 1429.0, 1498.0, 1422.0, 1432.0, 1433.0, 1439.0, 1439.0, 1421.0, 1439.0, 1439.0, 1449.0, 1444.0, 1434.0, 1437.0, 1434.0, 1472.0, 1450.0, 1444.0, 1443.0, 1427.0, 1425.0, 1439.0, 1445.0, 1507.0, 1435.0, 1424.0, 1438.0, 1439.0, 1458.0, 1431.0, 1439.0, 1443.0, 1436.0, 1498.0, 1443.0, 1439.0, 1432.0, 1430.0, 1436.0, 1442.0, 1459.0, 1449.0, 1447.0, 1438.0, 1443.0, 1426.0, 1433.0, 1439.0, 1442.0]

Avg: 1441.11 (without initial outlier: 1438.97)


Without doing detailed statistics on this (there is considerable overlap and a pretty significant amount of noise in the raw data), it seems like there might (on this machine) be a 3.5ms regression from turning reader mode on, which is approximately 0.3%.

I don't think this warrants further investigation, and unless further details come to light, I think we're effectively done here for the reader mode icon.

For reading list, see comment #23.

Unless there is further substantial, up-to-date data from the perf team indicating we're not out of the woods here, I don't think the engineering folks should spend more time on this for now, as the evidence points to this having been addressed sufficiently.
Flags: needinfo?(gijskruitbosch+bugs)
Ah, to be clear - comment #26 is output from running sessionrestore locally on my win8 machine, which was the remaining >2%-on-try regression for reader mode.
Keep in mind we are probably talking about PGO regressions on specific machines - although I agree that what you posted in comment #26 shows that whatever difference we have is very, very small, not 2%.
(In reply to Joel Maher (:jmaher) from comment #28)
> Keep in mind we are probably talking about PGO regressions on specific
> machines - although I agree that what you posted in comment #26 shows that
> whatever difference we have is very, very small, not 2%.

Earlier discussions with Avi seemed to indicate that we should worry about non-PGO first, and his try pushes were non-PGO. If there is data that shows PGO is still significantly impacted, we can reconsider not investigating further, but we would really need to see data indicating that that is the case.
(In reply to :Gijs Kruitbosch from comment #26)
> So I ran this again locally, and got this output: ...

Thanks for the investigation.

> For reading list, see comment #23 .

(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #23)
> Talos email says that bug 1152327 may have contributed towards the following
> improvement:
> Improvement: Fx-Team-Non-PGO - Session Restore no Auto Restore Test - Ubuntu
> HW 12.04 - 5.71% decrease

While this is promising, per comment 10 there were also page load regressions across the board. And also:

(In reply to Avi Halachmi (:avih) from comment #15)
> Summary of a meeting just now on vidyo (Margaret, Gijs, Jared, Joel and
> myself):
> ...
> - fx-team will try to use try pushes to independently assess the perf
> impact/improvement of their patches, so that it's isolated from noise on
> m-c; this could help us understand issues better if/once they also show up
> on official trees.
Update on the regressions from above:
4.77% win8 tp5o - beta; aurora was fixed, but regressed again with a later patch, which could be unrelated
http://graphs.mozilla.org/graph.html#tests=[[255,53,31],[255,64,31],[255,52,31],[255,63,31]]&sel=1422970123925.2354,1428001653337,163.9344262295082,200&displayrange=90&datatype=geo

2% linux32 ts_paint - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[83,53,33],[83,52,33]]&sel=1426712497787.5408,1429194521082.0498,704.9180327868853,934.4262295081967&displayrange=30&datatype=geo

3% linux64 tp5o private bytes - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[257,53,35],[257,52,35],[257,63,35],[257,64,35]]&sel=1426502992141.172,1429205383964,524590163.93442625,629508196.7213115&displayrange=90&datatype=geo

5.4% winxp tp5o - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[255,53,37],[255,52,37],[255,64,37],[255,63,37]]&sel=1426125991235.2617,1429205467623,155.7377049180328,209.01639344262296&displayrange=90&datatype=geo

3.9% linux32 tp5o - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[255,53,33],[255,52,33],[255,64,33],[255,63,33]]&sel=1425874786656.6345,1429199074974.7788,201.63934426229508,275.40983606557376&displayrange=90&datatype=geo

4.85% linux32 tp5o private bytes - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[257,53,33],[257,52,33],[257,64,33],[257,63,33]]&sel=1426529678893.651,1429192329660.2607,442622950.8196721,600000000&displayrange=90&datatype=geo

4.51% win8 svg-asap - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[281,53,31],[281,52,31],[281,64,31],[281,63,31]]&sel=1426261479273.9072,1429205664858,100.81967213114754,130.327868852459&displayrange=90&datatype=geo

20% svg-opacity - beta/aurora:
http://graphs.mozilla.org/graph.html#tests=[[225,52,33],[225,53,33]]&sel=1425617320503.7861,1429205748010,98.36065573770492,334.42622950819674&displayrange=90&datatype=geo

I have not seen any of these issues fixed on beta/aurora since the uplift 2+ weeks ago. :gijs, I know you have done a lot of work here, but that has been on session restore; what is the plan for the other regressions, which I have just reverified still exist?
Flags: needinfo?(gijskruitbosch+bugs)
I don't think this "reverifies" that these regressions are related to reading list. That would need to happen with new try pushes. For instance, just looking at the first one in your list, tp5o going back up on fx-team happened on a merge cset which included, e.g., OMTA being enabled: http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=973f507abcf0&tochange=fd2dfe2228b6

(the reading list change at the top of that pushlog is extremely unlikely to be related, as it is a 2-line correctness fix that would only reduce the amount of work done there, and there are no individual datapoints to include/exclude it, presumably because of PGO)


I do not have the time and energy available right now to go through every single graph link and dig into it. I think a more productive use of time would be renewing try pushes, with pgo if you prefer, and verifying whether disabling reading list + reader mode makes a serious difference in terms of perf.

Personally, I think at this point the responsibility for that is with the perf team, but if you lack time, I can do this tomorrow, in which case please needinfo me.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(jmaher)
Gijs, I am focusing specifically on aurora/beta, not trunk - OMTA was not on trunk; these regressions showed up during the uplift - none of them have been fixed; we can do try pushes, but the reality is we are going to ship code off of BETA with regressions and we should be able to justify those.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #33)
> Gijs, I am focusing specifically on aurora/beta, not trunk - OMTA was not on
> trunk; these regressions showed up during the uplift -

Then I do not understand your graph links which include fx-team and inbound, as well as zoom windows which are 10-15 days in the past, and so don't include fixes that have been made in that time.

> none of them have been fixed

Engineering has worked on the performance impact of these features and the fixes did show up on the graphs and in talos alerts. Not all of them have been uplifted, but I do not think it is fair to say "none of them have been fixed".

> we can do try pushes, but the reality is we are going to ship code
> off of BETA with regressions and we should be able to justify those.

I don't think this is the right conclusion at this point in time.
Gijs, since AFAIK the reading list will not make it into release, but reading mode will, I'd appreciate if you let us know as soon as the relevant patches are uplifted (to beta I guess?) and the tree is close to its intended release state, and we'll try to take some more measurements on beta.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Avi Halachmi (:avih) from comment #35)
> Gijs, since AFAIK the reading list will not make it into release, but
> reading mode will, I'd appreciate if you let us know as soon as the relevant
> patches are uplifted (to beta I guess?) and the tree is close to its
> intended release state, and we'll try to take some more measurements on beta.

Both reader mode and reading list are disabled for 38, and that has already happened, but that's now on mozilla-release, not mozilla-beta. In fact, it seems both are still enabled on the mozilla-beta repository. I don't know why that is and when that will change. Dolske?

Reader mode will be enabled for 38.0.5.

See also http://release.mozilla.org/38/schedule/2015/04/23/update-branch.html .

I don't know what "close to its intended release state" means in this context. We intend to keep improving reader mode until it is released, including on the mozilla-beta repository.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(dolske)
(In reply to :Gijs Kruitbosch from comment #36)
> I don't know what "close to its intended release state" means in this
> context.

Reading mode enabled and reading list disabled on a branch which will be the first to be released with reading mode enabled. So that's probably:

> Reader mode will be enabled for 38.0.5.

Once that branch has RL disabled and RM enabled, let us know and we'll try to collect performance data on this branch like comments 0-7 and comment 31 did.
(In reply to :Gijs Kruitbosch from comment #36)
> In fact, it
> seems both are still enabled on the mozilla-beta repository. I don't know
> why that is and when that will change. Dolske?

This seems to just have been a relman branch mixup. As you do. r?dolske patch in bug 1157702. Clearing needinfo here.
Flags: needinfo?(dolske)
(In reply to Avi Halachmi (:avih) from comment #37)
> (In reply to :Gijs Kruitbosch from comment #36)
> > I don't know what "close to its intended release state" means in this
> > context.
> 
> Reading mode enabled and reading list disabled on a branch which will be the
> first to be released with reading mode enabled. So that's probably:
> 
> > Reader mode will be enabled for 38.0.5.
> 
> Once that branch has RL disabled and RM enabled, let us know and we'll try
> to collect performance data on this branch like comments 0-7 and comment 31
> did.

bug 1157702 was fixed so this should be possible now. Don't know how many builds there have been since then, though, and if that's enough data to adequately compare.
The trypush messages say this already, but just to remove doubt, this was a PGO-only push.

https://compare-talos.paas.mozilla.org/?oldBranch=Try&oldRevs=3254619ee3ce&newBranch=Try&newRev=30fca006fba6&submit=true

(old: without reader mode on, new: with reader mode on)

Everything that's been retriggered there is a wash, with no significant ups or downs, including everything previously discussed (tpaint, ts_paint, tsvgr_opacity, sessionrestore, tp5o, tp5o private bytes). Going to mark this fixed.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Thanks for looking after this.
(In reply to :Gijs Kruitbosch from comment #39)
> > Once that branch has RL disabled and RM enabled, let us know and we'll try
> > to collect performance data on this branch like comments 0-7 and comment 31
> > did.
> 
> bug 1157702 was fixed so this should be possible now. Don't know how many
> builds there have been since then, though, and if that's enough data to
> adequately compare.

Joel, beta should be in feature freeze now WRT reading mode (though obviously some changes might still get uplifted).

Can we collect more recent numbers from beta? Hopefully most of the regressions from comment 31 should be fixed.
Flags: needinfo?(jmaher)
(In reply to Avi Halachmi (:avih) from comment #43)
> (In reply to :Gijs Kruitbosch from comment #39)
> > > Once that branch has RL disabled and RM enabled, let us know and we'll try
> > > to collect performance data on this branch like comments 0-7 and comment
> > > 31 did.
> > 
> > bug 1157702 was fixed so this should be possible now. Don't know how many
> > builds there have been since then, though, and if that's enough data to
> > adequately compare.
> 
> Joel, beta should be in feature freeze now WRT reading mode (though
> obviously some changes might still get uplifted).
> 
> Can we collect more recent numbers from beta? Hopefully most of the
> regressions from comment 31 should be fixed.

The try pushes I posted are from beta.
Doing a big set of retriggers; we will verify the regressions are cleaned up, etc.
Flags: needinfo?(jmaher)
There were 8 regressions posted above; some are fixed completely, some are partially fixed, some are not. I am focusing specifically on aurora/beta.

fixed:
linux32 tp5o private bytes:
http://graphs.mozilla.org/graph.html#tests=[[257,53,33],[257,52,33]]&sel=1425452869755.9182,1430327772652,442622950.8196721,560655737.704918&displayrange=90&datatype=geo

win8 svg asap:
http://graphs.mozilla.org/graph.html#tests=[[281,53,31],[281,52,31]]&sel=1425856748137.4805,1430327787077,105.73770491803279,132.78688524590163&displayrange=90&datatype=geo

linux32 svg opacity:
http://graphs.mozilla.org/graph.html#tests=[[225,53,33],[225,52,33]]&sel=none&displayrange=90&datatype=geo

win8 tp5o:
http://graphs.mozilla.org/graph.html#tests=[[255,53,31],[255,52,31]]&sel=1424892555800.0269,1430327589639,137.14285714285714,200&displayrange=90&datatype=geo

winxp tp5o:
http://graphs.mozilla.org/graph.html#tests=[[255,53,37],[255,52,37]]&sel=1424742125553.2112,1430327703018,150,206.89655172413794&displayrange=90&datatype=geo

linux 64 tp5o private bytes:
http://graphs.mozilla.org/graph.html#tests=[[257,53,35],[257,52,35]]&sel=1426340306135.4548,1430321035266.6304,472131147.5409836,642622950.8196721&displayrange=90&datatype=geo


partial fix:
linux32 ts_paint:
http://graphs.mozilla.org/graph.html#tests=[[83,53,33],[83,52,33]]&sel=1425766832437.1147,1430327631106,672.1311475409836,934.4262295081967&displayrange=90&datatype=geo

linux32 tp5o:
http://graphs.mozilla.org/graph.html#tests=[[255,53,33],[255,52,33]]&sel=1425296191208.2354,1430327720620,196.72131147540983,245.9016393442623&displayrange=90&datatype=geo

Overall, there could be another reason for these linux32 partial fixes.
Is there a summary of Reading Mode regressions on Beta somewhere?
*I mean a summary of the magnitudes