Closed Bug 778718 Opened 13 years ago Closed 11 years ago

30% Windows Ts regression since 1st March

Tracking

Status:

RESOLVED INCOMPLETE

Tracking Flags:

Tracking

Status

firefox15

---

firefox16

---

firefox17

---

firefox19

---

affected

People

(Reporter: emorley, Unassigned)

Details

(Keywords: perf, regression, Whiteboard: [ts][snappy])

Attachments

(4 files)

Talos Results FF15b2 Nightly17a1 and FF19b6 12 years ago Mihai Morar, (:MihaiMorar) 881 bytes, text/plain
Talos Results FF15b2 12 years ago Mihai Morar, (:MihaiMorar) 1.51 KB, text/plain
Talos Results Nightly 17.a1 12 years ago Mihai Morar, (:MihaiMorar) 1.51 KB, text/plain
Talos Results FF19b6 12 years ago Mihai Morar, (:MihaiMorar) 1.51 KB, text/plain

Ed Morley [:emorley]

Reporter

Description

•

13 years ago

We appear to have gradually regressed Ts on Windows by ~30% since 1st March 2012, i.e.:
http://graphs.mozilla.org/graph.html#tests=[[53,131,12]]&sel=none&displayrange=180&datatype=running
(Non-PGO and inbound, so as to give as few changesets coalesced between each data point as possible.)

I've done some preliminary poking through the graph, but with coalescing, large merges and the fact that this is due to many smaller increases rather than one massive regression, I don't think this is going to be straightforward.

I also can't remember off the top of my head what talos changes might have landed in this timeframe that may have caused Ts baselines to change.

CCing release drivers and a few talos-ish people; marking tracking given the size of the regression.

Alex Keybl [:akeybl]

Comment 1

•

13 years ago

Adding Lawrence and Taras since this feels like something that would be covered by the snappy effort. If this is truly an iterative slow down, it's not clear that we should be tracking for a specific Firefox version, and definitely not the version currently stabilizing on Beta (15). Even tracking for 16/17 is a bit dubious, but we'll + for visibility.

tracking-firefox15: ? → -

tracking-firefox16: ? → +

tracking-firefox17: ? → +

(dormant account)

Comment 2

•

13 years ago

I'm not sure I believe these numbers. Looks like the machines are iteratively slowing down :) Telemetry showed a similar regression, but it also shows a dip recently to previous levels https://metrics.mozilla.com/data/content/pentaho-cdf-dd/Render?solution=metrics2&path=%2Ftelemetry&file=telemetryEvolution.wcdf&bookmarkState={%22impl%22%3A%22client%22%2C%22params%22%3A{%22referenceDate%22%3A%222012-07-30%22%2C%22appNameParameter%22%3A%22[Application].[Firefox]%22%2C%22osParameter%22%3A%22[OS].[WINNT]%22%2C%22channelParameter%22%3A%22[Channel].[nightly]%22%2C%22reasonParameter%22%3A%22[Reason].[idle-daily]%22%2C%22histogramParameter%22%3A%22[Histogram].[SIMPLE_MEASURES_FIRSTPAINT]%22%2C%22histogramPopupTools%22%3A%22%22%2C%22duplicateHistogram%22%3A%22%23duplicateHistogram%22%2C%22medianButtonParam%22%3A1%2C%22scatterChart%22%3A%22%22}}

Jim Mathies [:jimm]

Comment 3

•

13 years ago

Wouldn't slower machines be less likely to be idle? Which would mean newer data might be skewed in telemetry. In the perfstats, I make out three distinct jumps with some gradual rise. Mar 6, Apr 27, and then the stair step run up starting around Jul 16th.
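The "distinct jumps" read off the graph above could also be located mechanically. The following is a minimal sketch, not the graph server's method: it compares the mean of a few points before and after each index and reports the indices where the relative increase peaks. The function name `find_steps`, the window size, the threshold and the sample series are all assumptions made for illustration; the numbers are not real Ts data.

```python
from statistics import mean

def find_steps(values, window=3, threshold=0.05):
    """Return indices where the series steps up.

    For each index i, compare the mean of the `window` points before i
    with the mean of the `window` points starting at i. An index is
    reported when that relative increase exceeds `threshold` and is a
    local maximum among its neighbouring indices.
    """
    candidates = []
    for i in range(window, len(values) - window + 1):
        before = mean(values[i - window:i])
        after = mean(values[i:i + window])
        candidates.append((i, (after - before) / before))

    steps = []
    for k, (i, change) in enumerate(candidates):
        prev_change = candidates[k - 1][1] if k > 0 else float("-inf")
        next_change = candidates[k + 1][1] if k + 1 < len(candidates) else float("-inf")
        if change > threshold and change > prev_change and change >= next_change:
            steps.append(i)
    return steps

# Hypothetical Ts series (ms) with two step increases; not real data.
series = [100.0] * 5 + [110.0] * 5 + [125.0] * 5
print(find_steps(series))
```

A gradual rise made of many small increases, as described in the original report, would mostly stay under the threshold, so a sketch like this only helps with the distinct jumps.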

Ed Morley [:emorley]

Reporter

Updated

•

13 years ago

Whiteboard: [ts]

(dormant account)

Comment 4

•

13 years ago

(In reply to Jim Mathies [:jimm] from comment #3)
> Wouldn't slower machines be less likely to be idle? Which would mean newer
> data might be skewed in telemetry.

This would result in other measures being significantly better in newer telemetry too.

> In the perfstats, I make out three distinct jumps with some gradual rise.
> Mar 6, Apr 27, and then the stair step run up starting around Jul 16th.

Now that I think about it, ts is more focused on hot startups, so this could get lost in telemetry data due to noise from cold startups.

Whiteboard: [ts] → [ts][snappy]

Justin Dolske [:Dolske]

Comment 5

•

13 years ago

FWIW, the jump-and-fall at the end of July is most likely from bug 778855.

Alex Keybl [:akeybl]

Comment 6

•

13 years ago

Sounds like we don't fully trust the data, and Taras let us know that there isn't much work here that we'll be able to uplift into branches prior to release. No need to track for release in that case.

tracking-firefox16: + → -

tracking-firefox17: + → -

Joel Maher ( :jmaher ) (UTC -8)

Comment 7

•

13 years ago

We should be able to take a build from March and one from today, run talos locally on a local desktop and see a similar difference. TS is a very easy to run test.
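Once both builds have been run locally, the comparison itself is simple arithmetic. A minimal sketch, assuming the startup times from each run have been collected into lists by hand; the function name `percent_change` and the sample values are hypothetical and are not taken from any Talos output:

```python
from statistics import median

def percent_change(baseline, candidate):
    """Percent change of the candidate median relative to the baseline median."""
    base = median(baseline)
    cand = median(candidate)
    return (cand - base) / base * 100.0

# Hypothetical startup times in milliseconds; not real measurements.
march_build = [500.0, 510.0, 495.0, 505.0, 500.0]
current_build = [650.0, 655.0, 645.0, 660.0, 650.0]

change = percent_change(march_build, current_build)
print(f"Ts changed by {change:.1f}%")  # prints "Ts changed by 30.0%"
```

The median is used here rather than the mean so that a single slow outlier run does not dominate the comparison; that is a choice of this sketch, not something the thread prescribes.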

(no longer active)

Comment 8

•

13 years ago

(In reply to comment #7)
> We should be able to take a build from March and one from today, run talos
> locally on a local desktop and see a similar difference. TS is a very easy to
> run test.

It is also very easy to profile. I don't agree with the assertion that we cannot do anything about this at all. Has anybody tried yet?

Nathan Froyd [:froydnj]

Comment 9

•

13 years ago

Taras asked me to look into this earlier this week. I'll try to get builds and play them off against each other.

Assignee: nobody → nfroyd

(no longer active)

Comment 10

•

13 years ago

Setting the tracking flags back on until someone says why comment 6 is correct.

tracking-firefox16: - → +

tracking-firefox17: - → +

Alex Keybl [:akeybl]

Comment 11

•

13 years ago

(In reply to Ehsan Akhgari [:ehsan] from comment #10)
> Setting the tracking flags back on until someone says why comment 6 is
> correct.

This was based upon Comment 2 and follow up in email with Taras. Glad to see we now think we can make some gains here.

Alex Keybl [:akeybl]

Comment 12

•

13 years ago

This bug hasn't become actionable and we're a couple of weeks from release, so untracking for FF16 given that.

tracking-firefox16: + → -

Lukas Blakk [:lsblakk] use ?needinfo

Comment 13

•

13 years ago

We're just over a week away from merging 17 to Beta channel. Nathan can you look into those builds and see if this bug can become actionable for 17's release?

Nathan Froyd [:froydnj]

Comment 14

•

13 years ago

(In reply to Lukas Blakk [:lsblakk] from comment #13)
> We're just over a week away from merging 17 to Beta channel. Nathan can you
> look into those builds and see if this bug can become actionable for 17's
> release?

I am uncertain of how much can actually be done here. I've been looking at a smaller startup regression that happened in the FF 15 timeframe (bug 792939) and it takes a couple of days to analyze regressions over a much smaller range of changes. If you want something before it goes to Beta, I'd say that's a very very tall task.

Alex Keybl [:akeybl]

Updated

•

13 years ago

tracking-firefox17: + → -

Mihai Morar, (:MihaiMorar)

Comment 15

•

13 years ago

I couldn't reproduce this issue on Latest Nightly (2013-01-28) and FF 19b3 on Windows 7 x64. Can anyone still reproduce this issue on Latest builds of Beta, Aurora or Nightly?

(no longer active)

Comment 16

•

13 years ago

(In reply to comment #15)
> I couldn't reproduce this issue on Latest Nightly (2013-01-28) and FF 19b3 on
> Windows 7 x64.

How did you try to reproduce this?

Mihai Morar, (:MihaiMorar)

Comment 17

•

12 years ago

(In reply to :Ehsan Akhgari (Away 2/7-2/15) from comment #16)
> How did you try to reproduce this?

I compared telemetry histograms between the build from 1st March 2012 and the one from 2013-01-29, and I also created a telemetry metric using the URL from comment 2 for all data after 1st March 2012. In both cases the differences are similar.

I don't know exactly how to test this using Talos on Windows; in fact, I don't know exactly which steps to follow, but I can try using:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Can you help me with that?

Joel Maher ( :jmaher ) (UTC -8)

Comment 18

•

12 years ago

that link to "running locally - source code", should tell you all you need for running Ts.

Mihai Morar, (:MihaiMorar)

Comment 19

•

12 years ago

(In reply to Joel Maher (:jmaher) from comment #18)
> that link to "running locally - source code", should tell you all you need
> for running Ts.

What do I need installed on Windows besides Python, Mercurial and Mingw so I can complete all the steps from the URL mentioned in comment 17?

Joel Maher ( :jmaher ) (UTC -8)

Comment 20

•

12 years ago

you need mercurial, python, pywin32 package and that should work. The tests run in a standard windows prompt, not inside Mingw or some other unix'ish shell. What problems are you seeing? I can help you on irc or if you reply with what part of the instructions is not clear or failing.

Mihai Morar, (:MihaiMorar)

Comment 21

•

12 years ago

Attached file Talos Results FF15b2 Nightly17a1 and FF19b6

I tested this issue using the Talos tool on FF 15b2, on Nightly 17a1 from the same date as FF 15b2, and on FF 19b6. The attachment contains the results I got.

Mihai Morar, (:MihaiMorar)

Comment 22

•

12 years ago

(In reply to Joel Maher (:jmaher) from comment #20) Thanks for helping me in working with talos.

Mihai Morar, (:MihaiMorar)

Comment 23

•

12 years ago

Attached file Talos Results FF15b2

Sorry for the wrong attachment. These are the correct ones:

Mihai Morar, (:MihaiMorar)

Comment 24

•

12 years ago

Attached file Talos Results Nightly 17.a1

Mihai Morar, (:MihaiMorar)

Comment 25

•

12 years ago

Attached file Talos Results FF19b6

Jeff Hammel

Comment 26

•

12 years ago

> you need mercurial, python, pywin32 package and that should work.

(talos's setup.py *should* install pywin32 appropriately)

Joel Maher ( :jmaher ) (UTC -8)

Comment 27

•

12 years ago

The difference in the numbers between 17 and 19 is very noticeable. There could be issues prior to 17, but the difference in these posted numbers is large enough that it is worthwhile to look into the issue more.

Mihai Morar, (:MihaiMorar)

Comment 28

•

12 years ago

Joel, is there any way to automatically find the build that caused the regression using Talos? The only way I know to find a regression range is using Mozregression and hg bisect.

Mihai Morar, (:MihaiMorar)

Comment 29

•

12 years ago

In this case, as I can see from Comment 27, I can use the last good FF 17 and the first bad FF 19 as the edges of the range.
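Given a known-good and a known-bad edge, narrowing the range is a binary search over the builds in between. The sketch below illustrates only that idea and is not how Mozregression or hg bisect are implemented. The build labels, the timings and the 550 ms cutoff are hypothetical; in practice `is_bad` would run Ts on the given build and compare the result against a chosen threshold.

```python
def first_bad_build(builds, is_bad):
    """Binary search for the first bad build.

    Assumes `builds` is ordered oldest to newest, builds[0] is good,
    builds[-1] is bad, and every build after the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return builds[hi]

# Hypothetical build labels and Ts timings (ms); not real measurements.
builds = ["2012-08-01", "2012-09-01", "2012-10-01",
          "2012-11-01", "2012-12-01", "2013-01-01"]
ts_ms = {"2012-08-01": 500.0, "2012-09-01": 505.0, "2012-10-01": 510.0,
         "2012-11-01": 600.0, "2012-12-01": 605.0, "2013-01-01": 610.0}

def is_bad(build):
    # Stand-in for "run Ts on this build and compare to a threshold".
    return ts_ms[build] > 550.0

print(first_bad_build(builds, is_bad))  # prints "2012-11-01"
```

Note that bisection assumes a single good-to-bad transition. If the slowdown is the sum of many small increases, as the original report suggests, the search only finds where the chosen threshold is first crossed.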

Mihai Morar, (:MihaiMorar)

Updated

•

12 years ago

status-firefox19: --- → affected

Mihai Morar, (:MihaiMorar)

Comment 30

•

12 years ago

Should the regression range be narrowed? As Joel said in Comment 27, the difference in the numbers between FF 17 and FF 19 is very noticeable. If it should be, how can I narrow it?

Flags: needinfo?

Joel Maher ( :jmaher ) (UTC -8)

Comment 31

•

11 years ago

I think at this point, we have looked into this bug as much as reasonably possible- I vote to close this and focus on current regressions and fixes/enhancements to other parts of the code!

Flags: needinfo?

Nathan Froyd [:froydnj]

Comment 32

•

11 years ago

(In reply to Joel Maher (:jmaher) from comment #31)
> I think at this point, we have looked into this bug as much as reasonably
> possible- I vote to close this and focus on current regressions and
> fixes/enhancements to other parts of the code!

WFM!

Assignee: nfroyd → nobody

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → INCOMPLETE

Ryan VanderMeulen [:RyanVM]

Updated

•

10 years ago

Keywords: regressionwindow-wanted
