Closed
Bug 778718
Opened 12 years ago
Closed 10 years ago
30% Windows Ts regression since 1st March
Categories
(Firefox :: General, defect)
Tracking
()
People
(Reporter: emorley, Unassigned)
Details
(Keywords: perf, regression, Whiteboard: [ts][snappy])
Attachments
(4 files)
We appear to have gradually regressed Ts on Windows by ~30% since 1st March 2012, i.e.: http://graphs.mozilla.org/graph.html#tests=[[53,131,12]]&sel=none&displayrange=180&datatype=running (non-PGO and inbound, so as to have as few changesets coalesced between each data point as possible).
I've done some preliminary poking through the graph, but with coalescing, large merges and the fact that this is due to many smaller increases rather than one massive regression, I don't think this is going to be straightforward. I also can't remember off the top of my head what Talos changes landed in this timeframe that may have caused Ts baselines to change.
CCing release drivers and a few Talos-ish people; marking tracking given the size of the regression.
Comment 1•12 years ago
Adding Lawrence and Taras since this feels like something that would be covered by the snappy effort. If this is truly an iterative slow down, it's not clear that we should be tracking for a specific Firefox version, and definitely not the version currently stabilizing on Beta (15). Even tracking for 16/17 is a bit dubious, but we'll + for visibility.
Comment 2•12 years ago
I'm not sure I believe these numbers. Looks like the machines are iteratively slowing down :) Telemetry showed a similar regression, but it also shows a dip recently to previous levels https://metrics.mozilla.com/data/content/pentaho-cdf-dd/Render?solution=metrics2&path=%2Ftelemetry&file=telemetryEvolution.wcdf&bookmarkState={%22impl%22%3A%22client%22%2C%22params%22%3A{%22referenceDate%22%3A%222012-07-30%22%2C%22appNameParameter%22%3A%22[Application].[Firefox]%22%2C%22osParameter%22%3A%22[OS].[WINNT]%22%2C%22channelParameter%22%3A%22[Channel].[nightly]%22%2C%22reasonParameter%22%3A%22[Reason].[idle-daily]%22%2C%22histogramParameter%22%3A%22[Histogram].[SIMPLE_MEASURES_FIRSTPAINT]%22%2C%22histogramPopupTools%22%3A%22%22%2C%22duplicateHistogram%22%3A%22%23duplicateHistogram%22%2C%22medianButtonParam%22%3A1%2C%22scatterChart%22%3A%22%22}}
Comment 3•12 years ago
Wouldn't slower machines be less likely to be idle? Which would mean newer data might be skewed in telemetry. In the perfstats, I make out three distinct jumps with some gradual rise. Mar 6, Apr 27, and then the stair step run up starting around Jul 16th.
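One way to make the "distinct jumps" in a Ts series explicit is to compare the mean of a few runs before each data point with the mean of a few runs after it. The following is only a minimal sketch: the function name `find_steps`, the window size and the 3% threshold are all assumptions for illustration, not anything Talos or the graph server provides.

```python
from statistics import mean


def find_steps(values, window=5, threshold=0.03):
    """Return indices where a Ts series appears to step upwards.

    For each index i, the mean of the `window` points before i is compared
    with the mean of the `window` points starting at i. Indices whose
    relative increase exceeds `threshold` are grouped into contiguous runs,
    and the index with the largest increase in each run is reported.
    `window` and `threshold` are illustrative defaults, not Talos settings.
    """
    shifts = {}
    for i in range(window, len(values) - window + 1):
        before = mean(values[i - window:i])
        after = mean(values[i:i + window])
        if before > 0:
            shifts[i] = (after - before) / before

    steps, run = [], []
    for i in sorted(shifts):
        if shifts[i] > threshold:
            run.append(i)
        elif run:
            # Close the current run and keep its sharpest point.
            steps.append(max(run, key=shifts.get))
            run = []
    if run:
        steps.append(max(run, key=shifts.get))
    return steps
```

A slow, gradual rise that never exceeds the threshold within one window would not be flagged by this sketch, which is one reason a regression made of many small increases is hard to pin down.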
Reporter | ||
Updated•12 years ago
Whiteboard: [ts]
Comment 4•12 years ago
(In reply to Jim Mathies [:jimm] from comment #3)
> Wouldn't slower machines be less likely to be idle? Which would mean newer data might be skewed in telemetry.
This would result in other measures being significantly better in newer telemetry too.
> In the perfstats, I make out three distinct jumps with some gradual rise. Mar 6, Apr 27, and then the stair step run up starting around Jul 16th.
Now that I think about it, ts is more focused on hot startups, so this could get lost in telemetry data due to noise from cold startups.
Whiteboard: [ts] → [ts][snappy]
Comment 5•12 years ago
FWIW, the jump-and-fall at the end of July is most likely from bug 778855.
Comment 6•12 years ago
Sounds like we don't fully trust the data, and Taras let us know that there isn't much work here that we'll be able to uplift into branches prior to release. No need to track for release in that case.
Comment 7•12 years ago
We should be able to take a build from March and one from today, run Talos locally on a desktop machine and see a similar difference. Ts is a very easy test to run.
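Once both builds have been run locally, the comparison itself is simple arithmetic on the reported startup times. The sketch below assumes the raw Ts samples (in milliseconds) have already been collected from each build by hand; the function name and the choice to drop the first sample as a likely cold start are assumptions, not a description of how Talos itself aggregates results.

```python
from statistics import median


def ts_regression_percent(old_samples_ms, new_samples_ms, skip_first=1):
    """Percent change in median startup time from the old to the new build.

    `skip_first` drops the leading samples of each run, on the assumption
    that the first launch is a cold start and not representative.
    A positive result means the new build starts more slowly.
    """
    old = median(old_samples_ms[skip_first:])
    new = median(new_samples_ms[skip_first:])
    return (new - old) / old * 100.0
```

A result near 30 from the March and current builds on the same machine would support the graph server numbers; a result near 0 would point towards the test machines or the harness instead.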
Comment 8•12 years ago
(In reply to comment #7)
> We should be able to take a build from March and one from today, run talos locally on a local desktop and see a similar difference. TS is a very easy to run test.
It is also very easy to profile. I don't agree with the assertion that we cannot do anything about this at all. Has anybody tried yet?
Comment 9•12 years ago
Taras asked me to look into this earlier this week. I'll try to get builds and play them off against each other.
Assignee: nobody → nfroyd
Comment 10•12 years ago
Setting the tracking flags back on until someone says why comment 6 is correct.
Comment 11•12 years ago
(In reply to Ehsan Akhgari [:ehsan] from comment #10)
> Setting the tracking flags back on until someone says why comment 6 is correct.
This was based upon Comment 2 and follow up in email with Taras. Glad to see we now think we can make some gains here.
Comment 12•12 years ago
This bug hasn't become actionable and we're a couple of weeks from release, so untracking for FF16 given that.
Comment 13•12 years ago
We're just over a week away from merging 17 to Beta channel. Nathan can you look into those builds and see if this bug can become actionable for 17's release?
Comment 14•12 years ago
(In reply to Lukas Blakk [:lsblakk] from comment #13)
> We're just over a week away from merging 17 to Beta channel. Nathan can you look into those builds and see if this bug can become actionable for 17's release?
I am uncertain of how much can actually be done here. I've been looking at a smaller startup regression that happened in the FF 15 timeframe (bug 792939) and it takes a couple of days to analyze regressions over a much smaller range of changes. If you want something before it goes to Beta, I'd say that's a very very tall task.
Updated•12 years ago
Comment 15•11 years ago
I couldn't reproduce this issue on Latest Nightly (2013-01-28) and FF 19b3 on Windows 7 x64. Can anyone still reproduce this issue on Latest builds of Beta, Aurora or Nightly?
Comment 16•11 years ago
(In reply to comment #15)
> I couldn't reproduce this issue on Latest Nightly (2013-01-28) and FF 19b3 on Windows 7 x64.
How did you try to reproduce this?
Comment 17•11 years ago
(In reply to :Ehsan Akhgari (Away 2/7-2/15) from comment #16)
> How did you try to reproduce this?
I compared telemetry histograms between builds from 1st March 2012 and 2013-01-29, and I also created a telemetry metric using the URL from comment 2 for all data after 1st March 2012. In both cases the differences are similar. I don't know exactly how to test this using Talos on Windows; in fact, I don't know exactly which steps to follow, but I can try using: https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code Can you help me with that?
Comment 18•11 years ago
That link to "running locally - source code" should tell you all you need for running Ts.
Comment 19•11 years ago
(In reply to Joel Maher (:jmaher) from comment #18)
> that link to "running locally - source code", should tell you all you need for running Ts.
What do I need installed on Windows besides Python, Mercurial and MinGW so that I can complete all the steps from the URL mentioned in comment 17?
Comment 20•11 years ago
you need mercurial, python, pywin32 package and that should work. The tests run in a standard windows prompt, not inside Mingw or some other unix'ish shell. What problems are you seeing? I can help you on irc or if you reply with what part of the instructions is not clear or failing.
Comment 21•11 years ago
I tested this issue using the Talos tool on FF 15b2, Nightly 17a1 from the same date as FF 15b2, and FF 19b6. The attachment shows the results I got.
Comment 22•11 years ago
(In reply to Joel Maher (:jmaher) from comment #20) Thanks for helping me in working with talos.
Comment 23•11 years ago
Sorry for the wrong attachment. These are the correct ones:
Comment 24•11 years ago
Comment 25•11 years ago
Comment 26•11 years ago
> you need mercurial, python, pywin32 package and that should work.
(talos's setup.py *should* install pywin32 appropriately)
Comment 27•11 years ago
The difference in the numbers between 17 and 19 is very noticeable. There could be issues prior to 17, but with these posted numbers the difference is large enough that it is worthwhile to look into the issue more.
Comment 28•11 years ago
Joel, is there any way to automatically find the build that caused the regression using Talos? The only way I know to find a regression range is using mozregression and hg bisect.
Comment 29•11 years ago
In this case, as I can see from comment 27, I can use the last good FF 17 and the first bad FF 19 as the edges of the range.
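Tools such as mozregression and hg bisect narrow a range like this with a binary search between a known-good and a known-bad build. The idea can be sketched as follows; `first_bad_build` and the `is_bad` callback are hypothetical names, and in practice `is_bad` would have to run Ts on the build and compare the result against a chosen threshold.

```python
def first_bad_build(builds, is_bad):
    """Binary-search an ordered list of builds for the first bad one.

    `builds` must be ordered oldest to newest, with builds[0] known good
    and builds[-1] known bad. `is_bad` is a caller-supplied function
    (hypothetical here) that measures a build and returns True if it
    shows the regression.
    """
    lo, hi = 0, len(builds) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]
```

This only works cleanly when there is a single point where builds go from good to bad. With a regression made of many small increases, as described in the original report, each search finds at most one of the contributing changes, and the good/bad threshold has to be chosen with care.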
Updated•11 years ago
status-firefox19:
--- → affected
Comment 30•11 years ago
Should the regression range be narrowed? As Joel said in comment 27, the numbers between FF 17 and FF 19 are very noticeable. If it should be, how can I narrow it?
Flags: needinfo?
Comment 31•10 years ago
I think at this point, we have looked into this bug as much as reasonably possible- I vote to close this and focus on current regressions and fixes/enhancements to other parts of the code!
Flags: needinfo?
Comment 32•10 years ago
(In reply to Joel Maher (:jmaher) from comment #31)
> I think at this point, we have looked into this bug as much as reasonably possible- I vote to close this and focus on current regressions and fixes/enhancements to other parts of the code!
WFM!
Assignee: nfroyd → nobody
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → INCOMPLETE
Updated•9 years ago
Keywords: regressionwindow-wanted