Closed
Bug 1188510
Opened 9 years ago
Closed 5 years ago
3-5% Linux*/Win* damp/tps regression on Mozilla-Inbound-Non-PGO on July 27, 2015 from push 1fbefe335299
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: jmaher, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: perf, regression, Whiteboard: [talos_regression])
Talos has detected a Firefox performance regression from commit 1fbefe335299: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=1fbefe335299 This is a list of regressions/improvements: https://treeherder.allizom.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=e039b166890f&newProject=mozilla-inbound&newRevision=1fbefe335299 To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#DAMP Reproducing and debugging the regression: If you would like to re-run this Talos test on a potential fix, use try with the following syntax: try: -b o -p linux,win64,win32 -u none -t g2 # add "mozharness: --spsProfile" to generate profile data To run the test locally and do a more in-depth investigation, first set up a local Talos environment: https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code Then run the following command from the directory where you set up Talos: talos --develop -e <path>/firefox -a damp Making a decision: As the patch author we need your feedback to help us handle this regression. *** Please let us know your plans by Friday, or the offending patch will be backed out! *** Our wiki page oulines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Comment 1•9 years ago
|
||
this was a checkin-needed bundle of patches landed by a sheriff. Ideally we can push each to try, maybe there is a chance that we can pick our best candidate or two? :bgrins, it looks like netmonitor suffered a lot: https://treeherder.allizom.org/perf.html#/comparesubtest?originalProject=mozilla-inbound&originalRevision=e039b166890f&newProject=mozilla-inbound&newRevision=1fbefe335299&originalSignature=e45b6c2d9826f236d63f5c810074472611535279&newSignature=e45b6c2d9826f236d63f5c810074472611535279 any thoughts on what in this range might be the culprit: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=1fbefe335299
Flags: needinfo?(bgrinstead)
Comment 2•9 years ago
|
||
I don't think this is bug 1187399 because that just adds something that isn't even used yet.
Comment 4•9 years ago
|
||
Bug 1181976 is just a string caption change so I'd be very surprise if it was it but I'll keep an eye out.
Comment 5•9 years ago
|
||
I would guess it is the bug 1148593 patch. There's also bug 1152893.
Reporter | ||
Comment 6•9 years ago
|
||
hg update 01675d584873: * https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f9b250f597d * https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=e039b166890f&newProject=try&newRevision=7f9b250f597d hg update e741b5a17b19: * https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a7783082a3d * https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=e039b166890f&newProject=try&newRevision=6a7783082a3d hg update c10dc4f9f3ec: * https://treeherder.mozilla.org/#/jobs?repo=try&revision=8d9faa85bbcf * https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=e039b166890f&newProject=try&newRevision=8d9faa85bbcf
Flags: needinfo?(bgrinstead)
Reporter | ||
Comment 7•9 years ago
|
||
the winner is revision 01675d584873 from bug 1148593!
No longer blocks: 1181976
Reporter | ||
Updated•9 years ago
|
Component: Talos → DOM
Product: Testing → Core
Reporter | ||
Comment 8•9 years ago
|
||
:tromey, it looks like one of your two patches is the winner. You can see the raw data of what is regressed by following links here: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=e039b166890f&newProject=try&newRevision=7f9b250f597d&hideMinorChanges=1 this appears to be all windows platforms and linux64, although windows 8 seems to be the largest damp regression and a higher tps regression. questions for you: * we should try to understand this and fix it as much as possible, could you weigh in here on your thoughts? * if we believe we can reduce/fix the regression, what is the general timeline for doing so?
Flags: needinfo?(ttromey)
Comment 9•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #8) > :tromey, it looks like one of your two patches is the winner. > > You can see the raw data of what is regressed by following links here: > https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla- > inbound&originalRevision=e039b166890f&newProject=try&newRevision=7f9b250f597d > &hideMinorChanges=1 > > this appears to be all windows platforms and linux64, although windows 8 > seems to be the largest damp regression and a higher tps regression. > > questions for you: > * we should try to understand this and fix it as much as possible, could you > weigh in here on your thoughts? > * if we believe we can reduce/fix the regression, what is the general > timeline for doing so? I believe that tromey will be unavailable for the remainder of the week. From what I understand, this regression is the same set of problems in Bug 1152893. Based on a conversation with fitzgen, here's what I gathered. There are two things going on that probably cause the regression: 1. Capturing the stack that will become the async stack for some later call * This got sped up by his work in Bug 1028418, if you are doing it a lot during the same JS run-to-completion. Which DAMP probably isn't doing, I'm not sure about TPS. 2. Trimming the length of the stack so that we don't get out-of-memory. * This is pretty expensive because it means you have to actually copy the n frames you want to keep around (rather than just hold a strong ref to the whole stack, and its async stack, and the async stack's async stack, etc) So we need to figure out how to fix part 2. Will leave the ni? open - I'm not sure of the possibility of coming up with a meaningful fix for that before the merge day on 8-10.
Blocks: async-stacks
Comment 10•9 years ago
|
||
FWIW IIUC then the async stacks pref is only set on Nightly, so it might not be too bad to let that commit ship with the uplift.
Comment 11•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #8) > questions for you: > * we should try to understand this and fix it as much as possible, could you > weigh in here on your thoughts? Brian was correct, this is bug 1152893. > * if we believe we can reduce/fix the regression, what is the general > timeline for doing so? Nick was correct that this is disabled on release. https://dxr.mozilla.org/mozilla-central/source/modules/libpref/init/all.js?from=all.js&case=true#1100 If the current state is unacceptable I would propose disabling the pref in the short term.
Flags: needinfo?(ttromey)
Comment 12•9 years ago
|
||
(In reply to Tom Tromey :tromey from comment #11) > If the current state is unacceptable I would propose disabling the pref in > the short term. It's not as much as unacceptable as asking for your opinion on this. E.g. what is the value from not disabling it in the short term? If there's a big value in keeping it enabled, then maybe we should.
Reporter | ||
Comment 13•9 years ago
|
||
:tromey, I think :avih was asking you for your advice on benefits of leaving it in vs disabling it temporarily.
Flags: needinfo?(ttromey)
Comment 14•9 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #13) > :tromey, I think :avih was asking you for your advice on benefits of leaving > it in vs disabling it temporarily. Sorry about that. (In reply to Avi Halachmi (:avih) from comment #12) > It's not as much as unacceptable as asking for your opinion on this. > > E.g. what is the value from not disabling it in the short term? If there's a > big value in keeping it enabled, then maybe we should. The benefit is that it gives more useful stack traces in some situations; in particular it can associate asynchronous events with their cause. This can be handy for debugging. Also, as Nick said, it is disabled on release. So I tend to think that the upsides outweigh the downsides.
Flags: needinfo?(ttromey)
Assignee | ||
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
Comment 15•5 years ago
|
||
Closing as a 4y old DAMP regression is probably not going to be picked up any time soon.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•