Closed Bug 993084 Opened 11 years ago Closed 11 years ago

14% regression in ts paint on linux64 seen on fx-team

Tracking

(firefox30 unaffected, firefox31- verified)

Status:

VERIFIED FIXED

Milestone:

Firefox 31

Tracking Flags:

firefox30: tracking ---, status unaffected

firefox31: status verified

People

(Reporter: jmaher, Assigned: gfritzsche)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression] p=5 s=it-31c-30a-29b.3 [qa-])

Attachments

(5 files)

Attempt lazy loading, rev. 1 11 years ago Benjamin Smedberg 2.04 KB, patch	gps : review+	Details \| Diff \| Splinter Review
Skip Experiments.jsm load if pref is off 11 years ago Georg Fritzsche [:gfritzsche] 2.51 KB, patch	benjamin : review+	Details \| Diff \| Splinter Review
Delay initializing Experiments if there is no experiment running 11 years ago Georg Fritzsche [:gfritzsche] 8.36 KB, patch	benjamin : review+	Details \| Diff \| Splinter Review
Consolidate pref name constants for tests for next patch 11 years ago Georg Fritzsche [:gfritzsche] 8.59 KB, patch	benjamin : review+	Details \| Diff \| Splinter Review
Fix missing initialization in experiments healthreporter test 11 years ago Georg Fritzsche [:gfritzsche] 1.97 KB, patch	benjamin : review+	Details \| Diff \| Splinter Review

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Description

•

11 years ago

Here is a view of the graph: http://graphs.mozilla.org/graph.html#tests=[[83,132,35]]&sel=none&displayrange=7&datatype=running

Here is a view on tbpl, where I did some retriggers: https://tbpl.mozilla.org/?tree=Fx-Team&fromchange=10a99a32304b&tochange=d22244fa46b4&jobname=Ubuntu%20HW%2012.04%20x64%20fx-team%20talos%20other

Unfortunately, a few pushes didn't run tests because of a bad push before them, so we are missing details on 5 changesets.

If anybody has ideas about why linux64 would take a large hit on ts (launching the browser), I would appreciate learning more about it.

Mike Conley (:mconley) (:⚙️) (PTO - Sept 15 - 19)

Comment 1

•

11 years ago

Bug 989761 is not the culprit - it only messed with Windows Classic CSS.

Mike Conley (:mconley) (:⚙️) (PTO - Sept 15 - 19)

Comment 2

•

11 years ago

Looking at the graph, it looks like the regression was introduced somewhere on fx-team between 57a393d69f32 and ddc22f087bec. http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=57a393d69f32&tochange=ddc22f087bec

Mike Conley (:mconley) (:⚙️) (PTO - Sept 15 - 19)

Comment 3

•

11 years ago

I'd also suspect that bug 989609 is not the culprit, since that patch was backed out.

Mike Conley (:mconley) (:⚙️) (PTO - Sept 15 - 19)

Comment 4

•

11 years ago

(In reply to Joel Maher (:jmaher) from comment #0)
> here is a view of the graph:
> http://graphs.mozilla.org/graph.html#tests=[[83,132,35]]&sel=none&displayrange=7&datatype=running
>
> here is a view on tbpl, where I did some retriggers:
> https://tbpl.mozilla.org/?tree=Fx-Team&fromchange=10a99a32304b&tochange=d22244fa46b4&jobname=Ubuntu%20HW%2012.04%20x64%20fx-team%20talos%20other
>
> Unfortunately there were a few pushes that didn't run tests due to a bad
> push prior. So we are missing details on 5 changesets.
>
> If anybody has ideas of why linux 64 would have a large hit on ts (launching
> the browser), I would appreciate learning more about it.

Nothing else really jumps out at me. :/ Bisecting might be our only choice here, unless somebody else has another suggestion.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

11 years ago

No longer blocks: 989609, 989761

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 5

•

11 years ago

That leaves a few suspects. I will wait until tomorrow to see if the original patch authors have ideas about what the problem might be. After that, a series of try pushes with individual backouts sounds like a good idea :)
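The backout-and-retest plan amounts to a search over the suspect range. Below is a minimal sketch of that idea, assuming a hypothetical `isRegressed(changeset)` callback that reports whether a build at that changeset shows the slower ts_paint. The changeset names are placeholders, not entries from the real pushlog, and the sketch assumes every changeset in the range can be tested, which was not the case for the 5 changesets with missing data.

```javascript
// Hypothetical sketch: find the first changeset at which a regression appears.
// `changesets` is ordered oldest to newest; the first entry is assumed good
// and the last entry is assumed bad.
function findFirstBad(changesets, isRegressed) {
  let lo = 0;                      // index of the last known-good changeset
  let hi = changesets.length - 1;  // index of the first known-bad changeset
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (isRegressed(changesets[mid])) {
      hi = mid;  // regression is at mid or earlier
    } else {
      lo = mid;  // regression landed after mid
    }
  }
  return changesets[hi];
}

// Placeholder data: pretend the regression landed with "cset-d".
const range = ["cset-a", "cset-b", "cset-c", "cset-d", "cset-e", "cset-f"];
const firstBad = findFirstBad(
  range,
  (c) => range.indexOf(c) >= range.indexOf("cset-d")
);
console.log(firstBad);
```

With only a handful of suspect pushes, testing one backout per suspect in parallel on try is a reasonable alternative to a strict binary search.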

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

11 years ago

Summary: 14% regression in ts paint on linux64 seen on fx-team → 14% regression in ts paint on linux 32/64 seen on fx-team

Benjamin Smedberg

Comment 6

•

11 years ago

Does anyone know if there are configs for the talos run outside of mozilla-central which might affect whether telemetry is enabled during this test?

The two most-likely pushes in this range are:

Benjamin Smedberg: Bug 986582 - Get rid of the toolkit.telemetry.enabledPreRelease pref and make the toolkit.telemetry.enabled pref do the right thing for beta users who are testing a final release build, r=rnewman

Benjamin Smedberg: Bug 992208 - Add Telemetry Experiments to the package so that they are actually used. Also a basic test that the service exists and can be created. r=gfritzsche

Of those, I'd be pretty surprised if bug 992208 were at fault here, and mis-configured telemetry seems more likely. Although if telemetry being enabled affects startup perf by 15%, that sucks also.

Did this show up on any other perf suites/OSes?

Blocks: 989609, 989761

Flags: needinfo?(jmaher)

Summary: 14% regression in ts paint on linux 32/64 seen on fx-team → 14% regression in ts paint on linux64 seen on fx-team

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 7

•

11 years ago

I need to see if other tests are affected. There are some other failures, but nothing as large as this.

We have different preferences on talos than we do for desktop tests, although many are the same. Here are the preferences we use: http://hg.mozilla.org/build/talos/file/tip/talos/PerfConfigurator.py#l237

For ts paint, we don't use other preferences, so this should be it.

Flags: needinfo?(jmaher)

Justin Dolske [:Dolske]

Updated

•

11 years ago

Whiteboard: [talos_regression] → [talos_regression][Australis:P-]

Jared Wein [:jaws] (please needinfo? me)

Comment 8

•

11 years ago

The changes for bug 989626 are only seen when in-content preferences are loaded, not to mention that in-content preferences are currently disabled by default.

No longer blocks: 989626

Dão Gottwald [:dao]

Updated

•

11 years ago

No longer blocks: 989609, 989761

Marco Bonardo [:mak] (away 22-26 Sept)

Comment 9

•

11 years ago

The code modified in bug 984015 is only used by PlacesTransactions.jsm, which is currently unused (disabled behind a pref), so it's not executed in this case.

No longer blocks: 984015

Benjamin Smedberg

Comment 10

•

11 years ago

jmaher said on IRC he was going to investigate/trigger some reruns. Hand it off to me if necessary later.

Assignee: nobody → jmaher

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 11

•

11 years ago

I don't believe we have any other tests/platforms which are failing. Time to do some try pushes.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 12

•

11 years ago

backout 986582: https://tbpl.mozilla.org/?tree=Try&rev=4d0eaa96acb6

backout 992208: https://tbpl.mozilla.org/?tree=Try&rev=7d7aa86bee45

Will do more tonight if these don't help.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 13

•

11 years ago

Bug 992208 seems to be the culprit. When it was backed out (https://tbpl.mozilla.org/?tree=Try&rev=7d7aa86bee45), our numbers for ts_paint were <900, in line with where we were before the regression.

I assume it has to do with: https://hg.mozilla.org/try/diff/7d7aa86bee45/browser/installer/package-manifest.in

:bsmedberg, any thoughts on this? I am sort of surprised. Maybe I am reading the numbers wrong?

Flags: needinfo?(benjamin)

Benjamin Smedberg

Comment 14

•

11 years ago

That's awesomely weird. Basically what's happening is we're running this code on profile-after-change now when we weren't before: http://hg.mozilla.org/mozilla-central/annotate/25aeb2bc79f2/browser/experiments/ExperimentsService.js#l31 to http://hg.mozilla.org/mozilla-central/annotate/25aeb2bc79f2/browser/experiments/Experiments.jsm#l355

Of the set of code running here, the only thing I can see which might be a startup issue is the call to _startWatchingAddons, which calls into AddonManager.addAddonListener/addInstallListener.

The only thing that's a little weird is that we don't seem to be checking gExperimentsEnabled anywhere in the main sequence of _loadFromCache/_main/_evaluateExperiments.
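The missing gExperimentsEnabled check can be illustrated with a standalone model. This is a sketch under assumptions, not the actual Experiments.jsm code: the `prefs` map stands in for the preference service, the pref name `experiments.enabled` is used here only for illustration, and the two stub functions only record that they were called.

```javascript
// Standalone model, not the actual Experiments.jsm code.
// `prefs` stands in for the preference service; the pref name is an
// illustrative assumption.
const prefs = new Map([["experiments.enabled", false]]);

const calls = [];  // records which startup steps actually ran

function startWatchingAddons() { calls.push("startWatchingAddons"); }
function loadFromCache() { calls.push("loadFromCache"); }

// In this model, this is what would run on profile-after-change.
function initExperiments() {
  if (!prefs.get("experiments.enabled")) {
    // Feature is off: skip listener registration and cache I/O entirely.
    return false;
  }
  startWatchingAddons();
  loadFromCache();
  return true;
}

console.log(initExperiments(), calls);
```

Checking the flag before any other work means a disabled feature costs one pref read at startup instead of listener registration plus a cache load.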

Assignee: jmaher → georg.fritzsche

Component: Talos → Client: Desktop

Flags: needinfo?(benjamin) → needinfo?(gps)

Product: Testing → Firefox Health Report

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 15

•

11 years ago

Is there any way the profile we use for talos could be affecting this? I am open to trying out anything if needed.

Gregory Szorc [:gps]

Comment 16

•

11 years ago

All of the imports (https://hg.mozilla.org/mozilla-central/annotate/25aeb2bc79f2/browser/experiments/Experiments.jsm#l14) will also contribute to latency. Things like Metrics.jsm (which should be lazy loaded since it is only used for FHR) pull in a lot of sub-modules and services. Unless we really need Experiments to be initialized ASAP, we should consider delaying its load until after session restore and possibly a few seconds after that.

FWIW, I don't think Experiments.jsm should be adding significant new memory, CPU, or I/O to Firefox initialization. It's a thousand or so lines of JS + a few KB of read I/O to pull in state. However, timing is everything. It is likely initializing things a few seconds earlier than before and that's causing the regression.

I'd be very interested in comparisons of usage 60s after startup to confirm or disprove my hypothesis: that will tell us if there is a deeper performance problem to investigate.
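As an illustration of the lazy-loading idea, here is a standalone sketch in plain JavaScript. `defineLazy` and `loadMetrics` are hypothetical names for this example, not Firefox APIs, and `loadMetrics` only stands in for importing a module that pulls in many sub-modules.

```javascript
// Standalone model of lazy loading; `defineLazy` and `loadMetrics` are
// hypothetical names, not Firefox APIs.
function defineLazy(target, name, loader) {
  Object.defineProperty(target, name, {
    configurable: true,
    enumerable: true,
    get() {
      const value = loader();  // pay the load cost on first access only
      // Replace the getter with a plain value so later reads are cheap.
      Object.defineProperty(target, name, {
        value,
        writable: false,
        configurable: true,
        enumerable: true,
      });
      return value;
    },
  });
}

let loadCount = 0;
function loadMetrics() {
  loadCount += 1;  // stands in for importing a module with many sub-modules
  return { name: "Metrics" };
}

const scope = {};
defineLazy(scope, "Metrics", loadMetrics);
console.log(loadCount);           // nothing has been loaded yet
console.log(scope.Metrics.name);  // first access triggers the load
console.log(loadCount);           // the loader ran exactly once
```

The point of the pattern is that code paths which never touch the property never pay for the import, which matters when the property is only needed for one feature.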

Flags: needinfo?(gps)

Benjamin Smedberg

Comment 17

•

11 years ago

If there is an active experiment we certainly don't want to wait: the experiment should be activated as soon as we know that it's still applicable.

Benjamin Smedberg

Comment 18

•

11 years ago

Attached patch: Attempt lazy loading, rev. 1

Attachment #8403559 - Flags: review?(gps)

Gregory Szorc [:gps]

Comment 19

•

11 years ago

(In reply to Benjamin Smedberg [:bsmedberg] from comment #17)
> If there is an active experiment we certainly don't want to wait: the
> experiment should be activated as soon as we know that it's still applicable.

That may require tighter Add-on Manager integration. This is a result of having experiment add-ons disabled by default. In our current world, we require the Experiments service to be initialized to activate experiments at start-up time. That has performance implications.

Switching back to having the Experiments service disable an experiment if needed sometime after initialization would be better for perf. We could have the Add-on Manager refuse to activate an experiment add-on unless the Experiments XPCOM component is present to plug that hole.

Can we just delay experiment init today and fix the activation window as a followup?
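One way to reconcile the two positions (activate a running experiment immediately, but keep startup cheap otherwise) is to choose the init timing from a cheap "is anything active" signal. The sketch below is a standalone model with hypothetical names and an arbitrary delay value. How `hasActiveExperiment` would be determined without loading the service is left open and is an assumption of the sketch.

```javascript
// Standalone model; all names and the delay value are hypothetical.
const DEFAULT_DELAY_MS = 30 * 1000;

// `schedule` is injected so the sketch can run without real timers.
// In real use it could be something like setTimeout.
function scheduleExperimentsInit(hasActiveExperiment, init, schedule,
                                 delayMs = DEFAULT_DELAY_MS) {
  if (hasActiveExperiment) {
    init();  // an active experiment must be applied right away
    return "immediate";
  }
  schedule(init, delayMs);  // nothing is running: keep startup cheap
  return "delayed";
}

// Usage with a fake scheduler that only records what was requested.
const pending = [];
const fakeSchedule = (fn, ms) => pending.push({ fn, ms });
let initialized = 0;
const init = () => { initialized += 1; };

console.log(scheduleExperimentsInit(true, init, fakeSchedule));
console.log(scheduleExperimentsInit(false, init, fakeSchedule));
```

Injecting the scheduler keeps the decision logic testable without waiting on timers.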

Gregory Szorc [:gps]

Comment 20

•

11 years ago

Comment on attachment 8403559: Attempt lazy loading, rev. 1

It will be interesting to see how much this moves the needle.

Attachment #8403559 - Flags: review?(gps) → review+

Benjamin Smedberg

Comment 21

•

11 years ago

https://hg.mozilla.org/integration/fx-team/rev/af98015396d4

Keywords: leave-open

Carsten Book [:Tomcat]

Comment 22

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/af98015396d4

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 23

•

11 years ago

I am not seeing any cleanup of ts paint from this: http://graphs.mozilla.org/graph.html#tests=[[83,131,33],[83,64,33],[83,132,33]]&sel=1394460105054,1397052105054&displayrange=30&datatype=running