Closed
Bug 419776
Opened 16 years ago
Closed 15 years ago
Create new Talos suite: talos should measure x resources
Categories
(Release Engineering :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dougt, Assigned: anodelman)
References
Details
(Keywords: mobile, perf)
Attachments
(3 files, 1 obsolete file)
2.08 KB,
patch
|
anodelman
:
review+
bhearsum
:
checked-in+
|
Details | Diff | Splinter Review |
773 bytes,
patch
|
lsblakk
:
review+
|
Details | Diff | Splinter Review |
9.41 KB,
patch
|
lsblakk
:
review+
anodelman
:
checked-in+
|
Details | Diff | Splinter Review |
we should be measuring x resources during the talos run. Testing between 1.8 and 1.9, i have measured a huge improvement and hope that this stays that way. xrestop is the tool needed for measuring what is going on in X.
Comment 1•16 years ago
|
||
Is our goal here to measure running size or to detect leaks of graphics resources? If we want to detect leaks we need to: 1. When we do this run we want to be running with the session history turned off. 2. Same goes for the memory cache. At first blush it looks like keeping session history and memory cache hangs onto a large amount of X resources and aren't freed for some time. Might be interesting to do this with stuart's memory pressure stuff as well. It looks like looking at the Talos code that there might be a place to integrate this. There's code in there that already reads information out of /proc to get memory size and cpu numbers, yeah?
Reporter | ||
Comment 2•16 years ago
|
||
my goal here was to have something equivalent to tracking RSS on linux, but for X.
Reporter | ||
Comment 3•16 years ago
|
||
This returns the total bytes used by X for a given PID. I think we should tie this into the cmanager_linux.py as another counter type. Feel free to improve my greping. Alice, do you have any thoughts on adding this?
Assignee: nobody → doug.turner
Status: NEW → ASSIGNED
Reporter | ||
Comment 4•16 years ago
|
||
I was hoping to simply do: Add my function to cmanager_linux.py, then add: counterDict["XRes"] = GetXRes to the list of counters in the dictionary. However, the output file that is created contains only: 1,value after a tp run.
Reporter | ||
Comment 5•16 years ago
|
||
And of course, i added the right counter to the sample.config I am running.
Reporter | ||
Comment 6•16 years ago
|
||
Attachment #315666 -
Attachment is obsolete: true
Reporter | ||
Comment 7•16 years ago
|
||
Comment on attachment 315794 [details] [diff] [review] adds X Resource Tracking to Talos You also need to add something to your config file like: unix_counters : [ 'Private Bytes', 'RSS', 'XRes'] (Note, the addition of XRes in the above list)
Reporter | ||
Comment 8•16 years ago
|
||
http://graphs-stage.mozilla.org/dgraph.html#spst=range&spstart=0&spend=53&bpst=cursor&bpstart=0&bpend=53&name=tp_loadtime&m1tid=216004&m1bl=0&m1avg=0 looks like it is working.
Reporter | ||
Comment 9•16 years ago
|
||
We would like to get this on production tboxes soon.
Assignee | ||
Comment 10•16 years ago
|
||
What machines do you want this running on? All linux/mac? (I'm assuming here that this wouldn't just be of interest for testing mobile - correct me if I'm wrong).
Reporter | ||
Comment 11•16 years ago
|
||
only matters on linux/gtk machines, so not the mac. this is for all linux/gtk machines that test tp_RSS.
Assignee | ||
Comment 12•16 years ago
|
||
Oops - yeah, no mac. Shouldn't have replied before caffeine. So, you'd like this on all our linux machines? I'm just trying to scope out what the request is. I had a quick look and we don't have xrestop installed on our basic talos image, so it would also involve having to install it all over the place.
Assignee | ||
Comment 13•16 years ago
|
||
Oh, and there would also have to be some hacking in talos itself. At the moment, the counter that we manage (ie, monitoring RSS, Private Bytes, etc) are controlled in two different sets: one for unix style systems and one for windows. In this case we'd be creating a counter that would only run on linux and not mac and would break the current assumptions in the code.
Reporter | ||
Comment 14•16 years ago
|
||
Can't you just not add the counters to the config files on the mac.
Assignee | ||
Comment 15•16 years ago
|
||
Our system is too smart for that - we share config files across platforms. :) We probably could do something where we create custom config files for macs, but we are really trying to keep the amount of config files that we use low. We had an issue in the past with config file explosion where it was very hard to track who was using which and pushing changes to all of them became a finicky nightmare. Using mac only configs would double the already large amount of configs.
Reporter | ||
Comment 16•16 years ago
|
||
is it easier to just throw up a new machine?
Assignee | ||
Comment 17•16 years ago
|
||
No - we don't have enough machines for that. If we want this then we should put in the time to make the changes to talos to allow us to push it to currently up machines.
Reporter | ||
Comment 18•16 years ago
|
||
how can I help? we are not measuring the dark-matter of memory leaks -- what we allocate inside of X. we think it is pretty important.
Comment 19•16 years ago
|
||
Doug: - does this impact runtime or performance of the current talos run? That would help clarify if we can do this on our existing linux talos machines or if we need to start talking about buying more machines and figuring out how to set them up. - is this something we need run on every checkin in an automated manner? Or something that could be run manually at major milestones of a release?
Reporter | ||
Comment 20•16 years ago
|
||
1) i am sure there is some cost to running this on talos -- even though it is "in parallel" against the page cycler. No idea of the actual cost, but I would suspect it would be small. 2) every chickin similar to how we test vmrss.
Reporter | ||
Updated•16 years ago
|
Attachment #315794 -
Flags: review?(anodelman)
Assignee | ||
Updated•16 years ago
|
Attachment #315794 -
Flags: review?(anodelman) → review+
Reporter | ||
Comment 21•16 years ago
|
||
Comment on attachment 315794 [details] [diff] [review] adds X Resource Tracking to Talos great alice. What is next, does talos need another review, or approval, or should I just check it in?
Assignee | ||
Comment 22•16 years ago
|
||
I've r+ed the patch that gets the xres collection code into talos, but this will not cause it be run anywhere without: 1) seconday patches to have talos maintain different counters for linux/mac 2) requisite config files updates for all active talos config files 3) xrestop installed on all linux talos boxes The r+ed patch should be fine for check in as it will not be executed at all, so it is very low risk.
Assignee | ||
Comment 23•16 years ago
|
||
Checking in cmanager_linux.py; /cvsroot/mozilla/testing/performance/talos/cmanager_linux.py,v <-- cmanager_linux.py new revision: 1.2; previous revision: 1.1 done
Reporter | ||
Comment 24•16 years ago
|
||
thanks alice!
Updated•16 years ago
|
Comment 25•16 years ago
|
||
When we have the mobile builds going off hg and talos working for mobile, we can see how it affects talos turning this on and then stage it on the desktop Linux version when we know the effect of it.
Reporter | ||
Comment 26•16 years ago
|
||
In the short term, why can't we stand up a linux gtk/x11 box running this test case (against mozilla-central)? We are driving blind at the moment wrt x leaks+bloat.
Assignee | ||
Comment 27•16 years ago
|
||
There are currently no spare perf testing machines - as it stands even my staging perf machines have been commandeered for comparison perf testing work. Comment #22 is the route to having this run in an automated fashion.
Comment 28•16 years ago
|
||
(In reply to comment #26) > In the short term, why can't we stand up a linux gtk/x11 box running this test > case (against mozilla-central)? We are driving blind at the moment wrt x > leaks+bloat. > (In reply to comment #27) > There are currently no spare perf testing machines - as it stands even my > staging perf machines have been commandeered for comparison perf testing work. > > Comment #22 is the route to having this run in an automated fashion. > For us to set this up as part of automation framework is non-trivial. If we do it on the same existing talos machines, we have to schedule downtimes for them all, update the talos machines, and then do calibration runs to confirm no perf regressions caused by the change, before we reopen the trees. Alternatively, we could build out a new set of machines just for this new test, avoiding risk of change to existing talos machines, but this needs a set of machines and some setup/maintenance work. Both of these options are not on our plates for this quarter, but we can discuss it with other mobile requests for next quarter. In the meanwhile, if you have a spare machines, there's nothing to stop you setting up a private machine running this test on a branch of your choosing, and manually triggering it on builds of your choosing as you need. Hope all that makes sense. I do understand the usefulness of this new test, but I'm trying to juggle this with the other things we already did promise to do. If I'm missing something, maybe we should all have a quick conf call on this?
Reporter | ||
Comment 29•16 years ago
|
||
christian, we should be measuring x resources on mobile.
Assignee: doug.turner → nobody
Status: ASSIGNED → NEW
Updated•16 years ago
|
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Comment 30•16 years ago
|
||
Looks like there's been no activity on this for quite awhile. I'm moving it out to future. If someone is going to grab it to work on, feel free to move it back.
Component: Release Engineering → Release Engineering: Future
Comment 31•16 years ago
|
||
There has been no activity because no one on build has been working on it. There is patch that is ready to land, can we just do that?
Component: Release Engineering: Future → Release Engineering
Comment 32•16 years ago
|
||
(In reply to comment #31) > There has been no activity because no one on build has been working on it. > There is patch that is ready to land, can we just do that? Deferring to Alice on this.
Assignee | ||
Comment 33•16 years ago
|
||
Patch has been checked in and landed. What's missing there is running this in an automated fashion, as described in comment #22 - that's the future part of the work.
Updated•16 years ago
|
Attachment #315794 -
Flags: checked‑in+
Comment 34•16 years ago
|
||
I'm moving this back to future as per comment #33
Component: Release Engineering → Release Engineering: Future
Comment 35•15 years ago
|
||
Any ETA on when this will happen?
Updated•15 years ago
|
Summary: talos should measure x resources → Create new Talos suite: talos should measure x resources
Assignee | ||
Comment 36•15 years ago
|
||
Resurrected as part of q4 talos test creation push.
Component: Release Engineering: Future → Release Engineering
Priority: -- → P2
Assignee | ||
Updated•15 years ago
|
Assignee: nobody → anodelman
Assignee | ||
Comment 37•15 years ago
|
||
Attachment #411844 -
Flags: review?(lsblakk)
Assignee | ||
Comment 38•15 years ago
|
||
Some difficulty getting xres onto the talos slaves (mostly due to them now being on a pretty old version of ubuntu - gutsy). Currently using: https://launchpad.net/ubuntu/+source/xrestop/0.4-1/+build/212407/+files/xrestop_0.4-1_i386.deb and 'dpkg -i xrestop_0.4-1_i386.deb' We'll see what kind of numbers we get out of that on stage.
Updated•15 years ago
|
Attachment #411844 -
Flags: review?(lsblakk) → review+
Assignee | ||
Comment 39•15 years ago
|
||
Attachment #411855 -
Flags: review?(lsblakk)
Updated•15 years ago
|
Attachment #411855 -
Flags: review?(lsblakk) → review+
Assignee | ||
Comment 40•15 years ago
|
||
Installed xrestop on all production & try talos linux slaves (except talos-rev2-linux04 which is unreponsive. Also, graph server changes pushed to production. This is now ready for roll out.
Assignee | ||
Comment 41•15 years ago
|
||
Comment on attachment 411855 [details] [diff] [review] [checked in]talos changes to make xres run on linux only Checking in sample.config; /cvsroot/mozilla/testing/performance/talos/sample.config,v <-- sample.config new revision: 1.38; previous revision: 1.37 done Checking in ttest.py; /cvsroot/mozilla/testing/performance/talos/ttest.py,v <-- ttest.py new revision: 1.36; previous revision: 1.35 done Checking in run_tests.py; /cvsroot/mozilla/testing/performance/talos/run_tests.py,v <-- run_tests.py new revision: 1.54; previous revision: 1.53 done
Attachment #411855 -
Attachment description: talos changes to make xres run on linux only → [checked in]talos changes to make xres run on linux only
Attachment #411855 -
Flags: checked-in+
Assignee | ||
Comment 42•15 years ago
|
||
Configured talos-rev2-linux04 now that it is up and reachable. That should finish things here.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•