Closed Bug 510587 Opened 15 years ago Closed 15 years ago

Create a cold startup Ts (mac/linux)

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: adw, Assigned: anodelman)

References

Details

(Whiteboard: [ts])

Attachments

(3 files, 4 obsolete files)

Ts currently measures warm startup.  We need to measure cold startup, too.  Hand off to releng when it's ready.

If integrating real cold startup with a new Ts is a pain, simulated cold startup is probably OK, e.g., by flushing disk cache.  Notes on flushing disk cache on Windows [1] and OS X [2].  Need to figure out how to do so for other platforms we support, but IMO that shouldn't block getting this set up for Windows and OS X.

[1] http://chadaustin.me/2009/04/flushing-disk-cache/
[2] http://tuvix.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man8/purge.8.html
Whiteboard: [ts]
Does this belong in the RelEng component? Surely the work to do this will not be in the platform itself.
That was my thought too, but Alice suggested we file here to track development of the test and then hand off by filing a new bug in releng when it's ready to deploy.
This is my plan:

1. Modify talos so that it can run a head (and tail, for consistency) script before each individual run in a test.  I asked Alice about doing this, and she said she could see it.  I already have a simple patch.
2. Add a cold Ts, which would be the same as the current Ts except that it would use a head script to purge disk cache before each run as described in the link in comment 1.

Alice, what do you think about that?  I'll spin out a separate bug for part 1 if it's OK.  (Let me know the proper component.)
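
For illustration only, part 1 of the plan could look roughly like the sketch below: each run in a test is wrapped in an optional head and tail command. This is modern Python rather than the talos code of the time, and `run_with_hooks`, `run_once`, `head_cmd`, `tail_cmd`, and `call` are hypothetical names; the real hook mechanism is whatever lands in the head/tail script bug.

```python
import subprocess


def run_with_hooks(run_once, cycles, head_cmd=None, tail_cmd=None,
                   call=subprocess.check_call):
    """Run `run_once` `cycles` times, wrapping every run in head/tail commands.

    All names are hypothetical; this is not the talos API.
    """
    results = []
    for _ in range(cycles):
        if head_cmd:
            call(head_cmd)  # e.g. purge the disk cache before launching
        results.append(run_once())
        if tail_cmd:
            call(tail_cmd)  # runs after every measurement, for consistency
    return results
```

A cold Ts would then be the ordinary Ts with `head_cmd` set to a cache-purging script, and `call` can be swapped for a stub when testing.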
I think that it's a good plan.  We're still trying to figure out the correct placement of these test creation bugs, and I think that they may belong in Core:Testing with me cc'ed.
(In reply to comment #5)
> I think that it's a good plan.  We're still trying to figure out the correct
> placement of these test creation bugs, and I think that they may belong in
> Core:Testing with me cc'ed.

I don't see a Core::Testing.  Testing::General?  I notice there's a mozilla.org::Release Engineering: Future.  Also, what's the best way for me to make patches?  I created a repo from standalone talos.  Is it OK to diff my changes against that?
Depends on: 511914
I want to get moving on this, so I'm moving it to Testing::General.  Spun out bug 511914 for head and tail scripts.
Assignee: adw → nobody
Status: NEW → ASSIGNED
Product: Core → Testing
QA Contact: general → general
Assignee: nobody → adw
Simulated cold startup on Windows is not easy.  The various methods people have suggested either don't work or don't work well.  It may be easier to simply reboot between runs.  Quoting an MSFT software tester:

---

This is actually a very complicated thing to do, and to do it correctly, these are some of the things you need to worry about:

1. Invalidating the CPU caches
2. Invalidating the cache on the storage media
3. Invalidating the OS's read cache (note that this is not the same as the write cache which can be flushed with FlushFileBuffers)
4. Removing items from the OS's KnownDLL cache

---

CC'ing Rob and Robert, but if anyone has some insight on how to do these things, please speak up.

More info:
https://wiki.mozilla.org/Firefox/Projects/Startup_Time_Improvements_Notes#adw.27s_Windows_XP_experience
Any chance that it would be reasonable to re-create the profile for each run?  It could be wiped clean between runs with some changes to the underlying talos code - is that a reasonable simulation?

When I was considering cold startup, that was how I was going to design the test.  But I could have been off track.
I don't think that's enough. XP keeps track of early IO trends for applications (called Prefetch), which you'll need to clear before each run to avoid tainted results for non-profile data (DLLs, other global data). On Windows 7 (and probably Vista too), inspecting the contents (or security settings) of the folder (%SYSTEMROOT%\Prefetch) requires elevated privileges.

I would think that the CPU cache would be wiped on context switch (or enough other code being run).

Wiping prefetch and rebooting seems like the easiest way to simulate cold startup.

How feasible is it to clone a new VM and install the nightly on it each time? Differencing disks would make this faster, I'd think.
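
On the Prefetch point, a rough sketch of clearing the entries for one executable before a run might look like this. It assumes Prefetch files are named after the upper-cased executable plus a hash suffix (for example FIREFOX.EXE-XXXXXXXX.pf) and that the script has enough privilege to touch the folder; `prefetch_entries` and `clear_prefetch` are hypothetical helpers, not existing talos code.

```python
import glob
import os


def prefetch_entries(exe_name, prefetch_dir=None):
    """List Prefetch files that appear to belong to `exe_name`."""
    if prefetch_dir is None:
        # Default location named in the comment above.
        root = os.environ.get("SYSTEMROOT", r"C:\Windows")
        prefetch_dir = os.path.join(root, "Prefetch")
    # Assumed naming scheme: <EXE NAME IN UPPER CASE>-<hash>.pf
    pattern = os.path.join(prefetch_dir, exe_name.upper() + "-*.pf")
    return sorted(glob.glob(pattern))


def clear_prefetch(exe_name, prefetch_dir=None):
    """Delete the matching Prefetch files and return the paths removed."""
    removed = prefetch_entries(exe_name, prefetch_dir)
    for path in removed:
        os.remove(path)
    return removed
```

This only addresses Prefetch; it does nothing about the OS read cache, so on its own it would not make a run cold.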
Depends on: 405578
Attached file WIP (obsolete) —
Python head script to be used in conjunction with bug 511914.  OS X is the only platform switched on right now.
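
The attachment itself is not reproduced here. As a hedged sketch of the idea (a per-platform purge command with only OS X enabled), something along these lines would do; `PURGE_COMMANDS`, `purge_command`, and `flush_disk_cache` are hypothetical names, and the sketch assumes the `purge` tool from comment 1 is on the PATH.

```python
import subprocess
import sys

# Only OS X is switched on; every other platform is a no-op for now.
PURGE_COMMANDS = {
    "darwin": ["purge"],
}


def purge_command(platform=sys.platform):
    """Return the cache-purging command for `platform`, or None if unsupported."""
    for prefix, cmd in PURGE_COMMANDS.items():
        if platform.startswith(prefix):
            return list(cmd)
    return None


def flush_disk_cache(platform=sys.platform, call=subprocess.check_call):
    """Run the purge command; return False when the platform is not enabled."""
    cmd = purge_command(platform)
    if cmd is None:
        return False
    call(cmd)
    return True


if __name__ == "__main__":
    flush_disk_cache()
```

Enabling another platform would then be a matter of adding an entry to the table once a reliable command is known for it.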
On Linux we can enable password-less sudo for individual commands, so we could give the mozqa user permission to run a script that runs echo 3 > /proc/sys/vm/drop_caches
(Should also be able to use a suid root shell script on linux)
Attached patch add cold ts test to talos (obsolete) — Splinter Review
Add ts_cold to sample.config (and finally clean up those ^M line endings).  Create a scripts directory in talos and add cold.py there.
Attachment #399360 - Flags: review?(catlee)
Adds support to our buildbot setup for controlling which platforms we run tests on, which is useful because we are having some trouble getting ts_cold working on Windows.
Attachment #399371 - Flags: review?(catlee)
Attached patch adw patch (obsolete) — Splinter Review
Updates run_tests.py for ts_cold.
Updates sample.config for ts_cold.
Creates scripts/ts_cold/head.py and scripts/ts_cold/linux/purge.c.
I couldn't get CVS to add the Linux purge binary to the patch.
Attachment #398798 - Attachment is obsolete: true
Blocks: 515540
I'm not 100% sure about what I'm talking about here, but how does this sound for Windows?

1. Create N copies of the objdir and profile dir in distinct locations.
2. Store a counter I, initialized to 0, somewhere.
3. Read a file large enough to exhaust the OS read cache.
4. Write a file large enough to exhaust the OS write cache (or use FlushFileBuffers).
5. While I++ < N do:
  5.a. run a Ts test from objdir I with profile dir I
  5.b. report the result
6. Delete all 2N dirs.
7. Reboot

N  being the number of times we want to run the cold Ts tests.
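
A rough sketch of steps 1, 5, and 6 is below, with the cache-exhausting steps left as a hook. `cold_ts_batch`, `run_ts`, and `exhaust_caches` are hypothetical names, and the sketch does not claim that this approach actually yields a cold start on Windows.

```python
import os
import shutil
import tempfile


def cold_ts_batch(objdir, profile, n, run_ts, exhaust_caches=lambda: None):
    """Run `run_ts(objdir_copy, profile_copy)` once per fresh copy.

    Returns the list of results; every name here is hypothetical.
    """
    root = tempfile.mkdtemp(prefix="cold_ts_")
    try:
        # Step 1: N distinct copies, so no run reuses files an earlier run read.
        pairs = []
        for i in range(n):
            obj_copy = os.path.join(root, "objdir%d" % i)
            prof_copy = os.path.join(root, "profile%d" % i)
            shutil.copytree(objdir, obj_copy)
            shutil.copytree(profile, prof_copy)
            pairs.append((obj_copy, prof_copy))
        # Steps 3-4: push the freshly copied files out of the OS caches.
        exhaust_caches()
        # Step 5: one measurement per copy.
        return [run_ts(o, p) for o, p in pairs]
    finally:
        # Step 6: delete all 2N directories. The reboot (step 7) is left
        # to whatever drives the test machine.
        shutil.rmtree(root, ignore_errors=True)
```

The copies are made before the caches are exhausted because copying itself reads every file and would otherwise leave them warm.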
Attachment #399360 - Flags: review?(catlee) → review+
Comment on attachment 399371 [details] [diff] [review]
run ts_cold on mac only on talos-stage

When do you want to land this?  This conflicts horribly with some of the changes I've made to configs for talos testing of release builds.
(In reply to comment #18)
> (From update of attachment 399371 [details] [diff] [review])
> When do you want to land this?

I think you were asking Alice, but on the Firefox side we'd like to get it up and running sooner rather than later.  What's the nature of the conflicts?
This is blocked behind q3 mobile talos goals, it will be picked up in q4.
Refreshing various patches and re-testing on talos-stage.
Attachment #399360 - Attachment is obsolete: true
Attachment #399371 - Attachment is obsolete: true
Attachment #399371 - Flags: review?(catlee)
Attachment #399372 - Attachment is obsolete: true
Assignee: adw → anodelman
Attachment #405371 - Flags: review?(bhearsum)
After playing around with this some more, I think that the easiest setup is going to be adding:

Cmnd_Alias DROPCACHE = /usr/bin/tee /proc/sys/vm/drop_caches
mozqa    ALL=NOPASSWD: DROPCACHE

to sudoers.  It doesn't open a security hole, and it simplifies the talos setup.
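
With that sudoers entry in place, a head script could drop the Linux caches roughly as sketched here. `drop_caches` and its `run` parameter are hypothetical, and the initial `sync` is an addition not discussed in this bug; it is there because dirty pages are not freed by drop_caches.

```python
import subprocess

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(run=subprocess.run):
    """Flush dirty pages, then ask the kernel to drop its caches."""
    # Write dirty pages back first so they become droppable.
    run(["sync"], check=True)
    # Writing "3" drops the page cache plus dentries and inodes.
    # "sudo -n" fails instead of prompting if the NOPASSWD rule is missing.
    run(
        ["sudo", "-n", "/usr/bin/tee", DROP_CACHES],
        input=b"3\n",
        stdout=subprocess.DEVNULL,
        check=True,
    )


if __name__ == "__main__":
    drop_caches()
```

The command line passed to sudo has to match the DROPCACHE alias exactly (same tee path, same target file), otherwise sudo will refuse it.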
Attachment #405371 - Flags: review?(bhearsum) → review+
Up and running on stage.
Attachment #406298 - Flags: review?(bhearsum)
Attachment #406298 - Flags: review?(bhearsum) → review+
Comment on attachment 406298 [details] [diff] [review]
[checked in]add cold startup test to talos code

Looks reasonable to me.
Updated try talos linux slave's sudoers.
Updated all Linux talos production slaves that are currently up; talos-rev2-linux05/14 were skipped because they are unreachable.
Attachment #406532 - Flags: review?(catlee) → review+
Comment on attachment 405371 [details] [diff] [review]
[Checked in]add cold startup tests to graph server db

changeset:   245:108275d0f693

Pushed to production by justdave.
Attachment #405371 - Attachment description: add cold startup tests to graph server db → [Checked in]add cold startup tests to graph server db
Comment on attachment 406298 [details] [diff] [review]
[checked in]add cold startup test to talos code

RCS file: /cvsroot/mozilla/testing/performance/talos/scripts/ts_cold/head.py,v
done
Checking in scripts/ts_cold/head.py;
/cvsroot/mozilla/testing/performance/talos/scripts/ts_cold/head.py,v  <--  head.py
initial revision: 1.1
done
Checking in sample.config;
/cvsroot/mozilla/testing/performance/talos/sample.config,v  <--  sample.config
new revision: 1.35; previous revision: 1.34
done
Attachment #406298 - Attachment description: add col startup test to talos code → [checked in]add cold startup test to talos code
Comment on attachment 406532 [details] [diff] [review]
[checked in]add cold tests to buildbot-configs (staging, try, production) talos

changeset:   1627:846b7d3c71f4
Attachment #406532 - Attachment description: add cold tests to buildbot-configs (staging, try, production) talos → [checked in]add cold tests to buildbot-configs (staging, try, production) talos
Working on linux/mac on all branches.
Summary: Create a cold startup Ts → Create a windows cold startup Ts
Going to mark this fixed and create a win cold ts test bug.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Summary: Create a windows cold startup Ts → Create a cold startup Ts (mac/linux)
(In reply to comment #33)
> Going to mark this fixed and create a win cold ts test bug.

...and that is bug#522807.
Blocks: 501563