Closed Bug 480413 (Tshutdown) Opened 15 years ago Closed 15 years ago

design test to monitor browser shut down time

Categories

(Release Engineering :: General, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: anodelman, Assigned: anodelman)

References

Details

Attachments

(3 files)

A similar metric should be collected for browser shutdown time as we do for browser startup time.
Moving to Future until someone picks this up.
Component: Release Engineering: Talos → Release Engineering: Future
The current Ts test starts the browser, cycles through the whole pageset, then exits, right? If we just calculated the time elapsed from when the test harness asks the browser to quit, to the time when the test runner notices that the process has exited, that would seem like a pretty reasonable Tshutdown. And since it will have cycled through a whole bunch of pages, it might even be a decent measure of what our users hit after a long browsing session.
Ts opens the browser with an empty profile, it does not cycle through any pages or do anything but immediately close.

You're mixing it up with Tp.
Oh, right, silly. I think "Tshutdown" would be more useful measured at the end of Tp then. Is my description in comment 2 accurate for Tp?
(In reply to comment #4)
> Oh, right, silly. I think "Tshutdown" would be more useful measured at the end
> of Tp then. Is my description in comment 2 accurate for Tp?

I agree, though it would be nice to be able to get more than one data point per run.  Still, anything is better than nothing as a first pass - don't have to report it anywhere but graph server at first, while we see if there's any stability to it.
Assignee: nobody → anodelman
Priority: -- → P2
I've got some preliminary work on a single linux slave that records shutdown time post Ts and Tp - figured that that would give us both a 'clean' shutdown and a 'dirty' shutdown metric.

I'll be installing it on all the talos staging boxes to work through the bugs.
Depends on: 485542
Ts_shutdown is running consistently on staging - you can see the results on the MozillaTest waterfall.

Tp_shutdown is working on ubuntu.  It's also generating results on winxp/vista, but the values are in the 50-60 seconds range so I want to double check that the numbers are trustworthy.  Leopard/tiger work about 70% of the time and appear to be hitting a race condition involving non-threadsafe python subprocess.Popen behavior - I hope to work that out this week.

The blocking bug regarding the graph server has to do with the formatting of the shutdown test results that hit an edge case on the graph server db insertion script.
Blocks: 478603
I've eliminated the race condition.  I'm still picking out a few strange bugs,
but I think that things are working pretty well on all platforms.  I'd like to
see a few days without any weird bugs appearing (ie, getting stuck somewhere
and then blocking indefinitely is no good).

When it comes to the long shutdown times on winxp/vista I sat and watched a few
test cycles to see if the test harness was buggy.  When the browser hit the
last page of the test I sat and counted off how long it took the browser to
shutdown, and I'm afraid that shutdown times on the order of 60 seconds are
accurate.
First step in monitoring for shutdown time is having the browser dump a timestamp just as it is going to close.
Attachment #371319 - Flags: review?(vladimir)
Comment on attachment 371319 [details] [diff] [review]
[Checked in]add step in pageloader to dump a timestamp before shutting down

You really want this to look like "__startTimestamp123456__endTimestamp"?  I guess that's fine, but maybe put some commas or something around it :)

Also, a comment here saying why we're printing this here, since it's not really directly related to the pageloader.  I can see it being useful to time shutdown time without using pageloader, though that can be done in the same way that startup is done right now (load a specific html file that just calls dump()).

Looks fine otherwise.
Attachment #371319 - Flags: review?(vladimir) → review+
Just an update, I've worked out the threading issues and gotten the shutdown times on winxp/vista to come back down into a more believable range.  Am now down to fixing some vista specific bugs and double checking edge cases (browser crash, browser in frozen state, etc).
Alias: Tshutdown
Here's what I ended up with:

- No longer opening the browser with Popen, had strange side effects on winxp/vista in terms of blocking wait() not returning in a reasonable amount of time also issues with non-threadsafe popen behavior on mac resulting in blocked wait() calls
- No longer communicating through pipes to the browser, again strange side effects on winxp/vista where non-blocking reads from pipes appeared to block and cause the browser to freeze
- Moving around some code to libraries that was in ttest.py and didn't belong there

What's left to do in other patches:
- if we like what's happening here we can remove a lot of code from ffprocess_* as we no longer use all the procedures for reading from pipes
- There's more opportunity for streamlining ttest.py and generally making things less confusing
Attachment #372972 - Flags: review?(catlee)
Attachment #371323 - Flags: review?(rdoherty) → review?(catlee)
Comment on attachment 371323 [details] [diff] [review]
[Checked in]add shutdown tests to graph server db

>+insert into tests values (NULL,"tp_shutdown","Tp3 Shutdown",1,1,NULL);

Do you want pageset_id here to patch what's used in the Tp3 test?  Or is that field used for something else?
Attachment #371323 - Flags: review?(catlee) → review+
Attachment #372972 - Flags: review?(catlee) → review+
Comment on attachment 372972 [details] [diff] [review]
[Checked in]talos support for shutdown test

Looks great!

Just a few nits:
- Use != instead of <> to test inequality.  <> is going away.

- See if we need shell=True in the subprocess calls.  Also, you should be able to pass the command to subprocess as a list of arguments instead of one big string. e.g. try this:
    process = subprocess.Popen(['python', 'bcontroller.py', '--command', command_line, '--name', process, '--timeout', str(browser_wait), '--log', log], universal_newlines=True, shell=False, bufsize=0, env=os.environ)
Comment on attachment 371323 [details] [diff] [review]
[Checked in]add shutdown tests to graph server db

changeset:   208:e719a3af02cb

Per catlee's comment, corrected pagesets associated with tp_shutdown.  Also added tp_shutdown_nochrome and tp_fast_shutdown, which were missing.
Attachment #371323 - Attachment description: add shutdown tests to graph server db → [Checked in]add shutdown tests to graph server db
Attachment #371323 - Flags: checked‑in+ checked‑in+
Pushed db changes to production db.

Now just need to schedule downtime for code changes to be checked in.
Comment on attachment 371319 [details] [diff] [review]
[Checked in]add step in pageloader to dump a timestamp before shutting down

Checking in report.js;
/cvsroot/mozilla/layout/tools/pageloader/report.js,v  <--  report.js
new revision: 1.10; previous revision: 1.9
done
Attachment #371319 - Attachment description: add step in pageloader to dump a timestamp before shutting down → [Checked in]add step in pageloader to dump a timestamp before shutting down
Attachment #371319 - Flags: checked‑in+ checked‑in+
Comment on attachment 372972 [details] [diff] [review]
[Checked in]talos support for shutdown test

RCS file: /cvsroot/mozilla/testing/performance/talos/bcontroller.py,v
done
Checking in bcontroller.py;
/cvsroot/mozilla/testing/performance/talos/bcontroller.py,v  <--  bcontroller.py
initial revision: 1.1
done
Checking in cmanager_win32.py;
/cvsroot/mozilla/testing/performance/talos/cmanager_win32.py,v  <--  cmanager_win32.py
new revision: 1.3; previous revision: 1.2
done
Checking in ffprocess.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess.py,v  <--  ffprocess.py
new revision: 1.10; previous revision: 1.9
done
Checking in ffprocess_mac.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess_mac.py,v  <--  ffprocess_mac.py
new revision: 1.13; previous revision: 1.12
done
Checking in ffsetup.py;
/cvsroot/mozilla/testing/performance/talos/ffsetup.py,v  <--  ffsetup.py
new revision: 1.10; previous revision: 1.9
done
Checking in run_tests.py;
/cvsroot/mozilla/testing/performance/talos/run_tests.py,v  <--  run_tests.py
new revision: 1.41; previous revision: 1.40
done
Checking in sample.config;
/cvsroot/mozilla/testing/performance/talos/sample.config,v  <--  sample.config
new revision: 1.27; previous revision: 1.26
done
Checking in ttest.py;
/cvsroot/mozilla/testing/performance/talos/ttest.py,v  <--  ttest.py
new revision: 1.22; previous revision: 1.21
done
Checking in utils.py;
/cvsroot/mozilla/testing/performance/talos/utils.py,v  <--  utils.py
new revision: 1.9; previous revision: 1.8
done
Checking in startup_test/startup_test.html;
/cvsroot/mozilla/testing/performance/talos/startup_test/startup_test.html,v  <--  startup_test.html
new revision: 1.6; previous revision: 1.5
done
Checking in startup_test/twinopen/winopen.js;
/cvsroot/mozilla/testing/performance/talos/startup_test/twinopen/winopen.js,v  <--  winopen.js
new revision: 1.3; previous revision: 1.2
done
Attachment #372972 - Attachment description: talos support for shutdown test → [Checked in]talos support for shutdown test
Attachment #372972 - Flags: checked‑in+ checked‑in+
Comment on attachment 372972 [details] [diff] [review]
[Checked in]talos support for shutdown test

Backed out, redness on tiger/leopard talos boxes.
Attachment #372972 - Flags: checked‑in+ checked‑in+ → checked‑in- checked‑in-
Attachment #372972 - Attachment description: [Checked in]talos support for shutdown test → [Backed out]talos support for shutdown test
Issue with the landing on Saturday afternoon tracked down to how the regexes were written that recognize browser output.  With the larger set of output generated by the production pageset (as opposed to the new, small pageset that I was testing with in staging) it was enough to cause the python regex parser to go into some sort of ultra-loop.

Fix is minor and will be checked in as a bustage fix (with approval from catlee in irc).
Changes has been green on staging all afternoon/evening.  No longer seeing freezing on any platform during testing.

I believe that this is now worth another landing attempt at the next available downtime.
Attachment #372972 - Attachment description: [Backed out]talos support for shutdown test → [Checked in]talos support for shutdown test
Attachment #372972 - Flags: checked‑in- checked‑in- → checked‑in+ checked‑in+
Comment on attachment 372972 [details] [diff] [review]
[Checked in]talos support for shutdown test

Same as before, with minor bustage fixes (regex format, throttling long output) approved by catlee via irc.

Checking in cmanager_win32.py;
/cvsroot/mozilla/testing/performance/talos/cmanager_win32.py,v  <--  cmanager_win32.py
new revision: 1.5; previous revision: 1.4
done
Checking in ffprocess.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess.py,v  <--  ffprocess.py
new revision: 1.12; previous revision: 1.11
done
Checking in ffprocess_mac.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess_mac.py,v  <--  ffprocess_mac.py
new revision: 1.15; previous revision: 1.14
done
Checking in ffsetup.py;
/cvsroot/mozilla/testing/performance/talos/ffsetup.py,v  <--  ffsetup.py
new revision: 1.12; previous revision: 1.11
done
Checking in run_tests.py;
/cvsroot/mozilla/testing/performance/talos/run_tests.py,v  <--  run_tests.py
new revision: 1.43; previous revision: 1.42
done
Checking in sample.config;
/cvsroot/mozilla/testing/performance/talos/sample.config,v  <--  sample.config
new revision: 1.29; previous revision: 1.28
done
Checking in ttest.py;
/cvsroot/mozilla/testing/performance/talos/ttest.py,v  <--  ttest.py
new revision: 1.26; previous revision: 1.25
done
Checking in utils.py;
/cvsroot/mozilla/testing/performance/talos/utils.py,v  <--  utils.py
new revision: 1.11; previous revision: 1.10
done
Checking in startup_test/startup_test.html;
/cvsroot/mozilla/testing/performance/talos/startup_test/startup_test.html,v  <--  startup_test.html
new revision: 1.8; previous revision: 1.7
done
Checking in startup_test/twinopen/winopen.js;
/cvsroot/mozilla/testing/performance/talos/startup_test/twinopen/winopen.js,v  <--  winopen.js
new revision: 1.5; previous revision: 1.4
done
Component: Release Engineering: Future → Release Engineering
Rolled out, not causing any redness and no observable regressions in
performance numbers.  Results are called "Tp3 Shutdown", "Tp3 Nochrome
Shutdown", "Tp3 Fast Shutdown" and "Ts Shutdown" on the new graph server.  The
tp shutdown tests are considered a 'dirty' shutdown time, while ts shutdown is
a 'clean shutdown' (as ts uses a new, blank profile).
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: