Closed Bug 347966 Opened 18 years ago Closed 17 years ago

Need jprof profile data to be exported by (some) Tinderbox performance tests

Categories

(Release Engineering :: General, defect, P2)

x86
Linux

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: roc, Assigned: rhelmer)

References

Details

Attachments

(2 files, 4 obsolete files)

It can sometimes be hard to reproduce performance issues reported by Tinderbox test machines. It would help a lot if some performance test machines ran with profiling on and published their profiles.

jprof:
http://www.mozilla.org/performance/jprof.html
http://lxr.mozilla.org/mozilla/source/tools/jprof/README.html
It works only on Linux x86, as far as I know.
My $.02 on making the log  files avail - if it isn't useful/feasible to append the profile data to the build logs, can we just dump the files locally and run a basic apache server to serve up the results?  I.e. just dump the files named buildid.html and serve them up.  

This came up in the 1.9 discussion the other day as something that's blocking bug  130078 which is blocking bug 337801 which is blocking being able to usefully drag tabs between windows...  Is there something that can be done to help this along?
rhelmer, is there going to be any movement on this? It's blocking work.
QA Contact: timeless → tinderbox
We don't have enough test hardware up yet to handle this, set up a request for hardware in bug 366615. Please comment there if we have any special needs as far as hardware goes.

I've built and run jprof on trunk on my dev machine, getting this set up on tinderbox should be straightforward.
Severity: normal → major
Depends on: 366615
Machine is racked, setting up based on bsmedberg's reference config:
http://wiki.mozilla.org/ReferencePlatforms/Linux

Let me know if that looks ok. I've got gcc4.1.1 on there now, working on getting all the deps for Mozilla right.
First test run, looks like everything worked ok. Here is an example report:
http://people.mozilla.org/~rhelmer/logs/jprof/testrun.html

I'll work on the tinderbox hookup next week.
I've hacked tinderbox such that it can run jprof during tests and upload the result at the end. Here is an example:

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/jprof/2007-01-19-18-trunk/

Let me know if that looks ok, and I'll get the tinderbox patches reviewed and checked in.
What period are you using? If you can crank the Hz up to the max, that would probably be good for giving us more accurate results to track subtle changes. (Thanks!!!)
On Linux, if we have root on the machine and use the RTC we should be able to do 8192 Hz.  For the typical 500ms Txul, that should give us about 4000 hits.  The profile linked to has 200.
Right now:

JPROF_FLAGS="JP_START JP_FIRST=3 JP_PERIOD=0.015"

So should this be instead? :

JPROF_FLAGS="JP_START JP_FIRST=3 JP_RTC_HZ=8192"

bz, looks like /dev/rtc is already world-readable, do I just need to "echo 8192 > /proc/sys/dev/rtc/max-user-freq" as root and restart with the above JPROF_FLAGS?
Hmmm....  Yeah, the echo and the JP_RTC_HZ=8192 are right.

The only issue is that JP_RTC_HZ doesn't respect JP_FIRST at the moment.  I'll fix the documentation accordingly.

We should try to fix that (filed bug 367675), but you can also start jprof from JS if you want.  So if you have some JS that runs the test (which is the case for Tp and Tdhtml and Txul at least), then you can put:

  try {
    JProfStartProfiling();
  } catch (e) {}

at the beginning of the test code to start the profiler then, and use JP_DEFER instead of JP_START.  It seems more likely to be accurate than the 3-second delay, and I'd prefer it even if JP_FIRST were working.
BTW I just tried running jprof with JP_RTC_HZ=8192 under a VM (Parallels, Intel Core 2 Duo Macbook Pro) and in my profile I saw PR_IntervalNow taking 20% of the time. That seems to be an artifact of running jprof under the VM. PR_Now seems to do something similar. It's a problem because in some places we call PR_Now/PR_IntervalNow a lot and that distorts the profiles.
(This may or may not be a problem for the Tinderbox jprof setup, but keep an eye out for it.)
roc points out that we can use JP_START without JP_FIRST (wasn't clear to me from the docs), changed to:

JPROF_FLAGS="JP_START JP_RTC_HZ=8192"

Also modified /etc/sysctl.conf so max freq is 8192:
dev.rtc.max-user-freq = 8192

So now:
[rhelmer@bl-bldlnx02 ~]$ cat /proc/sys/dev/rtc/max-user-freq
8192

Tinderbox has been restarted, so every build from now on will have these settings.
Component: Tinderbox → Build & Release
Product: Webtools → mozilla.org
Version: Trunk → other
For what it's worth, startup is _very_ expensive; I bet it will skew the Txul results at least significantly.  Same with shutdown.  I still think that if we can use JP_DEFER and start/stop the profiler around the test that would be ideal.

roc, I've seen some profiles where PR_Now actually _did_ take that much (at least according to jprof, on a non-VM system)... Sometimes we just call it a _lot_.  :(
(In reply to comment #16)
> For what it's worth, startup is _very_ expensive; I bet it will skew the Txul
> results at least significantly.  Same with shutdown.  I still think that if we
> can use JP_DEFER and start/stop the profiler around the test that would be
> ideal.

I'll work on it; I'd prefer if we could wrap the test in some way rather than having to modify the tests, maybe have a jprof_start.html that starts profiling and redirects to the test.

That doesn't help the shutdown case though.
How does the browser know to shut down at the end of the test?
(In reply to comment #18)
> How does the browser know to shut down at the end of the test?
> 

The dom.allow_scripts_to_close_windows pref is set to True, so window.close() can close the window. This doesn't work on Mac though, since closing the last window does not shut down the app.

There are other ways we can shut down from Javascript which will actually shut down the browser (although it's Moz-specific of course), which we've been planning on doing. I guess we could have the tests include a startup and shutdown script, and try calling the jprof start/stop methods from those.
Hmm...  Is it possible to pass two separate URIs to Firefox on startup to be loaded in two tabs?  One for the test, the other (in a background tab) with just an onunload handler that stops the profiler?

In any case, I'd worry less about shutdown than startup.  Shutdown is not as expensive and easier to filter out of profiles in my experience.
(In reply to comment #20)
> Hmm...  Is it possible to pass two separate URIs to Firefox on startup to be
> loaded in two tabs?  One for the test, the other (in a background tab) with
> just an onunload handler that stops the profiler?

Hm, doing it with tabs would break the way current shutdown works; also, I have been working on a way to test with or without the browser.xul loaded, so it'd be nice to not rely on being able to open tabs.

This seems to work (assuming changing dump to jprof start/stop methods works):

firefox -url "javascript:dump('start'); window.location = 'http://example.com'; dump('stop');" -console

Would this avoid profiling startup/shutdown time (assuming we put the shutdown method in the URL above instead of it being inside the test)?
> firefox -url "javascript:dump('start'); window.location = 'http://example.com';
> dump('stop');" -console

URL loads are async, so the dump('stop') happens before the load finishes.

I just wouldn't worry too much about stopping for now.  ;)
Moving to Core/Testing. alice/robcee, I have a one-off machine at the office doing this, this would be a very valuable feature to have for the new test farm 
(jprof profiling, currently linux-only).

This needs a special build, and a performance test under special conditions. I am not sure if it needs to run on physical hardware or if a VM is ok.. the point of the perf test is to exercise the right bits of the browser so we can get profile data out of it AFAICT, so I am assuming that e.g. the Tp/Tp2/etc. numbers aren't actually interesting (someone on this bug should be able to correct me if I am mistaken).

This sounds like it'd be more appropriate in the unit/ref/etc. test farm than in the perf test farm.
Component: Build & Release → Testing
Product: mozilla.org → Core
Version: other → Trunk
I'm not sure how stable the VMs are, i.e., do they introduce jitter into the profile numbers? If that's the case, a stand-alone box might be the way to go.

It would definitely be a useful source of metrics though. Maybe IT or QA have an older machine lying around we could repurpose for this?
The Tp/Tp2 numbers are interesting because we will want to see if a performance regression seen on some tinderboxes is also happening on the jprof tinderbox.

A VM may not be OK: see comment #13.
I notice that this box is crashing during tests a lot lately; do I need to do anything here or is this just a victim of general crashiness?

I'd like to get this off my plate, robcee should I assign this to you to get up on the new test farm or do you want a new bug for that?
Status: NEW → ASSIGNED
robcee, feel free to retriage to nobody if you don't want it yet.. I think jprof would be an excellent thing to support on the test farm, let me know if you need help getting it going.
Assignee: rhelmer → rcampbell
Status: ASSIGNED → NEW
Assignee: rcampbell → rhelmer
Hard drive on the prototype I set up seems to have died; need to get it up and running again, and make it stable enough to report to a tree that people actually care about (inot MozillaTest).

Assignee: rhelmer → build
Component: Testing → Build & Release
Product: Core → mozilla.org
QA Contact: tinderbox → preed
Version: Trunk → other
Assignee: build → nobody
Component: Build & Release → Testing
Product: mozilla.org → Core
QA Contact: preed → testing
Version: other → Trunk
We've got a new machine up, unfortunately we couldn't recover the other one so it'll take a little more setup work than I'd hoped.

Should have it reporting this week.
Assignee: nobody → rhelmer
This is a quick hack; attaching so I don't lose it. Not sure if we should check this into Tinderbox or not, but if so I think it just needs to honor a TestWithJprof setting (or something like that).

Used the idea most recently discussed in comment #22, to start profiling after browser startup (but don't bother stopping before shutdown).

The box is reporting to MozillaTest tree, but it's currently doing a --testonly cycle on a build that I know works. Sample reports here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/bl-bldlnx02-trunk/

Nightlies will go here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/jprof/

I'll set it to start building in the morning.
Status: NEW → ASSIGNED
Attached patch cleanup previous patch a bit (obsolete) — Splinter Review
Attachment #268323 - Attachment is obsolete: true
Ok, test box is up and working, you can see a nightly run of jprof reports here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/experimental/jprof/2007-06-15-11-trunk/

Hourlies here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/bl-bldlnx02-trunk/

This is "Linux bl-bldlnx02 Dep Nightly" on the MozillaTest tree. It's just a server here in the office, not sure if we should publish to Firefox tree, because it's not going to get e.g. weekend support. Any thoughts?

Robcee - I am kicking this bug to the default Core/Testing assignee. Please let me know if/when you need help getting this on the unit test and/or perf boxes; bl-bldlnx02 is only a stopgap measure until that's ready!
Assignee: rhelmer → nobody
Status: ASSIGNED → NEW
Actually after discussion taking this back to do a couple more things:

* get configs in CVS
* put "jprof" in the name
* work out Tier2 support for IT, so this can publish to Firefox tree
Assignee: nobody → rhelmer
Attached patch jprof support for tinderbox (obsolete) — Splinter Review
Attachment #268534 - Attachment is obsolete: true
Attachment #268574 - Flags: review?(ccooper)
Status: NEW → ASSIGNED
Priority: P1 → P3
Whiteboard: waiting for review
Comment on attachment 268574 [details] [diff] [review]
jprof support for tinderbox

The jprof changes are working on bl-bldlnx02 and reporting to MozillaTest. The current bustage is unrelated to jprof.
Attachment #268574 - Flags: review?(ccooper) → review+
Whiteboard: waiting for review
Whiteboard: patch needs refresh and checkin
Priority: P3 → P2
Comment on attachment 279674 [details] [diff] [review]
[backed out]refreshed patch, no conflicts

Landed:
Checking in build-seamonkey-util.pl;
/cvsroot/mozilla/tools/tinderbox/build-seamonkey-util.pl,v  <--  build-seamonkey-util.pl
new revision: 1.371; previous revision: 1.370
done
Checking in post-mozilla-rel.pl;
/cvsroot/mozilla/tools/tinderbox/post-mozilla-rel.pl,v  <--  post-mozilla-rel.pl
new revision: 1.130; previous revision: 1.129
done
Checking in tinder-defaults.pl;
/cvsroot/mozilla/tools/tinderbox/tinder-defaults.pl,v  <--  tinder-defaults.pl
new revision: 1.118; previous revision: 1.117
done
Attachment #279674 - Attachment description: refreshed patch, no conflicts → refreshed patch, no conflicts[checked in]
This is currently running and looks correct on the MozillaTest tree; I don't think it's stable enough to go to Firefox tree. Please open a new bug or contact me to discuss if you'd like.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Rob, this is turning Tp orange for Camino:

http://tinderbox.mozilla.org/Camino/

boxset Tp turned orange on its build beginning 14:40, the first after your checkin at 14:27.  Tp for the three maya builds will probably go orange too beginning with the next Cm1.0-M1.8.0 build, when multi-tinderbox updates the tinderbox scripts for a new set of builds.
(In reply to comment #39)
> Rob, this is turning Tp orange for Camino:
> 
> http://tinderbox.mozilla.org/Camino/
> 
> boxset Tp turned orange on its build beginning 14:40, the first after your
> checkin at 14:27.  Tp for the three maya builds will probably go orange too
> beginning with the next Cm1.0-M1.8.0 build, when multi-tinderbox updates the
> tinderbox scripts for a new set of builds.

Ok, I'll back out while we figure this out.
Backed out -

Checking in build-seamonkey-util.pl;
/cvsroot/mozilla/tools/tinderbox/build-seamonkey-util.pl,v  <--  build-seamonkey-util.pl
new revision: 1.372; previous revision: 1.371
done
Checking in post-mozilla-rel.pl;
/cvsroot/mozilla/tools/tinderbox/post-mozilla-rel.pl,v  <--  post-mozilla-rel.pl
new revision: 1.131; previous revision: 1.130
done
Checking in tinder-defaults.pl;
/cvsroot/mozilla/tools/tinderbox/tinder-defaults.pl,v  <--  tinder-defaults.pl
new revision: 1.119; previous revision: 1.118
done
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: patch needs refresh and checkin → need to figure out camino bustage
Interesting that this actually only turned the Camino trunk builds orange - the Cm1.0-M1.8.0 and Cm1.5-M1.8 builds stayed green even though they got the new JProfStartProfiling stuff.
(In reply to comment #42)
> Interesting that this actually only turned the Camino trunk builds orange - the
> Cm1.0-M1.8.0 and Cm1.5-M1.8 builds stayed green even though they got the new
> JProfStartProfiling stuff.

If you have time to run your tinderbox in test-only mode and see what's going on, or just try starting up Camino trunk the way this patch does, I'd appreciate it :)
Sounds like we're probably not packaging something that this test depends on.  I'm on vacation now, though, so I'm not sure I'll get a chance to look too closely any time soon.
Attachment #279674 - Attachment description: refreshed patch, no conflicts[checked in] → [backed out]refreshed patch, no conflicts
Only do the "try{ JprofStartProfiling } catch {}" stuff if TestWithJprof is turned on in Tinderbox.
Attachment #268574 - Attachment is obsolete: true
Attachment #279674 - Attachment is obsolete: true
Attachment #282846 - Flags: review?
Whiteboard: need to figure out camino bustage → waiting on review
Attachment #282846 - Flags: review? → review?(ccooper)
Comment on attachment 282846 [details] [diff] [review]
only turn on jprof in JS if jprof is on in tinderbox

Looks good. Can you make sure to add the appropriate  defaults to tinder-defaults.pl and tinder-config.pl for $Settings::TestWithJprof?
Attachment #282846 - Flags: review?(ccooper) → review+
Here's the complete patch. Landed:

Checking in build-seamonkey-util.pl;
/cvsroot/mozilla/tools/tinderbox/build-seamonkey-util.pl,v  <--  build-seamonkey-util.pl
new revision: 1.373; previous revision: 1.372
done
Checking in post-mozilla-rel.pl;
/cvsroot/mozilla/tools/tinderbox/post-mozilla-rel.pl,v  <--  post-mozilla-rel.pl
new revision: 1.136; previous revision: 1.135
done
Checking in tinder-defaults.pl;
/cvsroot/mozilla/tools/tinderbox/tinder-defaults.pl,v  <--  tinder-defaults.pl
new revision: 1.120; previous revision: 1.119
done
Status: REOPENED → RESOLVED
Closed: 17 years ago17 years ago
Resolution: --- → FIXED
Whiteboard: waiting on review
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: