Closed Bug 472706 Opened 16 years ago Closed 16 years ago

add better profiling input for spidermonkey in PGO builds

Categories

(Firefox Build System :: General, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: ted, Assigned: ted)

References

Details

(Keywords: perf, Whiteboard: [needs 1.9.1 landing, pgo part: needs unbitrotted patch, needs approval] [fixed1.9.1b4])

Attachments

(1 file)

Currently in the profiling phase of PGO builds we just start the browser, load an almost empty HTML page, and then shut down after a few seconds. This exercises a good portion of the platform, but it probably doesn't touch the JS tracing code at all. (Since most of the JS executed is chrome). We should throw some real-world JS in content at SpiderMonkey during the profiling run.
(In reply to comment #0)
> Currently in the profiling phase of PGO builds we just start the browser, load
> an almost empty HTML page, and then shut down after a few seconds. This
> exercises a good portion of the platform, but it probably doesn't touch the JS
> tracing code at all. (Since most of the JS executed is chrome). 

It does trace JS components, and traces the loops in the module boilerplate file. But I agree we can do better.
How much would enabling JIT chrome help for the current case? I'm assuming that would exercise many of the same code paths as content does.
Flags: blocking1.9.1?
Keywords: perf
At Andreas' behest in bug 467271, I ran SunSpider during the profiling stage. Results look good for Standard-PGO vs. SS-PGO. Links to the full results are below. Interesting that some tests see big wins, some stay the same, and regexp-dna takes a big hit.
1.05x as fast     1284.0ms +/- 2.9%   1222.8ms +/- 1.8%

Interestingly, the functions compiled for speed only increased from 34% to 46%. If I'm feeling particularly nutty, I might try Dromaeo at some point too.
1422 of 3114 ( 45.66) profiled functions will be compiled for speed
3114 of 3114 functions (100.0) were optimized using profile data
13721561286 of 13721561286 instructions (100.0) were optimized using profile data

Standard-PGO:
http://www2.webkit.org/perf/sunspider-0.9/sunspider-results.html?%7B%223d-cube%22:%5B59,59,60,59,59%5D,%223d-morph%22:%5B32,33,32,31,31%5D,%223d-raytrace%22:%5B96,96,95,99,91%5D,%22access-binary-trees%22:%5B45,43,46,44,44%5D,%22access-fannkuch%22:%5B67,44,70,67,65%5D,%22access-nbody%22:%5B28,32,28,28,27%5D,%22access-nsieve%22:%5B12,14,14,12,14%5D,%22bitops-3bit-bits-in-byte%22:%5B3,2,2,2,2%5D,%22bitops-bits-in-byte%22:%5B14,15,15,14,14%5D,%22bitops-bitwise-and%22:%5B3,3,3,3,4%5D,%22bitops-nsieve-bits%22:%5B30,29,30,29,29%5D,%22controlflow-recursive%22:%5B41,40,41,40,41%5D,%22crypto-aes%22:%5B40,41,40,41,41%5D,%22crypto-md5%22:%5B19,20,20,19,20%5D,%22crypto-sha1%22:%5B8,7,8,7,8%5D,%22date-format-tofte%22:%5B127,127,136,148,131%5D,%22date-format-xparb%22:%5B88,86,87,87,88%5D,%22math-cordic%22:%5B30,30,32,30,30%5D,%22math-partial-sums%22:%5B19,19,18,18,19%5D,%22math-spectral-norm%22:%5B9,9,8,9,8%5D,%22regexp-dna%22:%5B82,75,90,91,73%5D,%22string-base64%22:%5B18,18,18,18,18%5D,%22string-fasta%22:%5B73,74,77,73,72%5D,%22string-tagcloud%22:%5B93,84,102,102,104%5D,%22string-unpack-code%22:%5B191,181,188,173,191%5D,%22string-validate-input%22:%5B59,54,55,57,59%5D%7D

SS-PGO:
http://www2.webkit.org/perf/sunspider-0.9/sunspider-results.html?%7B%223d-cube%22:%5B54,50,52,53,54%5D,%223d-morph%22:%5B28,29,29,28,29%5D,%223d-raytrace%22:%5B87,82,88,85,86%5D,%22access-binary-trees%22:%5B45,45,48,46,46%5D,%22access-fannkuch%22:%5B69,72,68,67,72%5D,%22access-nbody%22:%5B30,31,31,30,30%5D,%22access-nsieve%22:%5B12,13,15,14,12%5D,%22bitops-3bit-bits-in-byte%22:%5B2,2,2,2,2%5D,%22bitops-bits-in-byte%22:%5B14,14,14,14,14%5D,%22bitops-bitwise-and%22:%5B3,3,3,3,3%5D,%22bitops-nsieve-bits%22:%5B26,25,26,25,26%5D,%22controlflow-recursive%22:%5B40,40,40,41,40%5D,%22crypto-aes%22:%5B36,36,36,38,37%5D,%22crypto-md5%22:%5B14,14,14,14,14%5D,%22crypto-sha1%22:%5B6,6,6,6,6%5D,%22date-format-tofte%22:%5B120,106,125,116,125%5D,%22date-format-xparb%22:%5B84,83,87,82,85%5D,%22math-cordic%22:%5B27,27,31,27,27%5D,%22math-partial-sums%22:%5B18,19,19,18,19%5D,%22math-spectral-norm%22:%5B9,6,9,9,9%5D,%22regexp-dna%22:%5B84,98,103,95,92%5D,%22string-base64%22:%5B16,15,15,15,16%5D,%22string-fasta%22:%5B74,78,74,70,73%5D,%22string-tagcloud%22:%5B95,104,104,105,98%5D,%22string-unpack-code%22:%5B155,160,150,172,160%5D,%22string-validate-input%22:%5B49,56,55,54,55%5D%7D
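
The results links above carry the raw per-run timings as URL-encoded JSON after the `?`. Below is a minimal sketch, assuming only Python's standard library, of decoding such a link and recomputing the per-test means and their sum. The `query` string is a two-test excerpt of the Standard-PGO link, and the helper names `decode_results` and `mean` are made up for illustration.

```python
import json
from urllib.parse import unquote


def decode_results(query):
    """Decode the URL-encoded JSON that follows '?' in a results link."""
    return json.loads(unquote(query))


def mean(values):
    """Arithmetic mean of a list of run timings."""
    return sum(values) / len(values)


# Two-test excerpt of the Standard-PGO link (the full link has 26 tests).
query = ("%7B%223d-cube%22:%5B59,59,60,59,59%5D,"
         "%223d-morph%22:%5B32,33,32,31,31%5D%7D")

results = decode_results(query)
means = {test: mean(runs) for test, runs in results.items()}
total = sum(means.values())  # sum of the per-test means for this excerpt

for test, value in means.items():
    print(f"{test}: {value:.1f}ms")
print(f"total: {total:.1f}ms")
```

Summing the per-test means over all 26 tests of the full Standard-PGO link gives the 1284.0ms total quoted in the comparison line.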
sayrer emailed me this patch that he assembled a while ago, and I've updated it to apply to mozilla-central. It adds some blueprintcss test pages and sunspider test pages to the profiling run. I have yet to try a PGO build with this input, but Ryan's results suggest that it ought to help.
Assignee: nobody → ted.mielczarek
Any idea why I'm getting the following when I try to hg import this?

C:\mozbuild\mozilla-central>hg import --no-commit 472706v1.patch
applying 472706v1.patch
** unknown exception encountered, details follow
** report bug details to http://www.selenic.com/mercurial/bts
** or mercurial@selenic.com
** Mercurial Distributed SCM (version 1.0.1+20080525)
Traceback (most recent call last):
  File "hg", line 20, in <module>
  File "mercurial\dispatch.pyc", line 20, in run
  File "mercurial\dispatch.pyc", line 29, in dispatch
  File "mercurial\dispatch.pyc", line 45, in _runcatch
  File "mercurial\dispatch.pyc", line 364, in _dispatch
  File "mercurial\dispatch.pyc", line 417, in _runcommand
  File "mercurial\dispatch.pyc", line 373, in checkargs
  File "mercurial\dispatch.pyc", line 356, in <lambda>
  File "mercurial\commands.pyc", line 1513, in import_
  File "mercurial\patch.pyc", line 81, in extract
  File "mercurial\demandimport.pyc", line 70, in __call__
TypeError: <unloaded module 'walk'> object is not callable
I think import is busted in mozillabuild's hg. Try qimport from mq.
mq worked nicely. Using Ted's patch, libxul is up to 4.01% profiled for speed, which is a nice gain there. As far as SS-PGO vs. Ted-PGO, the results are what you might expect given that the JS components are basically the same. Given the amount of spread I've seen between various SS runs, I'm inclined to call this a tie. The good news is that it means that Ted's patch will work nicely for delivering the ~5% speedup attained by profiling SunSpider.

** TOTAL **:           ??                1199.6ms +/- 1.5%   1213.8ms +/- 1.7%     not conclusive: might be *1.01x as slow*

SS-PGO:
http://www2.webkit.org/perf/sunspider-0.9/sunspider-results.html?%7B%223d-cube%22:%5B52,52,52,58,55%5D,%223d-morph%22:%5B27,28,27,28,29%5D,%223d-raytrace%22:%5B82,86,88,87,86%5D,%22access-binary-trees%22:%5B46,44,45,45,45%5D,%22access-fannkuch%22:%5B69,66,69,68,68%5D,%22access-nbody%22:%5B31,30,31,30,30%5D,%22access-nsieve%22:%5B12,14,13,13,14%5D,%22bitops-3bit-bits-in-byte%22:%5B2,2,2,2,2%5D,%22bitops-bits-in-byte%22:%5B14,14,15,15,14%5D,%22bitops-bitwise-and%22:%5B3,3,3,3,3%5D,%22bitops-nsieve-bits%22:%5B25,25,25,25,25%5D,%22controlflow-recursive%22:%5B40,39,39,40,40%5D,%22crypto-aes%22:%5B36,38,37,37,34%5D,%22crypto-md5%22:%5B13,14,13,14,14%5D,%22crypto-sha1%22:%5B6,7,6,6,6%5D,%22date-format-tofte%22:%5B121,116,116,90,121%5D,%22date-format-xparb%22:%5B84,71,84,81,82%5D,%22math-cordic%22:%5B28,27,28,27,27%5D,%22math-partial-sums%22:%5B27,18,19,18,12%5D,%22math-spectral-norm%22:%5B9,9,9,8,6%5D,%22regexp-dna%22:%5B96,88,94,92,97%5D,%22string-base64%22:%5B15,16,15,16,16%5D,%22string-fasta%22:%5B71,73,82,73,73%5D,%22string-tagcloud%22:%5B87,105,98,98,91%5D,%22string-unpack-code%22:%5B154,153,162,165,149%5D,%22string-validate-input%22:%5B50,57,52,51,50%5D%7D

Ted-PGO:
http://www2.webkit.org/perf/sunspider-0.9/sunspider-results.html?%7B%223d-cube%22:%5B52,54,53,56,53%5D,%223d-morph%22:%5B28,28,28,28,29%5D,%223d-raytrace%22:%5B83,83,89,68,85%5D,%22access-binary-trees%22:%5B46,43,42,42,43%5D,%22access-fannkuch%22:%5B71,66,61,61,63%5D,%22access-nbody%22:%5B30,31,30,30,31%5D,%22access-nsieve%22:%5B12,14,12,12,15%5D,%22bitops-3bit-bits-in-byte%22:%5B2,2,3,2,2%5D,%22bitops-bits-in-byte%22:%5B14,14,14,14,14%5D,%22bitops-bitwise-and%22:%5B3,3,3,3,3%5D,%22bitops-nsieve-bits%22:%5B25,25,25,25,25%5D,%22controlflow-recursive%22:%5B43,41,42,42,43%5D,%22crypto-aes%22:%5B41,34,35,42,35%5D,%22crypto-md5%22:%5B15,15,14,14,14%5D,%22crypto-sha1%22:%5B6,5,6,6,6%5D,%22date-format-tofte%22:%5B120,122,119,123,119%5D,%22date-format-xparb%22:%5B83,79,79,81,80%5D,%22math-cordic%22:%5B27,27,27,27,27%5D,%22math-partial-sums%22:%5B19,18,18,18,18%5D,%22math-spectral-norm%22:%5B9,9,10,9,10%5D,%22regexp-dna%22:%5B92,100,101,105,104%5D,%22string-base64%22:%5B16,16,16,16,17%5D,%22string-fasta%22:%5B72,72,70,73,74%5D,%22string-tagcloud%22:%5B99,96,95,94,101%5D,%22string-unpack-code%22:%5B175,176,175,149,148%5D,%22string-validate-input%22:%5B49,49,49,48,52%5D%7D
Thanks for testing this! Since you already have the build, could you test on some other benchmarks vs. a comparable nightly? That patch throws in a bit of CSS loading etc.
OK, here are some various benchmarks.

JPG Loader is a home-brew test I created back when working on bugs related to speeding up JPG loading. Not surprisingly, since nothing in the PGO pass really uses JPGs, there are no real changes worth mentioning.

JPG Loader (s)
	Run1	Run2	Run3	Run4	Run5	Avg	%Diff
No PGO	2.325	2.317	2.323	2.311	2.329	2.321	0.00%
Std PGO	2.276	2.284	2.337	2.35	2.307	2.311	0.43%
Ted PGO	2.333	2.287	2.291	2.289	2.365	2.313	0.34%

CSS Loader is a test I found here: http://www.howtocreate.co.uk/csstest.html
From the site - "The test measures the time it takes the browser to render a page consisting of almost 2500 positioned DIVs." This test shows a significant speedup with the new PGO pageset.

CSS Loader (ms)
	R1	R2	R3	R4	R5	R6	R7	R8	R9	R10	Avg	%Diff
No PGO	183	196	137	175	193	178	197	140	199	140	173.8	0.00%
Std PGO	177	189	180	162	142	203	179	177	197	171	177.7	-2.24%
Ted PGO	143	162	163	155	157	116	142	115	117	144	141.4	18.64%

I bootlegged a copy of Tp2 from an unnamed source once upon a time. The number shown below is the sum of the mean values of all 40 pages for each run. Nice win! (Though I can't explain the apparent loss going from no PGO to the standard trunk PGO.)

Tp2 (ms)
	Run1	Run2	Run3	Run4	Run5	Avg	%Diff
No PGO	3294	3288	3290	3297	3288	3291	0.00%
Std PGO	3423	3404	3405	3433	3429	3419	-3.87%
Ted PGO	3068	3029	3078	3038	3052	3053	7.24%

Looks good overall!
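
The Avg and %Diff columns in the tables above appear to follow (baseline - average) / baseline, with the No PGO row as the baseline, so a positive value means faster. Below is a minimal sketch that recomputes both columns for the Tp2 table. The function name `percent_diff` is hypothetical, and the formula is inferred from the numbers rather than stated in the comment.

```python
def percent_diff(baseline_runs, runs):
    """Percent improvement of the mean of `runs` over the baseline mean.

    Positive means faster (lower time) than the baseline.
    """
    baseline = sum(baseline_runs) / len(baseline_runs)
    avg = sum(runs) / len(runs)
    return (baseline - avg) / baseline * 100.0


# Tp2 timings (ms) copied from the table above.
tp2 = {
    "No PGO": [3294, 3288, 3290, 3297, 3288],
    "Std PGO": [3423, 3404, 3405, 3433, 3429],
    "Ted PGO": [3068, 3029, 3078, 3038, 3052],
}

for name, runs in tp2.items():
    avg = sum(runs) / len(runs)
    diff = percent_diff(tp2["No PGO"], runs)
    print(f"{name}\t{avg:.0f}\t{diff:.2f}%")
```

The same function applied to the CSS Loader rows gives -2.24% and 18.64%, matching that table as well.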
Comment on attachment 356168 [details] [diff] [review]
sayrer's pgo input patch, updated to m-c
[Checkin: See comment 11 & 18+20]

From Ryan's testing, sounds like this is good. I know it looks like I'm r+ing my own patch, but it's sayrer's patch, honestly, I just merged it to m-c tip.
Attachment #356168 - Flags: review+
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/167b82ee7162

I forgot I had to update the other two consumers of automation.py to cope with changing the return value of runApp, so I did that and also tweaked some Makefile indentation and license headers while I was at it.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Flags: wanted1.9.1+
Flags: blocking1.9.1?
Flags: blocking1.9.1-
Comment on attachment 356168 [details] [diff] [review]
sayrer's pgo input patch, updated to m-c
[Checkin: See comment 11 & 18+20]

Can we land this on 1.9.1 with the other JS PGO patches that got approval for a nice low-risk perf win?
Attachment #356168 - Flags: approval1.9.1?
(In reply to comment #12)
> (From update of attachment 356168 [details] [diff] [review])
> Can we land this on 1.9.1 with the other JS PGO patches that got approval for a
> nice low-risk perf win?

Not sure this is low risk.  It appears this has resulted in an issue regarding yahoo slideshows.

http://news.yahoo.com/nphotos/Photo-Highlight/ss/441

Does not advance past slide 3 under Windows builds.  I can't reproduce this on my own builds, so I am assuming a PGO issue.

The regression range has been identified as:

http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2009-01-13+03%3A00%3A00&enddate=2009-01-13+06%3A00%3A00

I fail to see how this could be a places bug, and this checkin is the only other one in the range.

I suspect this somehow triggered some latent issue.
Depends on: 475178
Yahoo News issue --> bug 475178
Attachment #356168 - Flags: approval1.9.1? → approval1.9.1-
Comment on attachment 356168 [details] [diff] [review]
sayrer's pgo input patch, updated to m-c
[Checkin: See comment 11 & 18+20]

We can't take this on 1.9.1 without sorting out the regression first.
Blocks: 428009
Target Milestone: --- → mozilla1.9.2a1
Depends on: 480077
(In reply to comment #15)
> We can't take this on 1.9.1 without sorting out the regression first.

This is needed for the diff context of bug 476163...

Note that this patch has bitrotted after bug 470963, which landed before this one on 1.9.1 :-/
Whiteboard: [needs 1.9.1 landing]
Yeah, it's not a big deal. I have an updated copy of the patch from bug 476163 locally that will apply without this. We should re-ask for approval for this since I fixed the regression, but we'll have to get that landed first.
Whiteboard: [needs 1.9.1 landing]
Comment on attachment 356168 [details] [diff] [review]
sayrer's pgo input patch, updated to m-c
[Checkin: See comment 11 & 18+20]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/b3dafbeb0e99
Attachment #356168 - Attachment description: sayrer's pgo input patch, updated to m-c → sayrer's pgo input patch, updated to m-c [Checkin: See comment 11 & 18]
Whiteboard: [needs 1.9.1 landing, pgo part: needs unbitrotted patch, needs approval] [fixed1.9.1b4]
(In reply to comment #18)
> http://hg.mozilla.org/releases/mozilla-1.9.1/rev/b3dafbeb0e99

(Bv1-191) |automation.runApp()| fix only
Attachment #356168 - Attachment description: sayrer's pgo input patch, updated to m-c [Checkin: See comment 11 & 18] → sayrer's pgo input patch, updated to m-c [Checkin: See comment 11 & 18+20]
Comment on attachment 356168 [details] [diff] [review]
sayrer's pgo input patch, updated to m-c
[Checkin: See comment 11 & 18+20]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/6ec4b2a3a81f
(Cv1-191) fix bad merge in changeset b3dafbeb0e99
It looks like only part of this landed on 1.9.1, and not the real fix. Isn't this ready to go in?
Flags: blocking1.9.1- → blocking1.9.1?
This was marked as not blocking.  Unless you have a new reason as to why this should block, please do not re-request blocking.
Flags: blocking1.9.1?
Product: Core → Firefox Build System