Closed Bug 631581 Opened 13 years ago Closed 13 years ago

JM: profile jitcode memory history and usage

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 631951

People

(Reporter: dmandelin, Assigned: dmandelin)

Attachments

(1 file, 2 obsolete files)

Attached patch WIP v1 (obsolete) — Splinter Review
I'm working on a patch to profile jitcode memory. Specifically, I want to be able to measure, for every script that the methodjit compiles:

 - script identity
 - how much memory is used, for code and metadata
 - for what time interval that memory is used
 - the max # of times any basic block in that script runs

This will give us a start on understanding how many scripts there are in JS-heavy websites, where they come from, how much we pay to compile them, and how much we get out of that.
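
As a rough illustration of the per-script measurements listed above, here is a minimal sketch of one record per compiled script. The class name, field names, and time units are assumptions made for illustration; they are not taken from the attached patch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScriptProfile:
    """One record per methodjit-compiled script (hypothetical layout)."""
    script_id: str                       # script identity, e.g. "file:line"
    code_bytes: int                      # memory used for generated code
    data_bytes: int                      # memory used for metadata
    compiled_at: float                   # start of the interval the memory is live
    released_at: Optional[float] = None  # end of the interval; None while still live
    max_block_runs: int = 0              # max # of times any basic block ran

    @property
    def total_bytes(self) -> int:
        """Code plus metadata bytes for this script."""
        return self.code_bytes + self.data_bytes

    def lifetime(self, now: float) -> float:
        """Length of the interval the memory has been in use, up to `now`."""
        end = self.released_at if self.released_at is not None else now
        return end - self.compiled_at
```

Keeping code and metadata bytes separate makes it possible to report them separately as well as summed.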
Assignee: general → dmandelin
Attached patch WIP v2 (add new files) (obsolete) — Splinter Review
Attachment #509811 - Attachment is obsolete: true
Attached patch WIP v3 — Splinter Review
This has the basic functionality in place. Key limitations:

 - you can't understand the output unless you read dump() inside Profiler.cpp
 - doesn't track PICs
 - doesn't know that a function can be compiled as both normal and ctor

Next step: run on Quora and write scripts to analyze the data.
Attachment #509812 - Attachment is obsolete: true
Summary: JM: profile jitcode memory history and usage. → JM: profile jitcode memory history and usage
Full results for opening up Quora and logging in here:

https://spreadsheets.google.com/ccc?key=0AsEOzaxftycIdGRlM2hHMzBTRGI0b3JIZkduaEVDVFE&hl=en&authkey=CITopowE

* Summary Statistics:

873 scripts were compiled into 3.8 MB code + 1.5 MB data = 5.4 MB. The average per script is about 6.2 KB of code+data. The largest script compiled into 512 KB code+data.

There were 165k calls into compiled scripts, which ran 178k loop iterations in total (a script that doesn't run a loop is still counted as making one iteration). That is only about 1.08 iterations per call, so on average scripts don't run loops. The average number of calls per script was 189, and the average number of loop iterations per script was 203.

50% of scripts were called only once. Similarly, 47% of scripts ran only one iteration (ever). Only 9% of scripts ran at least 100 iterations.
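
The averages above can be rechecked from the rounded totals. This is only a sketch using the rounded figures quoted in this comment (165k, 178k, 873), so the last digit can differ from what the full data set gives.

```python
# Rounded totals as quoted in the comment; the real data is in the spreadsheet.
scripts = 873
calls = 165_000
iterations = 178_000

calls_per_script = calls / scripts       # about 189
iters_per_script = iterations / scripts  # about 204 from rounded totals (reported: 203)
iters_per_call = iterations / calls      # about 1.08

print(f"calls/script: {calls_per_script:.0f}")
print(f"iterations/script: {iters_per_script:.0f}")
print(f"iterations/call: {iters_per_call:.2f}")
```

An iterations-per-call ratio this close to 1 is what supports the claim that the typical call does not run a loop, since a loopless call is counted as exactly one iteration.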

* Analysis

The main thing I wanted to learn in this first round was how much code we need to compile to get good performance. The "Iters vs Size" graph in the spreadsheet answers this question graphically.

*** One point of interest is (0.2, 0.99), which means that we need to compile only scripts making up 1/5 of the code+data bytes in order to run 99% of iterations as compiled code. That's great! That means we may be able to cut compiled code size by up to 5x by compiling only functions with hot code, with almost no speed cost. ***

Another way to look at this question is to ask how much code needs to be compiled in order to compile all scripts that run over N iterations. For N = 50, it is 13% of code. For N = 100, 9%, and for N = 250, 5%. Again, plenty of room for reduction.
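
A curve like "Iters vs Size" can be sketched as follows. This assumes scripts are ordered hottest first and that each script is described by a (total bytes, iterations) pair; the spreadsheet may be built differently, and the toy data below is made up purely to show the shape of the computation.

```python
def iters_vs_size(scripts):
    """scripts: list of (total_bytes, iterations) pairs.

    Returns cumulative (size_fraction, iteration_fraction) points,
    adding scripts in order of decreasing iteration count.
    """
    total_bytes = sum(b for b, _ in scripts)
    total_iters = sum(i for _, i in scripts)
    points = []
    size = iters = 0
    for b, i in sorted(scripts, key=lambda s: s[1], reverse=True):
        size += b
        iters += i
        points.append((size / total_bytes, iters / total_iters))
    return points


def size_fraction_over(scripts, n):
    """Fraction of code+data bytes in scripts that ran more than n iterations."""
    total_bytes = sum(b for b, _ in scripts)
    return sum(b for b, i in scripts if i > n) / total_bytes


# Made-up data: five equal-sized scripts, one of them hot.
toy = [(1000, 990), (1000, 5), (1000, 3), (1000, 1), (1000, 1)]
curve = iters_vs_size(toy)
hot_fraction = size_fraction_over(toy, 50)
```

In the toy data the first point of `curve` is (0.2, 0.99), the same shape as the point of interest above: one fifth of the bytes covers 99% of the iterations.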
I did a little further analysis by assuming that it takes time |t| to run one iteration in the methodjit, |4t| to run one iteration in the interpreter, and |500t| to compile one script. For those parameters and this data set, and a simple strategy of compiling a script after it runs N loop iterations (counting entry as an iteration if the script doesn't run a loop), N = 69 minimizes total run time:

     N    total time                           code+data size fraction
 -----    ----------                           -----------------------
     0           611 (* 1000t)                                   1.000
     1           410                                              .440
    69           263                                              .105
   109           266 [1% worse than optimal]                      .092
   500           304                                              .037
     ∞           710                                              .000
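
The cost model behind this table can be sketched like this. It is one reading of the description above (the first N iterations of a script are interpreted, then the script is compiled once and the rest run as jitcode), not the actual analysis script, and the function names are made up.

```python
import math

INTERP_COST = 4     # |4t| per iteration in the interpreter
JIT_COST = 1        # |t| per iteration in the methodjit
COMPILE_COST = 500  # |500t| to compile one script


def total_time(iter_counts, n):
    """Total run time, in units of t, when compiling after n iterations.

    iter_counts holds the lifetime iteration count of each script.
    """
    time = 0
    for iters in iter_counts:
        if iters <= n:
            # Never reaches the threshold: interpreted for its whole life.
            time += INTERP_COST * iters
        else:
            # Interpret n iterations, compile once, run the rest compiled.
            time += INTERP_COST * n + COMPILE_COST + JIT_COST * (iters - n)
    return time


def best_threshold(iter_counts, candidates):
    """Candidate threshold with the lowest total time."""
    return min(candidates, key=lambda n: total_time(iter_counts, n))


# Made-up iteration counts, just to exercise the model.
toy_counts = [1, 1, 10, 1000]
best = best_threshold(toy_counts, [0, 1, 100, math.inf])
```

As a sanity check against the summary numbers, the two extremes need only the totals: N = 0 costs 873 * 500t + 178k * t, about 615k t, and N = ∞ costs 4t * 178k, about 712k t. Both are close to the 611 and 710 rows above, with the gap explained by rounding in the 178k figure.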

There are all kinds of simplifying assumptions here, and this is a sample of just one site, but I'd bet this analysis is good enough to point in the right direction. *Assuming these results apply broadly to the web*, we can conclude:

1. Compiling loopless methods on the second call, instead of immediately, cuts memory usage 2x and improves speed 1.5x. (That 1.5x probably amounts to something like 3ms/click, so don't get *too* excited. I bet it could at least measurably improve pageload on some sites, if not necessarily perceptibly.)

2. Compiling methods after 100 iterations or so cuts memory usage 10x and improves speed 2x.
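
The 2x/1.5x and 10x/2x figures follow from the table rows, taking N = 0 (compile everything immediately) as the baseline. A quick check, using only numbers from the table; the N = 109 row stands in for "100 iterations or so":

```python
# N -> (total time in 1000t, code+data size fraction), copied from the table.
table = {
    0: (611, 1.000),
    1: (410, 0.440),
    69: (263, 0.105),
    109: (266, 0.092),
    500: (304, 0.037),
}


def improvement(n):
    """(speedup, memory reduction) of threshold n relative to N = 0."""
    base_time, base_size = table[0]
    time, size = table[n]
    return base_time / time, base_size / size


for n in (1, 109):
    speedup, shrink = improvement(n)
    print(f"N={n}: {speedup:.2f}x faster, {shrink:.2f}x less code+data")
```

This gives roughly 1.49x and 2.27x for N = 1, and roughly 2.30x and 10.87x for N = 109, which the two conclusions round to 1.5x/2x and 2x/10x.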

I might look at how well the aging algorithm manages to throw this away over time, to see what there is to learn, but these results suggest #1 and #2 here are probably enough to solve all our jitcode memory problems, i.e., make our jitcode use less memory than Chrome's. I'm not sure if we have time/risk to do these for Fx4 but we can debate that in bugs that I'm about to file on those ideas (unless we already have them).
Awesome, awesome work, dmandelin!  Thank you thank you thank you :)

If you want to make a bunch of things smaller, you can shrink each one, or just remove some.  I've been focussing on the former but it looks like the latter is way more effective in this case.
A couple of other thoughts:

- From bug 630072, I got the impression that he thought the quora.com memory bloat was due to something other than the methodjit, eg. a problem with the bfcache.  Does quora.com use a lot of Javascript?  Sounds like it, I guess.  techcrunch.com and www.cad-comics.com/cad/ are two other sites I've been measuring that I know are very JS heavy, and might be worth measuring if you have time.

- If V8 compiles everything, how does it manage to be so space-efficient?  Is its code really compact?  Is it really aggressive throwing code away?
(In reply to comment #6)
> A couple of other thoughts:
> 
> - From bug 630072, I got the impression that he thought the quora.com memory
> bloat was due to something other than the methodjit, eg. a problem with the
> bfcache.  Does quora.com use a lot of Javascript?  Sounds like it, I guess. 

The problem in bug 630072 is that DOMWindows are staying alive. A DOMWindow corresponds to a browser tab, iframe, XUL panel, etc. When a DOMWindow stays alive too long, all related JS objects, scripts, and DOM nodes stay alive too.

So, keeping Quora DOMWindows alive too long carries a stiff penalty, because they are associated with a lot of script. Reducing the amount of code we generate would make the problem in 630072 less noticeable.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE