Closed Bug 1357180 Opened 7 years ago Closed 5 years ago

[meta] Get some real world data on what JS built-ins web pages use in the wild

(Core :: JavaScript Engine, enhancement, P3)

Performance Impact ?

(Reporter: ehsan.akhgari, Assigned: djvj)

(Blocks 1 open bug)

(Keywords: meta, perf, triage-deferred)

(3 files, 7 obsolete files)
The V8 team has done some great work collecting real-world data on which JS built-ins web pages use in the wild.  They used this data to sort the built-ins by order of importance; for each one, they examined the built-in's performance in great detail and optimized it as much as possible, and they claim massive improvements on real-world pages from this approach.

It would be nice if we could build similar instrumentation.  Once we have it, running Firefox on these pages, getting it to dump the statistics somewhere, and collecting them all in one place should be relatively easy.

Slide 32 of this deck lists the top 8 built-ins they found at the time they ran their analysis: <>  We should be able to start some optimization work based on this information even now, before we have our own measurements.

CCing Jan and Ted to see what they think about this idea and if they have seen it before.
We are already identifying and handling cases similar to these in our current process (e.g. Object.hasOwnProperty and the |in| keyword are recently landed/in progress). I think Jan and :evilpie can give a clearer picture of the methodologies they use to identify real-world cases (I'm more on the fixing side than the identifying side).

An example of a related strategy we already use is the IONFLAGS=cacheir-log mechanism that Tom added. It identified a variety of cases where we failed to install an IC, due to exotic objects we generate, weird user monkey-patching, and basic oversights on our part. I believe Tom used this in full browser builds on real-world websites as well as on things like Speedometer.

The built-ins use count sounds like an easy strategy to add to our repertoire.
*handling /some/ cases
Whiteboard: [qf]
Whiteboard: [qf] → [qf:investigate:p1]
Assignee: nobody → kvijayan
Attached patch measure-intrinsics.patch (obsolete) — Splinter Review
Really rough initial patch to measure intrinsics' time using rdtsc.  Has a bug where it doesn't always close its start/end markers, and some entries are printed with no name.
Attached patch measure-intrinsics.patch (obsolete) — Splinter Review
Updated measure-intrinsics patch.  I haven't tested this yet, but I have to put up patches I haven't tested because my machine crashes a lot now and I can't guarantee I'll get through a build without my VM corrupting :(
Attachment #8866536 - Attachment is obsolete: true
Updated measure-intrinsics patch that actually works.
Attachment #8867314 - Attachment is obsolete: true
Attached file (obsolete) —
Python script to parse the measurements, sum them up, order them, and print them.
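A minimal sketch of what such a parse-sum-sort script might look like. This is a reconstruction, not the attached script: the log line format `<name> <cycles>` and the function name `summarize` are assumptions, since the patch's actual output format isn't shown here.

```python
from collections import defaultdict

def summarize(lines):
    """Sum per-intrinsic cycle counts from lines of the (assumed) form
    '<name> <cycles>' and return (name, total) pairs, largest first."""
    totals = defaultdict(int)
    for line in lines:
        parts = line.split()
        # Skip malformed or unnamed entries rather than crash on them.
        if len(parts) == 2 and parts[1].isdigit():
            totals[parts[0]] += int(parts[1])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Feeding it the measurement log line by line yields the intrinsics ranked by total time spent in them.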
Attached patch updates.patch (obsolete) — Splinter Review
Partial updates patch instrumenting the Array and String builtins.  Doesn't build; just saving it so I don't lose it if my VM goes corrupt like it did last week.
Attached patch updates.patchSplinter Review
New updates patch that builds and seems to work.
Attachment #8868279 - Attachment is obsolete: true
Attached file (obsolete) —
Updated measurement analysis script.
Attachment #8867368 - Attachment is obsolete: true
Attached patch measure-intrinsics.patch (obsolete) — Splinter Review
Updated measurement analysis script.

Ok, so this thing uses a bunch of intuitive measures to try to throw out outliers.  I don't think there's a fantastic way to do this in any case, so here's an outline of the script's basic reasoning:

1. First, we construct a histogram with 50 regular buckets spanning [min, max].

2. We choose the "weakest fulcrum point" on this histogram as the cutoff.

3. To calculate the weakness at a given fulcrum point, we take the product of the "weight" on either side of the fulcrum and divide it by the "strength" at the fulcrum.

4. The "weight" on one side of the fulcrum is the sum, over the buckets on that side, of each bucket's count times the fourth power of its distance from the fulcrum: |weight[i] = hist[i] * dist(fulcrum, i)^4|

5. The "strength" at the fulcrum is just the fourth power of the count in the fulcrum bucket.

These values were chosen by eyeballing their outcomes on a visual inspection of a log-scale display of the histogram.
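The fulcrum heuristic above can be sketched in Python. This is a reconstruction from the five steps as described, not the attached script; the name `weakest_fulcrum` and the zero-count clamp are mine.

```python
def weakest_fulcrum(hist):
    """Return the bucket index with the highest weakness score:
    score = left_weight * right_weight / strength, where each weight
    sums bucket counts times the fourth power of their distance from
    the fulcrum, and strength is the fourth power of the fulcrum
    bucket's own count."""
    best_index, best_score = None, -1.0
    for f in range(1, len(hist) - 1):
        left = sum(hist[i] * (f - i) ** 4 for i in range(f))
        right = sum(hist[i] * (i - f) ** 4 for i in range(f + 1, len(hist)))
        strength = max(hist[f], 1) ** 4  # clamp so empty buckets don't divide by zero
        score = (left * right) / strength
        if score > best_score:
            best_index, best_score = f, score
    return best_index
```

A bucket with almost no samples but substantial mass on both sides scores highest, so the cutoff tends to land in the "valley" between the main distribution and a cluster of outliers.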

I'm doing this by eyeball analysis because it's really hard to come up with a formal way of teasing out actual "cliffs" in the data, given the spurious numbers the measurement system produces.

Let's hope an actual statistician never looks at this, because I'm pretty sure they would have legal cover and professional blessing to murder me over it.  If any of you are reading: I apologize in advance to your honourable profession.
Attachment #8868399 - Attachment is obsolete: true
Attached file (obsolete) —
Wrong file.  Fixing.
Attachment #8868601 - Attachment is obsolete: true
Updated analysis script.  I got rid of the old convoluted logic and replaced it with the following: remove the longest tail-sequence of buckets that have fewer than 10 values.
Attachment #8868632 - Attachment is obsolete: true
Whiteboard: [qf:investigate:p1] → [qf:p1]
Changing this to [qf:meta].  These scripts need to be maintained and updated to avoid bitrot, but the results and follow-up work are being done on bug 1365361.
Whiteboard: [qf:p1] → [qf:meta]
Keywords: triage-deferred
Priority: -- → P3
Keywords: perf
Closed: 5 years ago
Resolution: --- → FIXED
Performance Impact: --- → ?
Keywords: meta
Whiteboard: [qf:meta]
Summary: Get some real world data on what JS built-ins web pages use in the wild → [meta] Get some real world data on what JS built-ins web pages use in the wild