Closed Bug 548388 Opened 11 years ago Closed 11 years ago

GC Benchmark Suite


(Core :: JavaScript Engine, defect)

(Reporter: sayrer, Assigned: gwagner)


(Blocks 1 open bug)


(Whiteboard: fixed-in-tracemonkey)


(7 files, 4 obsolete files)

We have a bunch of scattered GC benchmarks. Let's consolidate them in one suite.
Assignee: general → gwagner
Assignee: gwagner → anygregor
For motivation, a short excerpt from "JSMeter: Characterizing Real-World Behavior of JavaScript Programs":
Real applications allocate a significant amount of memory, ranging from one to almost twenty megabytes of data, in the relatively short interactions we had with them. As with bytecode execution behavior, google is the most lean of the applications, while bingmap and amazon allocate the most data. Of the real applications that have the most application-like characteristics, bingmap, facebook, gmail, and googlemap, we see that allocating megabytes of data in a short period of time is common.

Benchmarks: The overall allocation of the benchmark programs is highly variable, with many benchmarks hardly allocating any data at all (e.g., richards, deltablue, controlflow, math-cordic, etc.) and others allocating ten or more megabytes (e.g., earley, splay, and regexp). Only six of the benchmarks allocate more data than google, the real application that allocates the least data. The SunSpider benchmarks, in particular, have total allocation behavior that is highly unrepresentative of the real applications, and as a result, performance comparisons based on them will be highly skewed to the performance of code execution without regard to the efficiency of the object representation or memory management.

Lessons: One conclusion that can be reached across both the real applications and the benchmarks is that the only object types of significance are script functions, strings, arrays, and objects. The other types rarely, if ever, contribute substantively to the overall memory allocation of the applications.
Another conclusion we can reach from the real web applications is that many make substantial use of all four major data types, with the mix of types varying between the applications.
Some other points to consider from the paper:

Live Heap Content:
The real applications allocate a diverse collection of strings, functions, objects and arrays with strings being the most short-lived and functions being the most long-lived.
Some real applications have short-lived heaps that are destroyed when one page is unloaded and regenerated when a new page is loaded.
Live heap contents in the benchmarks do not reflect real applications.

Object Allocation Discussion:
• First, we observed that the mix of types allocated by the real applications is much different from that of most of the benchmarks, containing a large quantity of script functions and strings. Objects are less frequently allocated in the real applications, and the lifetime of objects is considerably longer than that of strings in many cases.
• Second, our analysis of the contents of the live heaps suggests that current web applications fall into two categories: those with page transitions that clear the JavaScript heap, and those that do not. In applications that do not have many page transitions, such as gmail, we observe that arrays and objects are relatively long-lived compared to strings. Of applications with many page transitions, such as amazon, by definition almost all objects are short-lived. Such sites do not require sophisticated memory management and would benefit most from a very fast and simple allocator. Being able to predict what class a site falls into and using an appropriate allocator might have performance benefits.
• Finally, in considering object lifetimes, we see that strings are by far the shortest lived types in JavaScript and that functions are commonly long-lived. Except for earley and splay, object lifetimes in the V8 and SunSpider benchmarks are extremely short-lived, suggesting that performance results of these benchmarks will not reliably reflect the effectiveness of the JavaScript engine’s memory management implementation. Even in earley, object lifetimes are significantly shorter than is observed in many of the real web applications, while in splay objects are almost never freed.
Back to this bug...

I want to start with some basic benchmarks that measure a single GC functionality.
- mark performance
- sweep performance
- finalize objects (with and without dslots)
- allocation performance
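Such micro-benchmarks would be plain JS shell scripts. As an illustration (a hypothetical sketch, not one of the attached files), an allocation micro-benchmark can be as simple as timing a loop of short-lived object allocations:

```javascript
// Hypothetical sketch of an allocation micro-benchmark: allocate n
// short-lived objects and report the elapsed wall-clock time. Repeated
// runs expose allocation cost plus the GC pauses it triggers.
function allocBench(n) {
  var start = Date.now();
  for (var i = 0; i < n; i++) {
    var o = { x: i };  // dead on the next iteration, so pure GC load
  }
  return Date.now() - start;  // elapsed milliseconds
}

var elapsed = allocBench(1e5);
```

The harness would run this under varying heap sizes and record per-phase GC times rather than just the total.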

Later I want to add real-workload benchmarks and generational stuff, including benchmarks that simulate page transition and recreation.
Igor posted this benchmark in another bug and I used it for measuring the marking performance.
Measures the time to create, traverse and free a big object graph.
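Igor's actual attachment isn't reproduced here, but the create/traverse/free pattern such a benchmark uses might be sketched like this (all names hypothetical):

```javascript
// Build a large linked structure: the traversal touches every node
// (approximating mark load), and dropping the only root makes the whole
// graph garbage so the next GC must sweep it all.
function buildGraph(n) {
  var head = null;
  for (var i = 0; i < n; i++)
    head = { next: head, payload: i };
  return head;
}

function traverse(node) {
  var count = 0;
  while (node) { count++; node = node.next; }
  return count;
}

var root = buildGraph(1e5);
var visited = traverse(root);
root = null;  // graph is now unreachable; the next GC frees it
```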
Attached file DSlot Benchmark
This benchmark measures the overhead of allocating and deallocating objects with and without dslots. 1e5 objects are created and 0-5 properties are set.
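The shape of that benchmark can be sketched as follows (a hypothetical reconstruction, not the attachment itself): in the SpiderMonkey of that era, properties beyond the inline slots forced a separate dslots allocation, so finalization cost grows with the property count.

```javascript
// Create n objects, each with the given number of properties, so that
// property counts above the inline-slot capacity exercise the dslots
// allocation and finalization paths.
function makeObjects(n, props) {
  var objs = new Array(n);
  for (var i = 0; i < n; i++) {
    var o = {};
    for (var p = 0; p < props; p++)
      o["p" + p] = p;
    objs[i] = o;
  }
  return objs;
}

for (var props = 0; props <= 5; props++) {
  var objs = makeObjects(1e5, props);
  objs = null;  // make the whole batch collectable before the next round
}
```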
Attached image Dslot GC Graph
The GC Graph shows the overhead of dslots deallocation and that the GC pause is mainly caused by the object finalization. 
The finalization of objects with only 1-3 slots set is still very expensive.
I just saw that the clock benchmark is no longer online. Does anybody know who wrote this benchmark? The URL was:
Attached file Clock GC benchmark
(In reply to comment #7)
> I just saw that the clock benchmark is no longer online. Does anybody know who
> wrote this benchmark? The URL was:

Heh, turns out I wrote it. I put it in the attachment.
Attachment #434576 - Attachment mime type: text/plain → text/html
Attached image Clock GC Graph Tip
The internals of the GC for the Clock benchmark. The GC pause is dominated by Object finalization and Chunk destruction.
I tried to get an idea of the scalability of the current GC approach.
I opened ca. 30 tabs with popular websites, including benchmark pages with high allocation throughput.
What I see in the graph is that long-lived objects combined with high throughput lead to a GC pause-time explosion.
A generational GC looks like a good solution for this problem.

Furthermore, object finalization is also a very significant factor.
I know it's not that easy, but others have solved it with lazy finalization.

Looks like an optimal GC for the web needs these two features.
Attached patch first draft (obsolete) — Splinter Review
A first try at how it might look...
I am still not sure how the output should look.
What are meaningful things to measure? Right now I calculate the max and mean of the total, marking, and sweep time.

A sample output looks like:

Total max: '69.275696'
Total mean: '64.365382'
Mark max: '0.832184'
Mark mean: '0.711289'
Sweep max: '68.342440'
Sweep mean: '63.494899'
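The statistics themselves are simple: for each phase, take the max and mean over the per-GC pause samples. A sketch (function name hypothetical, assuming non-negative pause times):

```javascript
// Reduce a list of per-GC pause samples (in ms) for one phase to the
// two numbers the harness reports: the worst pause and the average.
function stats(samples) {
  var max = 0, sum = 0;  // pauses are non-negative, so 0 is a safe floor
  for (var i = 0; i < samples.length; i++) {
    if (samples[i] > max) max = samples[i];
    sum += samples[i];
  }
  return { max: max, mean: sum / samples.length };
}
```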

I still have to rewrite the benchmark files, and yeah, it's my first Python code...
Attached patch update (obsolete) — Splinter Review
now with JSON output.
Attachment #442507 - Attachment is obsolete: true
A sample output looks like:
   clock.js: {"Total max": 69.5, "Total mean": 65.1, "Mark max":  1.0, "Mark mean":  0.7, "Sweep max": 68.4, "Sweep mean": 64.1}
  dslots.js: {"Total max": 61.9, "Total mean": 32.1, "Mark max":  0.1, "Mark mean":  0.1, "Sweep max": 61.6, "Sweep mean": 31.8}
   empty.js: {"Total max":  0.5, "Total mean":  0.2, "Mark max":  0.1, "Mark mean":  0.0, "Sweep max":  0.1, "Sweep mean":  0.1}
objGraph.js: {"Total max": 60.0, "Total mean": 32.9, "Mark max": 46.8, "Mark mean": 14.9, "Sweep max": 59.7, "Sweep mean": 17.8}
Attached patch update (obsolete) — Splinter Review
now with comparison mode...

                      loops.js: faster:  48.03 < baseline  48.30 ( -0.56%)
                   objGraph.js: faster:  32.01 < baseline  32.10 ( -0.29%)
                      clock.js: faster:  64.84 < baseline  65.00 ( -0.25%)
                     dslots.js: SLOWER:  32.01 > baseline  32.00 ( +0.04%)
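The percentages in comparison mode reduce to a relative difference against the baseline mean, e.g. for loops.js: (48.03 - 48.30) / 48.30 * 100 ≈ -0.56%. A sketch (function name hypothetical):

```javascript
// Percent change of a run's mean pause time versus the baseline;
// negative means the run was faster than the baseline.
function percentChange(current, baseline) {
  return (current - baseline) / baseline * 100;
}
```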
Attachment #442785 - Attachment is obsolete: true
Attached patch update (obsolete) — Splinter Review
The basic framework. More benchmarks to come.
Attachment #442864 - Attachment is obsolete: true
Attachment #443271 - Flags: review?(jorendorff)
jorendorff: any comments? Is that what you expected, or should it look completely different?
I might have to focus more on micro-benchmarks that measure just one thing, like marking time or sweep time, because I calculate based on the average GC pause time.
I am also working on getting real web examples. I'm trying to get a tool that can reproduce the JS stuff offline.
Also, the average GC pause time might not be the best thing to measure, but the longest pause time is too noisy and the shortest pause time is mostly just an "empty" GC.
Attachment #443271 - Flags: review?(jorendorff) → review+
dmandelin wrote in an email:
Since these are measurements, not tests, I would suggest something like
js/src/metrics/gc. Hopefully we can grow more types of measurements. We
could also consider moving the 't' and 'v8' directories there.

I will move and push it today.
Attached patch update — Splinter Review
moved suite into metrics/gc.
Attachment #443271 - Attachment is obsolete: true
Whiteboard: fixed-in-tracemonkey
Closed: 11 years ago
Resolution: --- → FIXED