Closed Bug 417052 Opened 17 years ago Closed 17 years ago

Analyze quality of JS_MaybeGC heuristic

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla1.9

People

(Reporter: jorendorff, Assigned: igor)

References

Details

(Keywords: perf)

Attachments

(13 files, 5 obsolete files)

patch showing what I'm tuning 17 years ago Jason Orendorff [:jorendorff] 940 bytes, patch		Details \| Diff \| Splinter Review
SunSpider results 17 years ago Jason Orendorff [:jorendorff] 13.63 KB, text/plain		Details
Memory use while loading 10 cycles of 30 page pageset (~300 page loads) 17 years ago Mike Schroepfer 82.68 KB, image/png		Details
massif graph for 3 sunspider laps, trunk 17 years ago Robert Sayre 369.32 KB, image/png		Details
massif graph for 3 sunspider laps, trunk 17 years ago Robert Sayre 157.84 KB, image/png		Details
massif graph for 3 sunspider laps, constant=2 17 years ago Robert Sayre 163.37 KB, image/png		Details
massif graph for 3 sunspider laps, constant=3 17 years ago Robert Sayre 166.68 KB, image/png		Details
massif graph for 3 sunspider laps, constant=8 17 years ago Robert Sayre 130.39 KB, image/png		Details
massif graph for 3 sunspider laps, constant=16 17 years ago Robert Sayre 145.35 KB, image/png		Details
massif data from gmail / techcrunch session -- constant=1.33 17 years ago Robert Sayre 99.58 KB, image/png		Details
massif data from gmail / techcrunch session -- constant=2 17 years ago Robert Sayre 106.14 KB, image/png		Details
massif data from gmail / techcrunch session -- constant=3 17 years ago Robert Sayre 101.40 KB, image/png		Details
massif data from gmail / techcrunch session -- constant=8 17 years ago Robert Sayre 111.38 KB, image/png		Details
massif data from gmail / techcrunch session -- constant=16 17 years ago Robert Sayre 110.50 KB, image/png		Details
v1 17 years ago Jason Orendorff [:jorendorff] 3.39 KB, patch	crowderbt : review+	Details \| Diff \| Splinter Review
v2 - fix overflow 17 years ago Jason Orendorff [:jorendorff] 3.39 KB, patch	crowderbt : review+	Details \| Diff \| Splinter Review
v3 - parameterized 17 years ago Jason Orendorff [:jorendorff] 6.23 KB, patch		Details \| Diff \| Splinter Review
custom MaybeGC for dom 17 years ago Igor Bukanov 2.35 KB, patch	brendan : review+ brendan : approval1.9+	Details \| Diff \| Splinter Review

Jason Orendorff [:jorendorff]

Reporter

Description

•

17 years ago

<shaver> when running some of the sunspider tests in-browser, we trigger GC bunch <shaver> 3d-cube is a good example, and access-nbody <shaver> we would like to know what conditions in MaybeGC are tripping the GC <shaver> and what the disposition of the garbage is

Mike Schroepfer

Comment 1

•

17 years ago

10:29 schrep do you think there are things we can improve r.e. gc timing? 10:30 igor Yes: it just a matter of retuning the heuristics. 10:30 igor It can take time, but is definitely fixable. 10:31 igor There are number in js_MaybeGC that one can definitely play with and the last ditch scheduling in js_NewGCThing can be altered as well. 10:31 igor I meant the last ditch *GC* 10:31 schrep yea sweet 10:32 schrep conceptually can we just run GC less often? 10:33 igor The problem now is that we run GC with very little garbage. 10:33 igor So yes, we can and should run it less often. 10:33 schrep is there an easy patch there? 10:34 igor JS_MaybeGC from jsapi.c contains: 10:34 igor bytes > 8192 && bytes > lastBytes + lastBytes / 3) || 10:35 igor Changing that lastBytes/3 into lastBytes * x/ y with different x and y may help. For a start x = y = 1 can be a good case. 10:40 schrep k 10:46 igor There are lengthy explanations about that number. They should be updated in any case as using mmap for GC arena decreased internal heap fragmentation. As such less holes exists on the heap and less space among lastBytes wasted. 10:47 igor => changing the condition to bytes > 8192 && bytes > 2 * lastBytes can be a good start.

Assignee: general → jorendorff

Mike Schroepfer

Comment 2

•

17 years ago

--> Jason can you run with this?

Jason Orendorff [:jorendorff]

Reporter

Comment 3

•

17 years ago

Yes.

Status: NEW → ASSIGNED

Jason Orendorff [:jorendorff]

Reporter

Comment 4

•

17 years ago

The higher I push the constant, the faster it goes. GC final speed---------------------------------- constant cycles heap size overall binary-trees nbody ======== ======= ========= ============ ============ =========== 1.33 552 1433600 (baseline) (baseline) (baseline) 2 244 1236992 10.0% faster 60% faster 58% faster 3 123 1224704 13.2% faster 57% faster 72% faster 8 29 1331200 16.6% faster 53% faster 102% faster I'll attach a file containing SunSpider URLs in a second. constant -- the heuristic constant in JS_MaybeGC. GC cycles -- the number of times a call to JS_MaybeGC resulted in an actual collection during the test run final heap size -- the value of rt->gcBytes at the end of the tests. This is the size of the SpiderMonkey GC heap; it doesn't include string buffers or anything outside the JS engine (like COM objects reachable from JS objects). overall speed -- the SunSpider ** TOTAL ** number binary-trees, nbody -- Times on the two tests that exercise the GC the most. Increasing the constant from 4/3 to 3 actually *decreases* the final heap size (again, not taking into account string buffers etc.) My guess is that the higher constant gives new objects long enough to become unreachable before GC, thus allowing the GC to reclaim more completely empty pages. Don't know how plausible that is. Pushing the constant to 8 makes the heap size go up again. We're performing GC quite rarely at this point.

Mike Schroepfer

Comment 5

•

17 years ago

(In reply to comment #4) > Pushing the constant to 8 makes the heap size go up again. We're performing GC > quite rarely at this point. > What happens if you take it to 11?

Robert Sayre

Comment 6

•

17 years ago

nbody =========== 102% faster This is faster than I can get this test to run in the shell. We should consider taking a patch with a constant value of 3 while we continue to tune this, to get the noise out of our tests.

Mike Schroepfer

Comment 7

•

17 years ago

(In reply to comment #6) > nbody > =========== > 102% faster > > This is faster than I can get this test to run in the shell. We should consider > taking a patch with a constant value of 3 while we continue to tune this, to > get the noise out of our tests. > Why not 8 - still a smaller heap than the current impl and faster.

Robert Sayre

Comment 8

•

17 years ago

(In reply to comment #7) > > Why not 8 - still a smaller heap than the current impl and faster. Because the heap grew at 8. We'll take the smallest heap for now, and revisit speed vs. memory later, after more analysis and review.

Robert Sayre

Comment 9

•

17 years ago

(In reply to comment #8) > (In reply to comment #7) > > > > Why not 8 - still a smaller heap than the current impl and faster. > > Because the heap grew at 8. We'll take the smallest heap for now, and revisit > speed vs. memory later, after more analysis and review. > Also we have other patches that are going to change the speed number, so we can't make a reasonable tradeoff decision until those are in. Also note that we lose binary-trees speed with fewer cycles. Too much garbage builds up.

Mike Schroepfer

Comment 10

•

17 years ago

(In reply to comment #8) > Because the heap grew at 8. We'll take the smallest heap for now, and revisit > speed vs. memory later, after more analysis and review. > Sounds like a plan.

Jason Orendorff [:jorendorff]

Reporter

Comment 11

•

17 years ago

Whoa, slow down! I think my baseline run was abnormally bad, especially on the memory-intensive tests. I shut everything down, turned off my second monitor, rebooted, and ran the tests again. I'll post numbers early tomorrow. To summarize: there *is* a pretty big speed gain to be had here, maybe 10% on SunSpider overall. I tested with the constant up to 12 and it kept getting faster. The "heap size" numbers in comment 4 are probably bogus though. More to come.

Jason Orendorff [:jorendorff]

Reporter

Comment 12

•

17 years ago

Checking in a constant of 3 or 4 would increase overall memory usage. (My swag is 2-3 MB in the case of starting the browser and running SunSpider once. This is based on numbers produced by "top", though, so take with a grain of salt.)

Robert Sayre

Comment 13

•

17 years ago

> Whoa, slow down! > The "heap size" numbers in comment 4 are probably bogus though. ...sigh. ;)

Robert Sayre

Comment 14

•

17 years ago

jorendorff, Could you post a patch with the exact constant you're tweaking? Not sure which of Igor's suggestions you're exploring. I'd like to test these against Massif and watch heap size over time.

Mike Schroepfer

Comment 15

•

17 years ago

Yea - an in particular you want to watch peak heap size during the test run

Jason Orendorff [:jorendorff]

Reporter

Comment 16

•

17 years ago

Attached patch patch showing what I'm tuning (obsolete) — Details — Splinter Review

Jason Orendorff [:jorendorff]

Reporter

Comment 17

•

17 years ago

Does massif measure what we want to be measuring here? Summary of my new run: constant GC cycles overall speed ======== ========== ============= 1.33 550 (baseline) 1.5 438 +3.9% 2 251 +8.3% 3 126 +11% 4 84 +13% 5 57 +14% 6 42 +15% 8 27 +15% 10 18 +17% 12 10 +19% 16 3 +22% Inf 0 +22% For the last run, I disabled JS_MaybeGC entirely. There's variability between runs, 2-3%.

Brian Crowder

Comment 18

•

17 years ago

Massif will give you lovely pretty pictures of the process' memory-growth over time, which is very handy. Also it will colorize/record allocation call-sites, so you can blame specific chunks of code for what they allocate.

Mike Schroepfer

Comment 19

•

17 years ago

If someone gives me a mac build with 16 as a constant I can do a talos multi-window pageload run as I was doing with Pav to tune out JEMalloc

Jason Orendorff [:jorendorff]

Reporter

Comment 20

•

17 years ago

Attached file SunSpider results — Details

I was equivocating yesterday because I found out that a little variability in the baseline numbers made a big difference in the results. But after running this a lot of times, I think the numbers in comment 17 are actually pretty solid. Attaching raw SunSpider numbers, in case anybody cares... These numbers aren't terribly interesting in isolation. We need the memory usage info to make a decision.

Brian Crowder

Comment 21

•

17 years ago

schrep: When the TryServer is done building it, there will be a shiny Mac build with 16 ready-to-go for you.

Brian Crowder

Comment 22

•

17 years ago

schrep: https://build.mozilla.org/tryserver-builds/2008-02-14_08:42-bcrowder@mozilla.com-forschrep/

Mike Schroepfer

Comment 23

•

17 years ago

Attached image Memory use while loading 10 cycles of 30 page pageset (~300 page loads) — Details

Here's the results with the build brian provided compared to today's trunk doing a pageload run. Short summary - no difference.

Robert Sayre

Comment 24

•

17 years ago

Attached image massif graph for 3 sunspider laps, trunk (obsolete) — Details

Robert Sayre

Updated

•

17 years ago

Attachment #303321 - Attachment is patch: false

Robert Sayre

Updated

•

17 years ago

Attachment #303321 - Attachment mime type: text/plain → image/png

Robert Sayre

Comment 25

•

17 years ago

Comment on attachment 303321 [details] massif graph for 3 sunspider laps, trunk ugh, wrong graph

Attachment #303321 - Attachment is obsolete: true

Robert Sayre

Comment 26

•

17 years ago

Attached image massif graph for 3 sunspider laps, trunk — Details

Mike Schroepfer

Comment 27

•

17 years ago

I am seeing higher terminal memory usage after a SS run. Let me do a few more cycles and see if that stays consistent.

Mike Schroepfer

Comment 28

•

17 years ago

Yep - on SS runs I see higher memory usage - but the total memory allocated does not grow after multiple runs. On my Zimbra, GReader, Zoho tests the builds are within 1%. I'm also seeing a 1.19 improvement on SS times. I think it might be worth it to try this out and see what broader testing reveals. Also - is this possible to make an about:config pref to tune at runtime?

Robert Sayre

Comment 29

•

17 years ago

Attached image massif graph for 3 sunspider laps, constant=2 — Details

Brian Crowder

Comment 30

•

17 years ago

Yeah, we could make this about:configable without too much work.

Igor Bukanov

Assignee

Comment 31

•

17 years ago

(In reply to comment #28) > Also - is this possible to make an > about:config pref to tune at runtime? This suggests to make the ratio dynamic and dd a time-dependent component to the tests. For example, if a script runs, user UI is blocked and it does not make sence to call GC unless there is a lot of garbage. On the other hand, when the system is idle, it is a perfect time to schedule a GC.

Igor Bukanov

Assignee

Comment 32

•

17 years ago

> For example, if a script runs, user UI is blocked and it does not > make sence to call GC unless there is a lot of garbage. I meant from GUI responsiveness point of view since GC cycles just prolongs the pause.

Robert Sayre

Comment 33

•

17 years ago

Attached image massif graph for 3 sunspider laps, constant=3 — Details

Robert Sayre

Comment 34

•

17 years ago

Attached image massif graph for 3 sunspider laps, constant=8 — Details

this one hurts max heap

Brendan Eich [:brendan]

Comment 35

•

17 years ago

Smaug has done work with jst & sicking reviewing, IIRC, on keeping GC out of the user's face, since cycle collection ups the ante. /be

Robert Sayre

Comment 36

•

17 years ago

Comment on attachment 303339 [details] massif graph for 3 sunspider laps, constant=8 pay no attention to the flat ending, I just forgot to close the browser quickly

Igor Bukanov

Assignee

Comment 37

•

17 years ago

(In reply to comment #34) > Created an attachment (id=303339) [details] > massif graph for 3 sunspider laps, constant=8 > > this one hurts max heap The data AFAIC is incomplete. JS uses mmap so to get full data JSRuntime.gcBytes must be included in the graph. Alternatively one can disable mmap when compiling. For that make sure that JS_GC_USE_MMAP define is set to 0 when compiling jsgc.c. >

Mike Schroepfer

Comment 38

•

17 years ago

I think the most important question is how does it impact memory usage during normal browsing sessions...

Robert Sayre

Comment 39

•

17 years ago

Attached image massif graph for 3 sunspider laps, constant=16 — Details

Robert Sayre

Comment 40

•

17 years ago

(In reply to comment #38) > I think the most important question is how does it impact memory usage during > normal browsing sessions... > Oh, I'll get that too.

Robert Sayre

Comment 41

•

17 years ago

Attached image massif data from gmail / techcrunch session -- constant=1.33 — Details

1.) open browser to http://google.com 2.) type "gmail.com" in the location bar 3.) sign into gmail 4.) view inbox 5.) open new window (mozilla default page) 6.) navigate new window to techcrunch.com 7.) close gmail window 8.) quit

Robert Sayre

Updated

•

17 years ago

Attachment #303353 - Attachment description: massif data from gmail / techcrunch session → massif data from gmail / techcrunch session -- constant=1.33

Robert Sayre

Comment 42

•

17 years ago

Attached image massif data from gmail / techcrunch session -- constant=2 — Details

Robert Sayre

Comment 43

•

17 years ago

Attached image massif data from gmail / techcrunch session -- constant=3 — Details

Robert Sayre

Comment 44

•

17 years ago

Attached image massif data from gmail / techcrunch session -- constant=8 — Details

Robert Sayre

Comment 45

•

17 years ago

Attached image massif data from gmail / techcrunch session -- constant=16 — Details

Mike Schroepfer

Comment 46

•

17 years ago

So jason what's left here? I'm thinking we land 16 and see what it does in nightlies. Otherwise we've got other areas where you can help...

Jason Orendorff [:jorendorff]

Reporter

Comment 47

•

17 years ago

Attached patch v1 (obsolete) — Details — Splinter Review

This changes the constant to 16. Kinda gutsy, but sayrer and I didn't see any major ill effects. We'll see if anyone complains.

Attachment #303277 - Attachment is obsolete: true

Attachment #304238 - Flags: review?(crowder)

Brian Crowder

Comment 48

•

17 years ago

Comment on attachment 304238 [details] [diff] [review] v1 Looks good to me based on the testing in this bug.

Attachment #304238 - Flags: review?(crowder)

Attachment #304238 - Flags: review+

Attachment #304238 - Flags: approval1.9?

Igor Bukanov

Assignee

Comment 49

•

17 years ago

Comment on attachment 304238 [details] [diff] [review] v1 >- if ((bytes > 8192 && bytes > lastBytes + lastBytes / 3) || >+ if ((bytes > 8192 && bytes > 16 * lastBytes) || This can overflow with huge heap. Instead use bytes / 16 > lastBytes

Jason Orendorff [:jorendorff]

Reporter

Comment 50

•

17 years ago

Attached patch v2 - fix overflow (obsolete) — Details — Splinter Review

Thanks to Igor for pointing this out.

Attachment #304238 - Attachment is obsolete: true

Attachment #304240 - Flags: review?(crowder)

Attachment #304238 - Flags: approval1.9?

Brian Crowder

Updated

•

17 years ago

Attachment #304240 - Flags: review?(crowder) → review+

Brendan Eich [:brendan]

Comment 51

•

17 years ago

Comment on attachment 304240 [details] [diff] [review] v2 - fix overflow >- if ((bytes > 8192 && bytes > lastBytes + lastBytes / 3) || >+ if ((bytes > 8192 && bytes / 16 > lastBytes) || Strongly suggest again that, for the sake of other embeddings, we parameterize using JS_SetGCParameter. A multiplier and divisor packed into 32 bits, split out into new uint16 JSRuntime members, could do it: if (bytes > 8192 && (bytes * rt->gcMaybeGCMultiplier) / rt->gcMaybeGCDivisor > lastBytes) { ... } /be

Brendan Eich [:brendan]

Comment 52

•

17 years ago

And suggest that the defaults for the new params reflect historical norms that embeddings may have come to count on. /be

Igor Bukanov

Assignee

Comment 53

•

17 years ago

(In reply to comment #51) > Strongly suggest again that, for the sake of other embeddings, we parameterize > using JS_SetGCParameter. A multiplier and divisor packed into 32 bits, split > out into new uint16 JSRuntime members, could do it: This means packing more and more logic into JS_MaybeGC that even in an advanced case may not suite an embedding. Note how XPConnect effectively disables the second condition JS_MaybeGC(), rt->gcMallocBytes >= rt->gcMaxMallocBytes, by setting gcMaxMallocBytes to UINT32_MAX. So why not to allow for the embedding to make the decision for itself and call JS_GC as necessary? For that it is sufficient to expose rt->gc(Last)Bytes.

Brendan Eich [:brendan]

Comment 54

•

17 years ago

The embedding can make any decision it likes about when to call JS_GC, and I agree with you that it would be better for it to make an informed decision, rather than making a wish and calling JS_MaybeGC. JS_MaybeGC has always been a hack at best. However, embeddings may count on it to behave as it has, and fiddling with the scaling of either term in the comparison may cause non-linearly bad space growth from too little maybe-GC'ing, *or* runtime cost from too much maybe-GC'ing. I am pleading for some semblance of API compatibility here. If Gecko embeddings stop using JS_MaybeGC, great -- then we can leave it be, and hope to stub it out in a future where GC auto-schedules better. /be

Jason Orendorff [:jorendorff]

Reporter

Comment 55

•

17 years ago

Attached patch v3 - parameterized (obsolete) — Details — Splinter Review

To avoid floating-point math (or parameters), risk of overflow, and having two parameters where there really should just be one, I arbitrarily set the current behavior = 100 on a linear scale. What we've been calling "16" will then be 4500. The formula is: JSGC_MAYBEGC_PATIENCE parameter = (old constant - 1) * 300. Examples: 100 = (4/3 - 1) * 300 4500 = (16 - 1) * 300

Attachment #304270 - Flags: review?(crowder)

Jason Orendorff [:jorendorff]

Reporter

Updated

•

17 years ago

Attachment #304240 - Attachment is obsolete: true

Igor Bukanov

Assignee

Comment 56

•

17 years ago

Attached patch custom MaybeGC for dom — Details — Splinter Review

Here is an alternative that just patches nsJSEnvironment.cpp to call JS_GC() directly as necessary. It access JSRuntime members directly for now.

Assignee: jorendorff → igor

Attachment #304272 - Flags: review?(brendan)

Jason Orendorff [:jorendorff]

Reporter

Updated

•

17 years ago

Attachment #304270 - Attachment is obsolete: true

Attachment #304270 - Flags: review?(crowder)

Brendan Eich [:brendan]

Comment 57

•

17 years ago

Comment on attachment 304272 [details] [diff] [review] custom MaybeGC for dom Great, who could object? I do not see any other JS_MaybeGC uses in Gecko. /be

Attachment #304272 - Flags: review?(brendan)

Attachment #304272 - Flags: review+

Attachment #304272 - Flags: approval1.9+

Brendan Eich [:brendan]

Comment 58

•

17 years ago

The patch makes me wonder whether JSRuntime.gcBytes, etc. should be size_t not uint32 -- do we cope with very large heaps on 64-bit systems well, or at all? /be

Brendan Eich [:brendan]

Updated

•

17 years ago

Flags: blocking1.9+

Target Milestone: --- → mozilla1.9beta4

Robert Sayre

Comment 59

•

17 years ago

I checked this in to get data ASAP Checking in nsJSEnvironment.cpp; /cvsroot/mozilla/dom/src/base/nsJSEnvironment.cpp,v <-- nsJSEnvironment.cpp new revision: 1.390; previous revision: 1.389 done

Target Milestone: mozilla1.9beta4 → ---

Robert Sayre

Updated

•

17 years ago

Target Milestone: --- → mozilla1.9beta4

Mike Schroepfer

Comment 60

•

17 years ago

http://graphs.mozilla.org/#spst=range&spss=1203465506.9558232&spse=1203474760&spstart=1202322756&spend=1203474760&bpst=Cursor&bpstart=1203459220&bpend=1203474760&m1tid=134821&m1bl=0&m1avg=0 xp02 has reported in. 845 pre vs 649 post (1.3x faster). So not up to vlad's 2 line 3x faster standards. But not bad :-) Private-bytes looks unchanged: http://graphs.mozilla.org/#spst=range&spss=1203421819.566265&spse=1203474760&spstart=1196883676&spend=1203474760&bpst=Cursor&bpstart=1203421819.566265&bpend=1203474760&m1tid=53258&m1bl=0&m1avg=0 If we leave this in overnight I can repeat in-browser mem tests with tomorrow's nightly.

Mike Schroepfer

Comment 61

•

17 years ago

About at 2% Ts win on windows as well: http://graphs.mozilla.org/#spst=range&spss=1203466265.3052208&spse=1203481173.8323293&spstart=1202931767&spend=1203481726&bpst=Cursor&bpstart=1203466265.3052208&bpend=1203481173.8323293&m1tid=148133&m1bl=0&m1avg=0&m2tid=148131&m2bl=0&m2avg=0&m3tid=148165&m3bl=0&m3avg=0

Igor Bukanov

Assignee

Comment 62

•

17 years ago

(In reply to comment #60) > Private-bytes looks unchanged: Does private test account for memory allocated with direct mmap/Virtualalloc calls?

Mike Beltzner [:beltzner, not reading bugmail]

Comment 63

•

17 years ago

Moving bugs that aren't beta 4 blockers to target final release.

Target Milestone: mozilla1.9beta4 → mozilla1.9

Damon Sicore (:damons)

Comment 64

•

17 years ago

This has landed. Please file followup bugs if necessary.

Status: ASSIGNED → RESOLVED

Closed: 17 years ago

Resolution: --- → FIXED

Bob Clary [:bc] (inactive)

Updated

•

17 years ago

Flags: in-testsuite-

Flags: in-litmus-

Igor Bukanov

Assignee

Updated

•

16 years ago

Depends on: 446379

You need to log in before you can comment on or make changes to this bug.