Closed Bug 657949 Opened 14 years ago Closed 14 years ago

Segregated allocation for finalizable objects

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

Q1 12 - Brannan

People

(Reporter: lhansen, Assigned: lhansen)

Attachments

(2 files, 1 obsolete file)

Preliminary patch (obsolete), 14 years ago, Lars T Hansen, 26.39 KB, patch
Patch, 14 years ago, Lars T Hansen, 27.10 KB, patch, pnkfelix: review+
Updated patch, 14 years ago, Lars T Hansen, 34.24 KB, patch

Lars T Hansen

Assignee

Description

•

14 years ago

Investigations show that only 10%-50% of the heap contains finalizable objects (including RC objects) in typical Flash applications. The pause induced by FinishIncrementalMark could thus be reduced, possibly substantially, by segregating the allocation of finalizable objects from non-finalizable objects; only the former blocks need be scanned during FinishIncrementalMark.
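To make the idea concrete, here is a minimal sketch of segregated allocation. It is a toy model under stated assumptions: `BlockSketch`, `SegregatedHeapSketch`, and their methods are hypothetical names for illustration and are not MMgc's data structures or API. It only shows why a finalization pass over segregated pools visits fewer objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of segregated allocation.
// None of these names come from MMgc.
struct BlockSketch {
    std::size_t objects;  // objects allocated in this block
};

class SegregatedHeapSketch {
public:
    // Place a block in the pool matching its objects' finalization needs.
    void AddBlock(bool finalizable, std::size_t objects) {
        assert(objects > 0);  // an allocated block holds at least one object
        std::vector<BlockSketch>& pool =
            finalizable ? finalizableBlocks_ : plainBlocks_;
        pool.push_back(BlockSketch{objects});
    }

    // Objects a finalization pass must visit: the finalizable pool only.
    std::size_t ObjectsScannedForFinalization() const {
        return CountObjects(finalizableBlocks_);
    }

    // Objects a pass would visit if both kinds shared the same blocks.
    std::size_t ObjectsScannedUnsegregated() const {
        return CountObjects(finalizableBlocks_) + CountObjects(plainBlocks_);
    }

private:
    static std::size_t CountObjects(const std::vector<BlockSketch>& blocks) {
        std::size_t total = 0;
        for (const BlockSketch& b : blocks) total += b.objects;
        return total;
    }

    std::vector<BlockSketch> finalizableBlocks_;
    std::vector<BlockSketch> plainBlocks_;
};
```

With one finalizable block of 10 objects and two non-finalizable blocks of 60 and 30 objects, the segregated pass visits 10 objects where an unsegregated pass would visit 100.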

Lars T Hansen

Assignee

Comment 1

•

14 years ago

Attached patch: Preliminary patch (obsolete)

Passes tests and is conceptually complete, but I've not investigated the impact of the change on memory / pauses / performance.

Steven Johnson

Comment 2

•

14 years ago

Currently all RCObjects are finalized, but clearly only a subset of them need to be. Is it possible/practical to make a non-finalized RCObject?

Lars T Hansen

Assignee

Comment 3

•

14 years ago

(In reply to comment #2)
> Currently all RCObjects are finalized, but clearly only a subset of them
> need to be. Is it possible/practical to make a non-finalized RCObject?

It's probably doable. But the benefits are far from clear: every ScriptObject needs to be finalized because the refcount on the "delegate" member of ScriptObject needs to be decremented. And String has m_master and Namespace has some strings, so every AvmPlusScriptableObject needs to be finalized to do refcounting right. So how many other RCObjects are there (that we care about)? Probably almost none.

Lars T Hansen

Assignee

Comment 4

•

14 years ago

Preliminary numbers from a case study on Get the Glass on a MacBook Air suggest that finalization scanning volume is generally cut in half and max pause times are reduced by 25%, without increasing the heap size at all (measured either as GC blocks or total blocks).

(An interesting anomaly on that program is that finalization will sometimes pause for 4 - 5 seconds even on a MacPro, with or without this change. It would appear that some of the "finalizers" actually do major synchronous work.)

The volume of weak references appears to be high on this benchmark (up to 7000 reaped per GC, out of an unknown population); more investigation is needed to determine the population. If the population is generally large then segregating weakrefs can be beneficial because those that are live will not be finalized and those that are dead are no longer finalizable (they've been cleared); ergo they need never be scanned at all.

Lars T Hansen

Assignee

Comment 5

•

14 years ago

Further study reveals that the total weakref populations are not much larger than the reapable weakref populations on this program. Still, it seems that we can't do any harm if we remove the weakref scanning work from GCAlloc::Finalize.

Lars T Hansen

Assignee

Updated

•

14 years ago

Blocks: 604333

Lars T Hansen

Assignee

Comment 6

•

14 years ago

Another data point. Get the Glass appears to be an AS2 application. The vast bulk of finalizable objects falls into two groups, AS2 ScriptObjects (which are RCObjects) and AS2 variable containers (which are FinalizedObjects). The variable container probably need not be marked as finalized, because its finalization function is NULL - its contents are finalized by the ScriptObject destructor. This could be fixed by having ExactStructContainer turn off the finalized bit in the constructor if the finalization function is NULL. It's a minor tweak, but allocating it non-finalized in the first place would be better.
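The constructor tweak described above could look roughly like the following. This is a sketch under assumptions: `ContainerSketch`, `kFinalizeBit`, and `FinalizerFn` are hypothetical stand-ins, and the real ExactStructContainer and the way MMgc stores its finalized bit are not shown in this bug.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical finalizer signature and flag bit, for illustration only.
typedef void (*FinalizerFn)(void* payload);
constexpr std::uint8_t kFinalizeBit = 0x1;

// A finalizer that does nothing, used only to exercise the sketch.
inline void NoopFinalizer(void*) {}

struct ContainerSketch {
    std::uint8_t gcBits;
    FinalizerFn finalizer;

    // Start out marked as finalizable, then clear the bit when there is
    // no finalization function, so the finalization pass can skip it.
    explicit ContainerSketch(FinalizerFn fn)
        : gcBits(kFinalizeBit), finalizer(fn) {
        if (finalizer == nullptr) {
            gcBits = static_cast<std::uint8_t>(gcBits & ~kFinalizeBit);
        }
        // Postcondition: the bit is set exactly when a finalizer exists.
        assert(NeedsFinalization() == (finalizer != nullptr));
    }

    bool NeedsFinalization() const { return (gcBits & kFinalizeBit) != 0; }
};
```

Clearing the bit after the fact still leaves the object in a finalizable block, which is why the comment prefers allocating it non-finalized from the start.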

Lars T Hansen

Assignee

Comment 7

•

14 years ago

A broader study of Brent's benchmark suite is somewhat less encouraging. These are relative numbers (new / old) for peak heap blocks, gc blocks, and private blocks:

benchmark         heap   gc     private
audiotool         1.00   1.00   1.03
bigpixelzombies   1.04   1.08   0.92
brainstorm        0.98   0.97   0.96
checkinapp        1.03   1.00   1.06
cignatabbing      1.00   1.00   0.99
coverflow         0.99   1.01   0.98
crushthecastle    0.99   0.95   1.06
flexdatagrid      1.00   1.02   1.01
gettheglass       1.15   1.31   1.16
mechanism         0.95   0.91   0.97
phystest          1.04   1.10   1.20
watsonexp         1.06   1.08   1.08

Some of these are highly variable due to gameplay and in-game advertising (get the glass, crush the castle, big pixel zombies, mechanism), and at one point in get the glass I finally figured out how to steer the van, so...

The 10% growth in gc blocks in phystest worries me the most (not clear what the "private" numbers are worth; they are as reported but I don't trust them).

(FRR from yesterday + TR from yesterday + the segregated-alloc patch, 64-bit plugin, updated Safari on MacOS 10.6, peak-occupancy data extracted from a gcbehavior dump)

The underlying numbers are worth a study too; audiotool appears to trigger GC explicitly (one incremental mark per GC cycle for a long time).

Lars T Hansen

Assignee

Comment 8

•

14 years ago

Much controlled benchmark running and the usual gnuplot hair-pulling later:

There is some variation on phystest, but the amount of noise in that program is huge. Across five runs with the old and new allocator setup:

With the original allocator, the peak number of GC blocks ranges from 5119 to 5427, and the peak number of heap blocks from 7438 to 7898. The number of collections ranges from 20 to 40. The amount of allocation work varies from 1.2GB to 2.4GB (correlates directly with the number of collections).

With the new allocator, the peak number of GC blocks ranges from 5099 to 6148, and the peak number of heap blocks from 7271 to 8553. The number of collections ranges from 26 to 32. The allocation work varies from 1.6 to 1.9GB.

It's anyone's guess why the new allocator seems to have higher variability in the allocation numbers - with an impact on GC activity and block occupancy - than the old allocator; there should be no feedback there.

That said, there may be very different heap dynamics with the segregated allocator: we're relying on lazy sweeping to return memory more than before, and lazy sweeping in one allocator is /not/ triggered by scarcity in other allocators - allocators go straight to GCHeap. This is an issue I brought up in bug #659317 also. With the unsegregated allocator it doesn't make sense for GCHeap to trigger sweeping - there would be little or nothing to gain from it. With the segregated allocator, it's different - there may be entirely empty blocks sitting there, waiting to be swept.

Thus if the present change were to land on its own it would create the need for a follow-on item to investigate how we can drive sweeping from GCHeap when memory is scarce.
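One possible shape for that follow-on item, sketched as a toy model: the heap asks its allocators to sweep pending empty blocks before concluding that memory is exhausted. `AllocSketch`, `HeapSketch`, and every method here are hypothetical and do not reflect how GCHeap or GCAlloc are actually structured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical allocator that tracks blocks which are entirely dead
// but have not been swept yet (lazy sweeping).
struct AllocSketch {
    std::size_t unsweptEmptyBlocks = 0;

    // Sweep everything pending and report how many blocks were freed.
    std::size_t SweepPending() {
        std::size_t freed = unsweptEmptyBlocks;
        unsweptEmptyBlocks = 0;
        return freed;
    }
};

class HeapSketch {
public:
    explicit HeapSketch(std::size_t freeBlocks) : freeBlocks_(freeBlocks) {}

    void Register(AllocSketch* alloc) {
        assert(alloc != nullptr);
        allocs_.push_back(alloc);
    }

    // Try to hand out n blocks. On scarcity, drive sweeping in the
    // registered allocators until enough blocks have come back.
    bool AllocBlocks(std::size_t n) {
        if (freeBlocks_ < n) {
            for (AllocSketch* alloc : allocs_) {
                freeBlocks_ += alloc->SweepPending();
                if (freeBlocks_ >= n) break;
            }
        }
        if (freeBlocks_ < n) return false;  // a real heap would expand here
        freeBlocks_ -= n;
        return true;
    }

    std::size_t FreeBlocks() const { return freeBlocks_; }

private:
    std::size_t freeBlocks_;
    std::vector<AllocSketch*> allocs_;
};
```

The design choice illustrated is only the direction of the call: scarcity observed by the heap triggers sweeping in allocators other than the one making the request.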

Lars T Hansen

Assignee

Comment 9

•

14 years ago

(In reply to comment #8)

Argh!

> With the original allocator, ...
> With the new allocator, ...

I mixed up my numbers. They should be:

With the original allocator, the peak number of GC blocks ranges from 5119 to 5427, and the peak number of heap blocks from 7438 to 7898. The number of collections ranges from 26 to 32. The allocation work varies from 1.6 to 1.9GB.

With the new allocator, the peak number of GC blocks ranges from 5099 to 6148, and the peak number of heap blocks from 7271 to 8553. The number of collections ranges from 20 to 40. The amount of allocation work varies from 1.2GB to 2.4GB (correlates directly with the number of collections).
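One way to put a number on the variability in these ranges is the relative spread, (max - min) / min. This is an illustrative statistic chosen here, not a metric used in the bug, and `RelativeSpread` is a hypothetical helper.

```cpp
#include <cassert>

// Relative spread of a range: (max - min) / min.
// A hypothetical summary statistic, not one taken from the bug.
double RelativeSpread(double minValue, double maxValue) {
    assert(minValue > 0.0 && maxValue >= minValue);
    return (maxValue - minValue) / minValue;
}
```

For the peak GC block ranges quoted above, `RelativeSpread(5119, 5427)` is about 0.06 for the original allocator and `RelativeSpread(5099, 6148)` is about 0.21 for the new one.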

Lars T Hansen

Assignee

Comment 10

•

14 years ago

Attached patch: Patch

Rebased to current TR. I'd like to land this because we want it for the incrementality work; as commented earlier in this bug and one place in the patch there will be knock-on work items for lazy sweeping, but there's no evidence of serious regressions from this change now, and it only goes into Brannan.

Attachment #533301 - Attachment is obsolete: true

Attachment #535341 - Flags: review?(fklockii)

Dan Smith

Updated

•

14 years ago

Flags: flashplayer-qrb+

Flags: flashplayer-injection-

Felix S. Klock II [:pnkfelix, :fklock]

Comment 11

•

14 years ago

Comment on attachment 535341: Patch

- Consider alpha-renaming *PointersAllocs to *PointersNonFinalizedAllocs; it would help ensure that both old and new code is properly vetted to accommodate the change, the same way we say noPointersAllocs instead of just Allocs.
- Good food for thought in the note above GCAlloc::Sweep. Should we open a ticket to investigate that?

Attachment #535341 - Flags: review?(fklockii) → review+

Lars T Hansen

Assignee

Comment 12

•

14 years ago

(In reply to comment #11)
> Comment on attachment 535341: Patch
>
> - Consider alpha-renaming *PointersAllocs to *PointersNonFinalizedAllocs; it
> would help ensure that both old and new code is properly vetted to
> accommodate the change, the same way we say noPointersAllocs instead of just
> Allocs.

Agreed.

> - Good food for thought in the note above GCAlloc::Sweep. Should we open a
> ticket to investigate that?

I will consider it, though it's feeding into some other work I'm doing on block recycling so I don't know for sure yet whether it's going to become its own work item.

Lars T Hansen

Assignee

Comment 13

•

14 years ago

Attached patch: Updated patch

Contains a couple of bug fixes and the rename suggested by Felix, ready for integration and testing.

Tamarin Bot

Comment 14

•

14 years ago

changeset: 6402:c8459ef051b0
user: Lars T Hansen <lhansen@adobe.com>
summary: Fix 657949 - Segregated allocation for finalizable objects (r=fklockii)

http://hg.mozilla.org/tamarin-redux/rev/c8459ef051b0

Lars T Hansen

Assignee

Comment 15

•

14 years ago

The GCAlloc::Sweep issue is logged as bug #665416.

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Categories

(Tamarin Graveyard :: Garbage Collection (mmGC), defect, P2)