Closed Bug 506012 Opened 16 years ago Closed 7 years ago

ZCT reap frequency throttling

Categories

(Tamarin Graveyard :: Garbage Collection (mmGC), defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: lhansen, Unassigned)

References

Details

(Whiteboard: PACMAN)

Attachments

(5 obsolete files)

Fast paths are possible for esp Add and maybe Remove. These may be important because every new RCObject is added to the ZCT, and many of them (10% - 25%) are removed quickly again when their reference counts are increased. Experiments suggest that a 2-element cache would allow most insert-remove pairs to be processed especially efficiently, but there are other approaches that may be comparable in speed.
There are no code changes, just refactoring of the ZCT into GCZCT.{h,cpp}. (Windows project files have not yet been updated in this patch.)
Attachment #390242 - Flags: review?(treilly)
Comment on attachment 390242 [details] [diff] [review] ZCT refactored into its own files I would have preferred just ZCT.cpp but not worth changing.
Attachment #390242 - Flags: review?(treilly) → review+
Attached patch ZCT refactored, v2 (obsolete) — Splinter Review
Renames the new files, adds vcproj changes, and fixes a build problem in DEBUG mode.
Attachment #390242 - Attachment is obsolete: true
Comment on attachment 390437 [details] [diff] [review] ZCT refactored, v2 redux changeset: 2240:aee05b7cbe78
Attachment #390437 - Attachment is obsolete: true
Attached patch Candidate patch (obsolete) — Splinter Review
Creates inlinable fast paths for ZCT::Add and ZCT::Remove; no longer maintains a ZCT free list; makes ZCT::Reap true depth-first. Plenty of documentation and minor reorg. API & semantics unchanged to the best of my knowledge. Further testing & benchmarking is pending, but gcbench shows a speedup of 1.09 on MacOS X desktop with this change. (Of course that's an extreme case.) Among other things this suggests that there may be further percentages to be had by looking at nearby control paths, notably reference counting.
Target Milestone: --- → flash10.1
Attached patch Candidate patch, v2 (obsolete) — Splinter Review
Compile fixes for Windows
Attachment #390829 - Attachment is obsolete: true
Benchmarking against frr-api + tr tip + patch, Fuze: - gridtest: 5.4% faster - scrollspeed: 2.6% faster - textinputform_load: 3% faster
Attached patch Patch (obsolete) — Splinter Review
Attachment #391317 - Attachment is obsolete: true
Attachment #391335 - Flags: review?(treilly)
Attachment #391335 - Flags: review?(treilly) → review+
Comment on attachment 391335 [details] [diff] [review] Patch uint64 for zctSizeBlocks seems like overkill. I think we can get rid of the composite == 0 bit if we check if objects have been finalized before decrementing the ref. What if composite == 0 meant stuck? Then a free'd RCObject would be considered stuck and we wouldn't touch it. Passing scanStack to PinProgramStack is redundant RCObject header comment is cut off "Client code is allowed to..." Just nulling out zct entries on 0->1 worries me that we'll perform more ZCT reaps that will be unfruitful. We originally did it this way and I recall the holes being more than the 10% to 25% you quote. Maybe we could compact when a segment gets full or count the holes and compact if holes are over some threshold?
(In reply to comment #9) > (From update of attachment 391335 [details] [diff] [review]) > uint64 for zctSizeBlocks seems like overkill. Fixed. > I think we can get rid of the composite == 0 bit if we check if objects have > been finalized before decrementing the ref. > > What if composite == 0 meant stuck? Then a free'd RCObject would be > considered stuck and we wouldn't touch it. Neat. Will note this in an appropriate bug. > Passing scanStack to PinProgramStack is redundant Actually it isn't. PinProgramStack is called if scanStack is true /or/ validateDefRef is true, and it always captures the stack extent. But it only needs to pin if scanStack is true. I'm not terribly fond of the structure of the code, but I wanted to get the MMGC_GET_STACK_EXTENTS call out of Reap. > RCObject header comment is cut off "Client code is allowed to..." Fixed, I think this was supposed to be about pinning but that's captured elsewhere now. > Just nulling out zct entries on 0->1 worries me that we'll perform more ZCT > reaps that will be unfruitful. We originally did it this way and I recall > the holes being more than the 10% to 25% you quote. Maybe we could compact > when a segment gets full or count the holes and compact if holes are over > some threshold? Will leave this bug open; we should do some more careful profiling directed at resolving this, and we should consider compaction as a remedy if there is trouble.
Comment on attachment 391335 [details] [diff] [review] Patch redux changeset: 2281:20aa1151e7ca
Attachment #391335 - Attachment is obsolete: true
REMAINING WORK, 2009-07-30: - Profile memory consumption and reaping frequency, also see bug #501916 and bug #499968. We want to ensure that the lack of a free list does not unreasonably elevate the reaping frequency. - Consider whether ZCT compaction is useful either as a remedy to combat elevated reaping frequency (if found to be the case) or just by itself, to throttle reaping even if it is not too frequent.
OS: Mac OS X → All
Hardware: x86 → All
Sketch for compaction code (insert into ZCT::Reap after test for empty ZCT): // Try to compact first RCObject ***up_blockp = blocktable; RCObject **up_objp = *up_blockp; RCObject **up_limit = up_objp + CAPACITY(RCObject*); RCObject ***down_blockp = blocktop - 1; RCObject **down_objp = top; RCObject **down_bottom = bottom; for (;;) { // Scan up to find a hole while (up_blockp < blocktop) { while (up_objp < up_limit) { if (*up_objp == NULL) goto found_hole; up_objp++; } up_blockp++; up_objp = *up_blockp; up_limit = up_objp + CAPACITY(RCObject*); } break; // no hole found_hole: // Scan to find a pointer while (down_blockp >= blocktable) { while (down_objp >= down_bottom) { if (*down_objp != NULL) goto found_ptr; down_objp--; top--; } // FIXME: should pop an empty segment? // Must at a minimum move 'top' and 'bottom' // into the new segment. down_blockp--; down_bottom = *down_blockp; down_objp = down_bottom + CAPACITY(RCObject*) - 1; } break; // no pointer found_ptr: // Move the entry, unless up and down have crossed // FIXME } // Now skip the reaping if the table isn't very full
Based on profiling of the test cases in WE 2262944, ZCT reaping times are not a substantial contributor (< 2% of overall running time in current frr+tr code). That bug was thought to suffer from ZCT reaping cost caused by a large amount of consing of short lived strings.
Also relevant may be WE 2315695.
Flags: flashplayer-qrb?
Target Milestone: flash10.1 → ---
QRB: removing from Flash 10.1 unless required to address performance regressions.
Flags: flashplayer-qrb? → flashplayer-qrb+
Summary: ZCT fast paths → ZCT reap frequency throttling
Target Milestone: --- → Future
Assignee: lhansen → nobody
Status: ASSIGNED → NEW
Actually may be somewhat important for performance because stack scanning uses setjmp, which turns out to be expensive on Unix and Mac.
Priority: -- → P5
Target Milestone: Future → flash10.1
Target Milestone: flash10.1 → flash10.2
Whiteboard: PACMAN
Blocks: 604350
Priority: P5 → --
Target Milestone: Q3 11 - Serrano → Q1 12 - Brannan
Target Milestone: Q1 12 - Brannan → Future
Tamarin is a dead project now. Mass WONTFIX.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: