Closed Bug 594265 Opened 14 years ago Closed 14 years ago

--memory switch to runtests.py appears to significantly underestimate heap size

Categories

(Tamarin Graveyard :: Tools, defect, P2)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: lhansen, Assigned: lhansen)

Details

(Whiteboard: has-patch)

Attachments

(2 files)

The --memory switch is documented thusly: "logs the high water memory mark". It appears not to do that very well, for a conventional definition of "high water memory mark". The context here is bug #594008, TR 5162 with the patch from that bug applied, and with a tweak to jsbench/typed/FFT.as to run it with 128K elements rather than the default 512K. It should be easy to see the same effects without changing FFT.as however, and it should be possible to see the effects without the patch on many programs. The --memory switch to runtests.py runs the shell with -memstats and captures the output from the 'private' item, if we run the program and capture that line we see this tail: ... [mem] private 1160 (4.5M) 100% [mem] private 1160 (4.5M) 100% [mem] private 1160 (4.5M) 100% [mem] private 1240 (4.8M) 100% The 'private' number is obtained from VMPI_getPrivateResidentPageCount at certain points during the run. If we contrast with the accurate count of pages allocated to the heap as reported by -gcbehavior, however, we see this: [gcbehavior] peak-occupancy: blocks-heap-allocated=2467 blocks-gc-allocated=2362 blocks-private=2038 2467 blocks * 4K/block = 9868K = 9.6M, more than twice what was reported above. Why the discrepancy? - One big issue is that the memstats data are printed /after/ significant GC events, such as reaping and sweeping. Thus the peak heap size is not actually captured at all. If I insert a call to gclog() at the beginning of GC::Sweep, I see lines like these printed: ... [mem] private 1120 (4.4M) 100% [mem] private 2022 (7.9M) 100% [mem] private 1120 (4.4M) 100% [mem] private 1120 (4.4M) 100% [mem] private 995 (3.9M) 100% The program is otherwise unchanged, but now the 'peak' is 7.9MB, up from 4.8. Since this is higher than the old results the old code did not in fact record the peak. - Resident pages could in principle be affected by paging and somewhat by lazy page allocation, while the page count maintained for -gcbehavior should not be subject to those effects. While typed/FFT may be an outlier (lots of primitive vector data), it is almost certainly the case that runtests.py should use -gcbehavior (or some variant of it that only prints summary information at the end, not something for every GC) because that yields more relevant numbers. (Even if we stick with -memstats this is a bug we have to fix somewhere; whatever the definition of the high water mark, it is not the number of pages following GC.)
Priority: -- → P2
Target Milestone: --- → flash10.2
Assuming we stick with -memstats, it's probably the case that two things have to change: - print statistics before sweep - runtests.py should use the 'mmgc' number, not the 'private' number
Attachment #472945 - Flags: review?(treilly)
(In reply to comment #1) > Assuming we stick with -memstats, it's probably the case that two things have > to change: > > - print statistics before sweep > - runtests.py should use the 'mmgc' number, not the 'private' number Three things: - for some reason runtests.py picks up the last reading of 'private', it's not clear to me why that would make sense, it's not like VMPI_getPrivateResidentPageCount is guaranteed to return a nondecreasing sequence of results.
Attachment #472946 - Flags: review?(treilly)
Attachment #472946 - Flags: review?(brbaker)
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Whiteboard: has-patch
Comment on attachment 472946 [details] [diff] [review] Parse all output lines, find the max number, use mmgc and not private Looks like the how parseMemHigh() is called has changed in a refactoring round since the method looks like it used to parse all of the output and track the highest mem value, if val>memoryhigh: memoryhigh=val but now it is only fed one line and not the entire output. Although the old comment of "#we only want the LAST [mem] private reading" seems to indicate that it was a conscious decision.
Attachment #472946 - Flags: review?(brbaker) → review+
Attachment #472945 - Flags: review?(treilly) → review+
Comment on attachment 472946 [details] [diff] [review] Parse all output lines, find the max number, use mmgc and not private private always seemed truthier to me but for our purposes consistency is probably more important.
Attachment #472946 - Flags: review?(treilly) → review+
(In reply to comment #6) > private always seemed truthier to me but for our purposes consistency is > probably more important. I could probably go either way, the high bit is fixing the output from -memstats to reflect values nearer the peak. The value from -memstats can still be well below peak if there's significant memory that is managed by some RCObject - that object can be born, allocate some fixed-malloc memory, and then die again and deallocate that memory, all between two collections. In that case we wouldn't see the real peak with -memstats.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: