Closed Bug 768794 Opened 12 years ago Closed 11 years ago

GC and CC performance measurements for B2G

Categories

(Firefox OS Graveyard :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: gwagner, Assigned: gwagner)

Details

(Keywords: perf, Whiteboard: [c= p= s=2013.09.20 u=])

We want to track the GC and CC performance and get a sense what GC and CC pause time we can expect if we increase the heap size.
Currently I see CC pause times between 10ms and 150 msec.
GCs take between 2-3 sec (IGC) with a peak pause time around 100msec for a 14MB heap.


E/GeckoConsole(  114): CC(T+57920.3) duration: 57ms, suspected: 170, visited: 1250 RCed and 9863 GCed, collected: 48 RCed and 0 GCed (48 waiting for GC)

I don't get a full output of a GC yet but a typical GC when we don't touch the device looks like:
I/Gecko   (  114): GC(T+57934.0) Total Time: 313.0ms, Compartments Collected: 116, Total Compartments: 116, MMU (20ms): 0%, MMU (50ms): 0%, Max Pause: 93.3ms, Allocated: 14MB, +Chunks: 1, -Chunks: 0
I/Gecko   (  114):     Slice: 0, Pause: 60.6 (When: 0.0ms, Reason: SET_NEW_DOCUMENT): Begin Callback: 0.2ms, Purge: 1.1ms, Mark: 41.5ms, Mark Roots: 26.9ms, Discard Code: 9.7ms
I/Gecko   (  114):     Slice: 2, Pause: 18.5 (When: 213.2ms, Reason: INTER_SLICE_GC): Wait Background Thread: 0.1ms, Mark: 16.0ms
I/Gecko   (  114):     Slice: 3, Pause: 12.1 (When: 264.1ms, Reason: INTER_SLICE_GC): Wait Background Thread: 0.1ms, Mark: 10.7ms
I/Gecko   (  114):     Slice: 5, Pause: 18.5 (When: 606.5ms, Reason: INTER_SLICE_GC): Wait Background Thread: 0.1ms, Mark: 16.2ms
I/Gecko   (  114):     Slice: 6, Pause: 14.2 (When: 800.5ms, Reason: INTER_SLICE_GC): Mark: 12.9ms
I/Gecko   (  114):     Slice: 7, Pause: 15.2 (When: 1028.9ms, Reason: INTER_SLICE_GC): Mark: 13.4ms
I/Gecko   (  114):     Slice: 9, Pause: 14.1 (When: 1242.1ms, Reason: REFRESH_FRAME): Mark: 11.3ms
I/Gecko   (  114):     Slice: 13, Pause: 93.3 (When: 1528.8ms, Reason: REFRESH_FRAME): Mark: 16.6ms, Mark Delayed: 1.1ms, Mark Oth
In certain cases bug 754495, and in some other cases bug 730639 should help with the higher-end
CC times.
There are also some plans to represent DOM subtrees as one CC node in the graph. That
should improve the higher-end CC times. Can't find the bug# atm.

Someone should try about:cc (Bug 726346) or perhaps create CC logs to see what
all is kept in CC graph. We want to keep the graph small.
Yes, a CC log would be helpful.  That's a fairly large CC graph considering that it didn't collect anything.  This is what it looks like for me on desktop with 6 tabs open:

CC(T+36677.4) duration: 17ms, suspected: 240, visited: 2286 RCed and 2966 GCed, collected: 2 RCed and 0 GCed (2 waiting for GC)
ForgetSkippable 30 times before CC, min: 0 ms, max: 10 ms, avg: 1 ms, total: 49 ms, removed: 2374

Perhaps we can do some B2G-specific optimizations.
...and make sure B2G specific code doesn't leak.
Similar to what have been done with FF UI (bug 728407).
Is it possible to get the full GC log data? I'm interested in the sweep times as well, since I'd like to work on that soon.
(In reply to Bill McCloskey (:billm) from comment #4)
> Is it possible to get the full GC log data? I'm interested in the sweep
> times as well, since I'd like to work on that soon.

Somehow the printf_stderr doesn't print the whole log. I will try to get a complete one. Maybe we could also remove marking slices more aggressively.
Gregor, in which process were you looking at CC/GC times. Was there several apps in that same
process? I do hope that b2g gets process handling done properly asap so that apps have always
their own processes, and so also their own CC/GC.
(In reply to Olli Pettay [:smaug] from comment #6)
> Gregor, in which process were you looking at CC/GC times. Was there several
> apps in that same
> process? I do hope that b2g gets process handling done properly asap so that
> apps have always
> their own processes, and so also their own CC/GC.

Currently we only have one app (calculator app) that runs oop. So I was running the calculator app in parallel with other apps. I see CC/GC times from the parent and the child.
Surprisingly I see the log for the child CC twice. The child and the parent process prints the same log.
I found a shorter log right after startup that's more complete. The main part of sweeping is here.

I/Gecko   (  115): GC(T+299.2) Total Time: 247.2ms, Compartments Collected: 107, Total Compartments: 107, MMU (20ms): 0%, MMU (50ms): 0%, Max Pause: 115.3ms, Allocated: 12MB, +Chunks: 0, -Chunks: 0
I/Gecko   (  115):     Slice: 0, Pause: 28.8 (When: 0.0ms, Reason: SET_NEW_DOCUMENT): Purge: 0.4ms, Mark: 21.4ms, Mark Roots: 10.5ms, Discard Code: 3.6ms
I/Gecko   (  115):     Slice: 1, Pause: 14.4 (When: 260.6ms, Reason: INTER_SLICE_GC): Mark: 13.2ms
I/Gecko   (  115):     Slice: 3, Pause: 13.0 (When: 495.9ms, Reason: INTER_SLICE_GC): Mark: 11.8ms
I/Gecko   (  115):     Slice: 4, Pause: 12.3 (When: 584.6ms, Reason: INTER_SLICE_GC): Mark: 11.4ms
I/Gecko   (  115):     Slice: 5, Pause: 12.5 (When: 609.2ms, Reason: INTER_SLICE_GC): Mark: 11.8ms
I/Gecko   (  115):     Slice: 10, Pause: 115.3 (When: 748.3ms, Reason: INTER_SLICE_GC): Mark: 37.5ms, Mark Delayed: 1.3ms, Mark Other: 36.1ms, Finalize Start Callback: 1.0ms, Sweep: 58.0ms, Sweep Compartments: 26.1ms, Sweep Object: 7.7ms, Sweep String: 0.3ms, Sweep Script: 0.5ms, Sweep Shape: 6.3ms, Discard Code: 2.5ms, Discard Analysis: 8.5ms, Discard TI: 1.9ms, Sweep Types: 2.0ms, Clear Script Analysis:
I also noticed that the pause between GC slices become very long.

I/Gecko   (  115): GC(T+4435.6) Total Time: 607.1ms, Compartments Collected: 117, Total Compartments: 117, MMU (20ms): 0%, MMU (50ms): 0%, Max Pause: 388.8ms, Nonincremental Reason: requested, Allocated: 13MB, +Chunks: 0, -Chunks: 0
I/Gecko   (  115):     Slice: 0, Pause: 38.7 (When: 0.0ms, Reason: SET_NEW_DOCUMENT): Begin Callback: 0.4ms, Purge: 0.3ms, Mark: 28.4ms, Mark Roots: 18.1ms, Mark Types: 1.7ms, Discard Code: 6.1ms
I/Gecko   (  115):     Slice: 2, Pause: 17.3 (When: 1261.8ms, Reason: INTER_SLICE_GC): Wait Background Thread: 0.1ms, Mark: 15.2ms
I/Gecko   (  115):     Slice: 5, Pause: 16.4 (When: 2604.9ms, Reason: INTER_SLICE_GC): Wait Background Thread: 0.1ms, Mark: 14.0ms
I/Gecko   (  115):     Slice: 6, Pause: 12.0 (When: 2707.6ms, Reason: INTER_SLICE_GC): Mark: 11.3ms
I/Gecko   (  115):     Slice: 7, Pause: 12.9 (When: 2820.3ms, Reason: INTER_SLICE_GC): Mark: 11.9ms
I/Gecko   (  115):     Slice: 8, Pause: 19.4 (When: 3948.8ms, Reason: INTER_SLICE_GC): Mark: 17.3ms
I/Gecko   (  115):     Slice: 10, Pause: 17.5 (When: 5003.8ms, Reason: INTER_SLICE_GC): Wait Background Thread: 0.1ms, Mark: 15.2ms
I/Gecko   (  115):     Slice: 11, Pause: 12.2 (When: 5105.9ms, Reason:
The slices are scheduled based on a 100ms timer. If they're far apart, that suggests that something is running for a long time and preventing the timer from firing.
Keywords: perf
Whiteboard: [c= p= u= s=]
In performance triage we decided that we were going to close this out as we believe these numbers have been tuned within the last year. If this is not the case and this needs further investigation please reopen.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Whiteboard: [c= p= u= s=] → [c= p= s=2013.09.20 s=]
Whiteboard: [c= p= s=2013.09.20 s=] → [c= p= s=2013.09.20 u=]
You need to log in before you can comment on or make changes to this bug.