Bug 1267812 (Closed) - Opened 4 years ago - Closed 4 years ago

Reduce our worst-case minor GC latency

Categories

(Core :: JavaScript: GC, defect)


Tracking


RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: terrence, Assigned: terrence)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(3 files)

Attached file compare_ggc_perf.py
I used browser startup to profile. This is a bit of a synthetic workload, but unlike more complex workloads, it's fairly stable and actually hammers on some of our worst case behaviors.

Findings
========
To a first approximation, all of our FULL_STORE_BUFFER GCs during startup are caused by the generic buffer. The generic buffer uses a LifoAlloc with a 64 KiB chunk size, and we trigger an overflow minor collection once we have used 16 KiB of it. We need some room for overflow until we reach a GC safepoint, but we typically waste most of this block.
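
To make that heuristic concrete, here is a minimal sketch of the trigger logic; the class and constant names below are illustrative stand-ins, not the real StoreBuffer/LifoAlloc code:

#include <cstddef>

class GenericEdgeBuffer {
    // Pre-patch values quoted above: 64 KiB chunks, minor GC trigger at 16 KiB.
    static constexpr size_t ChunkSize = 64 * 1024;
    static constexpr size_t OverflowTriggerBytes = 16 * 1024;

    size_t bytesUsed = 0;
    bool minorGCRequested = false;

  public:
    // Record that |bytes| of edge data were appended to the current chunk.
    void noteAllocated(size_t bytes) {
        bytesUsed += bytes;
        if (bytesUsed >= OverflowTriggerBytes) {
            // Request a nursery collection (reason: FULL_STORE_BUFFER), but keep
            // accepting entries until the next GC safepoint. The remaining
            // ChunkSize - OverflowTriggerBytes is headroom that mostly goes
            // unused, which is what the patch shrinks.
            minorGCRequested = true;
        }
    }

    bool wantsMinorGC() const { return minorGCRequested; }

    // Reset once the nursery has been collected at a safepoint.
    void clearAfterMinorGC() {
        bytesUsed = 0;
        minorGCRequested = false;
    }
};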

I reduced the LifoAlloc chunk size to 8 KiB and set the overflow threshold to 4 KiB. This leads to the following change:

Metric                               Before    After
------------------------------------------
Num GGC                                  54      151
Num FSB GGC                              10      113
GGC Total Time (us)              44573    27647
FSB GGC Total Time (us)      22893    22537
Max GGC Time (us)                  6500     1176
Max FSB GGC Time (us)          6500     1176
Avg GGC Time (us)                 825.4    183.1
Avg FSB GGC Time (us)        2289.3    199.4
Med GGC Time (us)                   462      142
Med FSB GGC Time (us)          1704      164

We do 3 *times* more collections, but our total time spent in minor collection during startup drops from 44 ms to 27 ms. The total time spent in FSB collections barely changes, but (at a guess) evicting those objects sooner means they don't entrain other objects. Our mean and median collection times go from 825.4 us and 462 us respectively to 183.1 us and 142 us. The contrast is even more stark if we look only at FULL_STORE_BUFFER collections.
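
Sanity check on those numbers: 54 collections x 825.4 us = ~44.6 ms before, and 151 x 183.1 us = ~27.6 ms after, which matches the totals in the table.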

I did not expect this sort of severe non-linear response. Oh, also, Octane scores for the shell:

before: 38441
after:  38710
I tried runs at 6, 4, 3, and 2 KiB overflow. It's basically in the noise at this level. I also did several runs against tip and the one above is not the worst. I think I'm going to go with an 8/4 KiB chunk/overflow split, since our best runs seem to be in that configuration.
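
For reference, the change amounts to lowering two thresholds to that 8/4 KiB split; the constant names below are hypothetical stand-ins, not the actual identifiers touched by the patch:

// Hypothetical names; the values are the chosen chunk/overflow split.
static constexpr size_t GenericBufferChunkSize       = 8 * 1024;   // was 64 * 1024
static constexpr size_t GenericBufferOverflowTrigger = 4 * 1024;   // was 16 * 1024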
Attachment #8745575 - Flags: review?(jcoppeard)
Comment on attachment 8745575 [details] [diff] [review]
5.0_lower_storebuffer_thresholds-v0.diff

Review of attachment 8745575 [details] [diff] [review]:
-----------------------------------------------------------------

Great stuff.
Attachment #8745575 - Flags: review?(jcoppeard) → review+
Backed out for Linux x64 opt xpcshell failure in devtools/server/tests/unit/test_memory_footprint.js.

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/d4e2eee69d2374de167e711a639ca638f697354f

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=fb1cada3c198212d9dbe5a12067dbf42fdee6539
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=26735395&repo=mozilla-inbound

10:47:11     INFO -  TEST-START | devtools/server/tests/unit/test_memory_footprint.js
10:47:12  WARNING -  TEST-UNEXPECTED-FAIL | devtools/server/tests/unit/test_memory_footprint.js | xpcshell return code: 0
10:47:12     INFO -  TEST-INFO took 736ms
10:47:12     INFO -  >>>>>>>
10:47:12     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
10:47:12     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
10:47:12     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
10:47:12     INFO -  running event loop
10:47:12     INFO -  PROCESS | 11090 | DBG-TEST: resource://devtools/shared/Loader.jsm:326: strict error: ReferenceError: reference to undefined property Services.appinfo.name
10:47:12     INFO -  PROCESS | 11090 | DBG-TEST: head_dbg.js observed a console message: ReferenceError: reference to undefined property Services.appinfo.name
10:47:12     INFO -  "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "ReferenceError: reference to undefined property Services.appinfo.name" {file: "resource://devtools/shared/Loader.jsm" line: 326}]"
10:47:12     INFO -  devtools/server/tests/unit/test_memory_footprint.js | Starting init_server
10:47:12     INFO -  (xpcshell/head.js) | test init_server pending (2)
10:47:12     INFO -  TEST-PASS | devtools/server/tests/unit/test_memory_footprint.js | init_server - [init_server : 26] Footprint after DebuggerServer.init() is 48 kB (should be less than 150 kB).
10:47:12     INFO -  !!! The devtools server's memory usage increased either significantly, or slowly over time.
10:47:12     INFO -  !!! If your patch doesn't cause a big increase, feel free to raise the thresholds for this test as needed. - true == true
10:47:12     INFO -  (xpcshell/head.js) | test run_next_test 1 pending (3)
10:47:12     INFO -  (xpcshell/head.js) | test init_server finished (3)
10:47:12     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
10:47:12     INFO -  devtools/server/tests/unit/test_memory_footprint.js | Starting add_browser_actors
10:47:12     INFO -  (xpcshell/head.js) | test add_browser_actors pending (2)
10:47:12     INFO -  TEST-PASS | devtools/server/tests/unit/test_memory_footprint.js | add_browser_actors - [add_browser_actors : 26] Footprint after DebuggerServer.addBrowserActors() is 988 kB (should be less than 2000 kB).
10:47:12     INFO -  !!! The devtools server's memory usage increased either significantly, or slowly over time.
10:47:12     INFO -  !!! If your patch doesn't cause a big increase, feel free to raise the thresholds for this test as needed. - true == true
10:47:12  WARNING -  TEST-UNEXPECTED-FAIL | devtools/server/tests/unit/test_memory_footprint.js | connect_client - [connect_client : 26] Footprint after DebuggerClient.connect() is 2508 kB (should be less than 2000 kB).
Flags: needinfo?(terrence)
My tweaks in this bug change our GC timing a bit. I cannot reproduce this locally, so I guess we must be right on the cusp and somewhat dependent on timing -- the overage here is less than the ceiling on the next test. This patch shifts the maximums down by one step: connect_client gets the ceiling that list_tabs had, and list_tabs's ceiling is raised by the same amount.
Flags: needinfo?(terrence)
Attachment #8748749 - Flags: review?(poirot.alex)
Comment on attachment 8748749 [details] [diff] [review]
up_devtools_memory_threshold-v0.diff

Review of attachment 8748749 [details] [diff] [review]:
-----------------------------------------------------------------

As long as you are confident you are not introducing an unexpected increase in memory usage, it is totally fine to bump these thresholds.

So if I follow the story correctly, before your patch some GC happened to collect some memory just in time, and now that collection is simply delayed?
If you have ideas on how to make this test more deterministic, I'm happy to receive suggestions.
We do not call forceGC before checking memory usage because we want to catch unnecessary spikes in memory usage, not just the final memory usage after collection.
Attachment #8748749 - Flags: review?(poirot.alex) → review+
Well, this is just tuning our collection heuristics to GC the minor heap more frequently. We're certainly not allocating more memory with this. I suppose it could potentially cause a larger number of things which are transitively live to be moved to tenured storage, but it cannot cause an increase in the live set itself.
https://hg.mozilla.org/mozilla-central/rev/2d50de084439
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Depends on: 1270849
Depends on: 1270774