Closed Bug 1369748 Opened 7 years ago Closed 7 years ago

Buffering gray roots takes a long time

Categories

(Core :: JavaScript: GC, enhancement)

55 Branch
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Performance Impact high
Tracking Status
firefox55 --- fixed

People

(Reporter: jonco, Assigned: jonco)

Details

Attachments

(2 files, 1 obsolete file)

Following updates to the statistics code in bug 1361458 we have seen an increase in the number of collections where buffering gray roots was the longest phase.

I investigated this it see whether this was real or a problem with the stats changes.  Running membench locally I see times > 10ms for buffering gray roots even when there are only a few tens of tabs open.  So it seems like this is a real problem.
For example, here is some trace output:

    *** bufferGrayRoots 16.824372 mS, 80560 collecting, 0 non, state 1
      cycle collector holders 15.790875 42541
      TraceAdditionalNativeGrayRoots 1.019483 0
        XPConnect wrapped JS roots 0.700865 5286
        XPConnect variant roots 0.000100 0
        XPConnect wrapped native wrapper maps 0.296006 0

The numbers are time in milliseconds and loop iterations.  The above is for a full GC, but in some cases we were doing a zone GC and there were many roots reported to us for zones that were not being traced.  It seems that most of the time comes from tracing the CC's held JS objects of which there can be many.
Yeah, the holders are just a single gigantic hash table. With a bit of work, they could be made per-zone, which would help the zone GC case. I expect that holders, unlike XPCWNs, are only going to be involved with a single compartment.

Maybe we could cache the objects with a write barrier to avoid the callback overhead.
In theory a holder may move across zones. Say, some chrome JS moves a chrome DOM node to a content DOM node.
Whiteboard: [qf]
Patch to split GCRuntime::beginMarkPhase into a bunch of smaller methods/functions.
Attachment #8875347 - Flags: review?(sphink)
Patch to run unmarking and buffering of gray roots on helper threads in GCRuntime::beginMarkPhase.

I added a new node in the stats heirarchy for preparation done before marking starts.  I had to remove an assertion from DOM workers that their trace method is called on the main thread.

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b61878c0ebb1520d3a767260a24fee9ca0abd796
Attachment #8875365 - Flags: review?(sphink)
Running possibly weird browser code off the main thread is a little scary.
Attachment #8875347 - Flags: review?(sphink) → review+
Could you change the worker assert to check that it is either on the main thread or a JS worker thread? Or maybe at least that it isn't on a worker thread?
Comment on attachment 8875365 [details] [diff] [review]
bug1369748-parallelise-begin-mark-phase

Review of attachment 8875365 [details] [diff] [review]:
-----------------------------------------------------------------

I'm sure there *must* be something horribly wrong with this patch, but I'm not seeing it. Looks good to me.

::: dom/workers/WorkerPrivate.cpp
@@ -4033,5 @@
>  
>  template <class Derived>
>  NS_IMPL_CYCLE_COLLECTION_TRACE_BEGIN_INHERITED(WorkerPrivateParent<Derived>,
>                                                 DOMEventTargetHelper)
> -  tmp->AssertIsOnParentThread();

How hard would it be to tmp->AssertIsOnParentThreadOrPreparingForGC() or something?

::: js/src/gc/GenerateStatsPhases.py
@@ +79,5 @@
> +        PhaseKind("UNMARK", "Unmark", 7),
> +        PhaseKind("BUFFER_GRAY_ROOTS", "Buffer Gray Roots", 49),
> +        PhaseKind("MARK_DISCARD_CODE", "Mark Discard Code", 3),
> +        PhaseKind("RELAZIFY_FUNCTIONS", "Relazify Functions", 4),
> +        PhaseKind("PURGE", "Purge", 5),

Heh. Now the telemetry numbering serves as a history of how this code used to work.
Attachment #8875365 - Flags: review?(sphink) → review+
Whiteboard: [qf] → [qf:p1]
Updated to remove another DOM assertion that tracing happens on the main thread.
Attachment #8875365 - Attachment is obsolete: true
Attachment #8876050 - Flags: review+
Comment on attachment 8876050 [details] [diff] [review]
bug1369748-parallelise-begin-mark-phase v2

Requesting additional review for DOM changes to remove assertions that tracing happens on a specific thread, and to check you're happy with this change.
Attachment #8876050 - Flags: review?(bugs)
Attachment #8876050 - Flags: review?(bugs) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aafdd9bcceaf
Refactor GCRuntime::beginMarkPhase r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/825e71dae9bf
Parallelise the start of GC marking r=sfink r=smaug
https://hg.mozilla.org/mozilla-central/rev/aafdd9bcceaf
https://hg.mozilla.org/mozilla-central/rev/825e71dae9bf
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Performance Impact: --- → P1
Whiteboard: [qf:p1]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: