Closed Bug 1258172 Opened 8 years ago Closed 8 years ago

Super janky scrolling in inverted allocations tree

Categories

(DevTools :: Memory, defect, P2)

defect

Tracking

(firefox48 fixed)

RESOLVED FIXED
Firefox 48
Tracking Status
firefox48 --- fixed

People

(Reporter: fitzgen, Assigned: fitzgen)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

STR:

* Open memory tool on irccloud.com
* Enable recording allocations
* Refresh
* Take snapshot
* View -> "aggregates"
* Group by -> "inverted call stack"

Profile: http://media.fitzgeraldnick.com/perf-profile-of-memory-profiling-irccloud.json

Tons of time in minor gc, and some in major gc too but not much. I think this is a fishy amount of minor gc'ing, its just too much time.
Has STR: --- → yes
~40% of time in nursery collections

~16% of time in major GC
Repeatedly getting ~10ms nursery collections inside Tree#_dfsFromRoots. Nursery collections shouldn't usually be more than 1-2ms or so.

ni myself to re-test with JS_GC_PROFILE_NURSERY=1 env variable to see where nursery time is being spent
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
New STR:

* Open memory tool
* View -> "aggregate"
* Group by -> "Inverted call stack"
* Import this snapshot: http://media.fitzgeraldnick.com/dumping-grounds/irccloud-with-stacks.fxsnapshot

----------------------------------------------------

If I change Tree#_dfsFromRoots' for/of loop into a C-style for loop, then performance improves quite a bit, but is still fairly suboptimal.

This is worrying because the temporary iteration protocol objects are /exactly/ the kinds of objects that generational GC is supposed to make "almost free" to allocate.
Here is the results with `JS_GC_PROFILE_NURSERY=5000`:

> MinorGC:               Reason  PRate Size    Time canIon mkVals mkClls mkSlts mcWCll mkGnrc ckTbls mkRntm mkDbgr clrNOC collct  tenCB swpABO updtIn frSlts  clrSB  sweep resize pretnr logPtT
> MinorGC:    FULL_STORE_BUFFER   1.1%    9   12636      0    129      0    106      1  11650      0    118      0      2     88      1     10      3      0     10    517      1      0
> MinorGC:    FULL_STORE_BUFFER  11.8%   16    5554      0    431    283    152     13    660      0    102      0      1   3360      1     14      4      5     15    510      0      0
> MinorGC:    FULL_STORE_BUFFER   0.7%   14    6064      0    151      1     86      1   4883      0     60      0      1     48      0      4      2      0      9    720     98      0
> MinorGC:    FULL_STORE_BUFFER  11.6%   16    5671      0    246    201     76      8    364      0     93      0      1   3532      3     15      7      9     10   1105      0      0
> MinorGC:    FULL_STORE_BUFFER   1.4%    7   23299      0    127      1     74      0  22531      0     87      1      0     84      0      5      2      0     12    374      1      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9   21642      0    102      0     82      0  20882      0     75      0      1     59      1      3      2      0     10    424      1      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9    5353      0    105      0     94      1   4620      0     55      0      1     39      0      3      1      0      9    423      0      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9    5432      1    106      0     73      1   4689      1     60      1      0     44      1      4      2      0      8    441      0      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9    9130      0    117      1     87      1   8330      0     76      1      0     58      0      4      3      0     10    442      0      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9   21082      0    103      0     82      1  20320      0     77      1      0     61      0      4      1      0     10    420      0      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9   10835      0    120     11     89      1  10074      0     63      0      0     40      1      4      1      0      9    420      1      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9   20792      0    129      0     96      1  19975      0     78      0      1     58      0      4      2      0      9    437      1      0
> MinorGC:    FULL_STORE_BUFFER   1.1%    9  169353      0    106      0     92      1 168437      0    102      1      1     93      1      5      4      0     13    494      0      0

Looks like mkGnrc is dominating, which is similar to bug 1237058.
Seems to involve a lot of CCW wrapping, just like bug 1237058. Seems to be the same nursery perf cliff.
Flags: needinfo?(nfitzgerald)
See Also: → 1237058
Terrence, I remember you telling me how using stable hashing for the CCW map (bug 1225577) could help with reducing mkGnrc time, but I have completely forgotten why/how that might be. Can you re-enlighten me? Thanks!
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #6)
> Terrence, I remember you telling me how using stable hashing for the CCW map
> (bug 1225577) could help with reducing mkGnrc time, but I have completely
> forgotten why/how that might be. Can you re-enlighten me? Thanks!

Terrence: ^
Flags: needinfo?(terrence)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #6)
> Terrence, I remember you telling me how using stable hashing for the CCW map
> (bug 1225577) could help with reducing mkGnrc time, but I have completely
> forgotten why/how that might be. Can you re-enlighten me? Thanks!

It would help because we're moving the update from the generic buffer to the normal store buffer. This allows for deduplication and avoids a ton of complexity in the update loop. Given that inlining the normal update loop saved us 60% of the time we were spending there, I think this has the potential to be a huge win.

I've already got patches doing this. I'll dust those off and re-request review today.
Flags: needinfo?(terrence)
Note: I don't plan on landing this until I can confirm that Terrence's ongoing
nursery + CCW work helps in this Real World test case.
Attachment #8734121 - Flags: review?(jimb)
Assignee: nobody → nfitzgerald
Status: NEW → ASSIGNED
Comment on attachment 8734121 [details] [diff] [review]
Use C-style for loops instead of for/of to cut down on the number of CCWs when using the Tree component

Review of attachment 8734121 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine to me, with no changes. But:

- If you don't know that it's worth it to pull `length` out of the loop like that, it's pretty distracting.

- Some of these could become `.forEach` calls and not be quite so ugly.

Up to you.
Attachment #8734121 - Flags: review?(jimb) → review+
Terrence, we're getting a little close to uplift, so I want to land this before then. We can always locally back out this patch to retest your nursery + wrappers changes.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/690bbfa6023c
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.