Closed Bug 1599604 Opened 6 years ago Closed 6 years ago

Map usage in Network monitor is slow and causes massive GC

Categories

(DevTools :: Netmonitor, defect, P2)

Tracking

(firefox73 fixed)

RESOLVED FIXED
Firefox 73

People

(Reporter: Harald, Assigned: Harald)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 2 obsolete files)

Follow up from bug 1557862.

STR: Open Network panel on https://firefox-devtools-log-a-lot.glitch.me/ and hit the net* button.

On Windows, Firefox didn't even recover enough to successfully record a profile. Getting a profile on OSX took minutes.

https://perfht.ml/2DjYuD1

On a long request list, MapConstructorInit and the GC it causes take up 100% of the time, in this case 51s uninterrupted: https://perfht.ml/37I5E1E (50/50 JS/GC)

Why Map/Set are not for Redux:

  • Map/Set are only optimized for read/write access
  • Creating/destroying them is expensive compared to Array/Object
  • Map/Set are not held in the nursery

This change could be the single-biggest improvement in Network panel, both for memory use and general responsiveness.

Assignee: nobody → hkirschner
Attachment #9115067 - Attachment is obsolete: true

Before/after comparison: https://perfht.ml/2EaYGVs – 16s down to 9.5s on https://firefox-devtools-log-a-lot.glitch.me/ (going from 3k to 4k net logs)

Some notes from the analysis https://twitter.com/FirefoxDevTools/status/1204625601627213825

Summary: Map/Set usage in Network monitor is slow and causes massive GC → Map usage in Network monitor is slow and causes massive GC

From a quick look at the code it seems that the mapSet() function was used to copy the Map every time a new entry was added (in addRequest). If the map gets large then populating it in this way will be slow (I guess it's O(n^2) in the number of entries).

The updated code uses arrays, but still copies the array to add entries.
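For illustration, a minimal sketch (hypothetical code, not the actual netmonitor reducer) of why the copy-per-action Map pattern described above is O(n^2) overall:

```javascript
// Hypothetical helper mirroring the mapSet() pattern described above:
// each added entry copies the entire Map first.
function mapSet(map, key, value) {
  const newMap = new Map(map); // O(n) copy + a fresh allocation per action
  newMap.set(key, value);
  return newMap;
}

// Adding n entries copies 0 + 1 + ... + (n - 1) entries, O(n^2) total,
// and every intermediate Map becomes garbage immediately.
let requests = new Map();
for (let i = 0; i < 1000; i++) {
  requests = mapSet(requests, "req" + i, { id: i });
}
console.log(requests.size); // 1000
```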

@Harald, what if we created a wrapper object for the array (MyFastImmutableArray) that mutates when a new request is appended to it? The internal (wrapped) Array would not be cloned; we'd just push entries at the end (or remove custom requests). We would have to make more changes in the code (not that many), but the impact on perf could be even bigger. Happy to help if you want me to.

Honza

(In reply to :Harald Kirschner :digitarald from comment #0)

Why Map/Set are not for Redux:

  • Map/Set are only optimized for read/write access
  • Creating/destroying them is expensive compared to Array/Object
  • Map/Set are not held in the nursery

I won't claim that Map should be ok for Redux, but this analysis isn't quite right. Map/Set are allocated in the nursery.

When I look at the before/after profiles, I think what's happening is that the Map just uses enough more memory for the same data. It's enough that it exceeds the max nursery size, and lots of short-lived data starts getting tenured simply because the nursery is full. That in turn bloats up the tenured heap, and triggers quite a few major GCs. (In the before profile, there's a constant stream of minor GCs due to OUT_OF_NURSERY, and a regular cadence of major GCs as well, resulting in a sawtoothed memory graph that slowly climbs. The new profile mostly just climbs gradually, with smaller spaced-out drops corresponding to a handful of major GCs.)

I guess there's a question as to why Maps are so much bigger than Arrays (of just the values). I would expect it to be a small factor larger (4-6x, maybe?). A more "fair" comparison would probably be to switch to a plain Object instead of an Array. Though if these action.id's are dense, I would guess the best option would be an Array like you're switching it to here, but use the action.id as the index:

function addValue_was_mapSet(collection, key, value) {
    const newCollection = collection.slice(0);
    newCollection[key] = value;
    return newCollection;
}

and then use the in operator to check for an entry instead of the linear search in your above patch. But again, that only makes sense if action.id is dense.
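A sketch of that dense-id variant (getValue is a hypothetical name; addValue_was_mapSet is repeated from above so the example is self-contained):

```javascript
// Copy-on-write, with action.id used directly as the array index.
function addValue_was_mapSet(collection, key, value) {
  const newCollection = collection.slice(0);
  newCollection[key] = value;
  return newCollection;
}

// O(1) lookup via the in operator instead of a linear findIndex scan.
function getValue(collection, key) {
  return key in collection ? collection[key] : undefined;
}

const c1 = addValue_was_mapSet([], 0, "a");
const c2 = addValue_was_mapSet(c1, 3, "b"); // sparse works; dense is better
console.log(getValue(c2, 3)); // "b"
console.log(getValue(c2, 2)); // undefined (hole, never set)
```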

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #5)

@Harald, what if we created a wrapper object for the array (MyFastImmutableArray) that mutates when a new request is appended to it? The internal (wrapped) Array would not be cloned; we'd just push entries at the end (or remove custom requests). We would have to make more changes in the code (not that many), but the impact on perf could be even bigger. Happy to help if you want me to.

The only part that sounds funky about that is removing the custom requests, since that means the older copies are no longer immutable.

If you need to preserve true immutability, you could have one "global" Array with all of the data, plus an Array or Map of removed ids mapped to a generation. ImmutableArray would then be a <length, generation> tuple. Lookup would be something like

function lookup(globals, immutable, key) {
    const gravestone = globals.removed[key];
    if (gravestone && gravestone <= immutable.generation)
        return undefined;
    return globals.data[key];
}

function set(globals, immutable, key, value) {
    const { generation, length } = immutable;
    globals.data[key] = value;
    return { generation: generation + 1, length: globals.data.length };
}

function remove(globals, immutable, key) {
    const { generation: oldGeneration, length } = immutable;
    const generation = oldGeneration + 1;
    globals.removed[key] = generation;
    return { generation, length };
}

Except that this never uses the length; it relies on globals.data, so really your "immutable Array" could just be a numeric generation.

The problem with this approach is that you can never garbage collect the removed values without potentially invalidating older immutable arrays you have floating around. So perhaps Honza's destructive approach is preferable, since in practice you probably don't care about random old arrays lying around anyway?

I think we can use a generation like you propose, but make it more explicit, stored in a reducer. And make the map mutable. Then everywhere we use this map we can implement a shouldComponentUpdate method that checks the generation but not the map.
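A framework-free sketch of that idea (names are hypothetical): mutate the map in place, bump a generation counter in the reducer, and have consumers compare generations only, standing in for shouldComponentUpdate:

```javascript
// Mutable store with an explicit generation counter.
function createStore() {
  return { requests: new Map(), generation: 0 };
}

function addRequest(store, id, request) {
  store.requests.set(id, request); // mutate in place: no Map copy per action
  store.generation++;              // signal the change to consumers
}

// Stand-in for a component's shouldComponentUpdate: compare the cheap
// generation number, never the (mutable) map itself.
function shouldUpdate(lastSeenGeneration, store) {
  return store.generation !== lastSeenGeneration;
}

const store = createStore();
const seen = store.generation;
addRequest(store, "r1", { url: "https://example.com" });
console.log(shouldUpdate(seen, store)); // true
console.log(shouldUpdate(store.generation, store)); // false, nothing changed
```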

A lot of time is spent here on making ES6 Map work. Is there a benefit to having Map in the first place? AFAIK, to display the data, selector functions sort and filter it, which is all done on arrays with array methods, so I have not seen clear advantages.

(In reply to :Harald Kirschner :digitarald from comment #9)

A lot of time is spent here on making ES6 Map work. Is there a benefit to having Map in the first place? AFAIK, to display the data, selector functions sort and filter it, which is all done on arrays with array methods, so I have not seen clear advantages.

Nah, the discussion kind of forks. One thread is "why is Map so slow here?" The other is more a reaction to seeing the Array solution in the patch that replaces O(1) lookup with O(n). Which might be totally fine, if n is going to be small. All of comment 5 and comment 7, and the last part of comment 6, are about various Array-based implementations.

Any solution that uses Map would depend on hypothetical improvements to Map performance and/or space usage that might be generally valuable, but are wholly theoretical and unplanned right now.

The problem is that n is not gonna be small in a lot of cases, that's why I'm a bit worried about the solution with arrays too. But it's totally true that copying the Map like that is gonna be super slow when n is big as well. This should improve the situation when adding new requests, which probably is the thing Harald was measuring.

But then there are the cases where we select/right-click/update the requests, and that's where the lookups are. I'm not too concerned about the select and right-click cases (those only happen once per user action), but I am more concerned about the request updates, because updateRequest seems to be called in a variety of situations. So it would be a good idea to check if this is happening often...
So we can likely move forward with the solution, but not stop there, and continue to measure where the performance problems are.

A few random thoughts I got when looking at the code (but we need to measure first, to see if lookups are a problem!):

  • we could decide never to filter things out of the array and instead replace values with null, letting the array only grow. The "delete" case seems super rare, so I believe this wouldn't be bad. This would allow us to keep a mutable map <requestId, arrayId> for lookups, clearing the cache only when clearing the list. The map wouldn't be used anywhere other than in the reducer.

  • we could make sure that the array is always sorted by the id (maybe that's already the case, if the id generation is monotonic). Then we could bisect to find the right request in the array, which would be super fast! I see there may be a problem because of the clones; maybe we can keep the clones in a separate array to avoid this issue, or make sure we insert them in the right place when we create them, using bisect too.
    I also saw that in the selectors we sort the requests, so it would also avoid this extra step if the request list is already sorted!
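A sketch of that bisect lookup, assuming the array stays sorted by a monotonically increasing numeric id (helper name is hypothetical):

```javascript
// Binary search over requests sorted ascending by id: O(log n) per
// lookup instead of findIndex's O(n) scan.
function findRequestIndex(requests, id) {
  let lo = 0;
  let hi = requests.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (requests[mid].id === id) return mid;
    if (requests[mid].id < id) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1; // not found
}

const list = [{ id: 1 }, { id: 4 }, { id: 9 }, { id: 12 }];
console.log(findRequestIndex(list, 9));  // 2
console.log(findRequestIndex(list, 5));  // -1
```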

My preference would be the always-sorted list because it feels right :-)

@harald: in your patch, I believe you could also change the selectors/requests file to remove the .values calls and avoid an unneeded iteration.

Lookup isn't the offender in the JS, as you can see in the compare view.

This should improve the situation when adding new requests, which probably is the thing Harald was measuring.

It's also what I thought initially, looking at the first profile. Interestingly, the general problem is less about add: updateRequest is called each time Necko has new data on a request, so one request triggers just one add, followed by multiple updates. This makes update much more bursty and impactful in https://perfht.ml/2EaYGVs .

@harald: in your patch, I believe you could also change the selectors/requests file to remove the .values calls and avoid an unneeded iteration.

Yes, I have local patches that go into more references. I screwed up hg again and it missed some files in this patch, which I have to look into.

Ah, that's interesting!

I focused on updateRequest in Profile 2 in https://perfht.ml/35wz6Xe.

When looking at the "Profile 2" thread, we definitely see that findIndex is quite present (380ms).

Because updateRequest probably applies to the most recently added requests, maybe a simple fix would be to iterate starting at the end, instead of using findIndex, which starts at 0.
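A sketch of that reverse scan (helper name is hypothetical):

```javascript
// updateRequest mostly touches recently added requests, which live near
// the end of the array, so scanning backwards finds them quickly.
function findIndexFromEnd(requests, id) {
  for (let i = requests.length - 1; i >= 0; i--) {
    if (requests[i].id === id) return i;
  }
  return -1;
}

const reqs = [{ id: "a" }, { id: "b" }, { id: "c" }];
console.log(findIndexFromEnd(reqs, "c")); // 2, found on the first probe
```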

We can also see that next is present (89ms in that function). I wonder if using .slice() instead of the [...] notation (that uses the iterator protocol) would be faster.
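For reference, the two copy styles being compared; both produce a shallow copy, but .slice() avoids going through the iterator protocol (the next calls seen in the profile):

```javascript
const requests = [{ id: 1 }, { id: 2 }, { id: 3 }];

const viaSpread = [...requests];   // drives the array's iterator (next calls)
const viaSlice = requests.slice(); // direct element copy, no iterator

console.log(viaSlice.length === viaSpread.length); // true
console.log(viaSlice[0] === requests[0]);          // true, shallow copy
```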

Attachment #9115299 - Attachment is obsolete: true
Pushed by hkirschner@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4bc6c728cffa Store netmonitor requests in Array vs Map r=Honza
Blocks: 1606740
Regressions: 1606749
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 73
