Deallocating lots of pipes is slow

NEW
Unassigned

Status

()

Core
Networking
P3
normal
3 years ago
4 months ago

People

(Reporter: bz, Unassigned)

Tracking

(Blocks: 1 bug, {perf})

Trunk
x86
Mac OS X
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [necko-backlog])

Attachments

(2 attachments, 1 obsolete attachment)

See the testcase in bug 1149235.

On that testcase, we're spending over 1s under ~nsPipe for all those data: channels (triggered via nsInputStreamReadyEvent::Release releasing the pipe input stream, which release the pipe.

All of this time is under arena_dalloc.  A bunch of it is under _platform_bzero and the rest under arena_run_dalloc/arena_purge/etc.

At a guess, this is the memory being freed by nsSegmentedBuffer, but I'm not sure...

In any case, we're talking about 100000 pipes here, so 1s means 10us per destructor.  That's a fair amount of time...
Er, I meant bug 1148897.
Blocks: 1148897
No longer blocks: 1149235

Comment 2

3 years ago
Isn't this a direct trade off of doing a single large allocation and many small segmented allocations?

In theory the segments should get free'd as the data is read from the pipe, though.  Do you expect pipes with a lot of unread data left in them in that test?
Nope.  That test should have nothing of the sort.  The pipes involved should be tiny. They all come from this bit:

    a.src = "data:text/javascript,++j";

so they contain 3 bytes each.  And they get read in their entirety, of course.

Comment 4

3 years ago
I guess we trigger a minimum of two deallocations for every ~nsSegmentBuffer:

  if (mSegmentArray) {
    for (uint32_t i = 0; i < mSegmentArrayCount; i++) {
      if (mSegmentArray[i]) {
        moz_free(mSegmentArray[i]);
      }
    }
    nsMemory::Free(mSegmentArray);
    mSegmentArray = nullptr;
  }

Also, it looks like nsMemory::Free() goes through a function pointer vs xpcomFunctions.freeFunc().

Comment 5

3 years ago
Do you have a Cleopatra link?
Flags: needinfo?(bzbarsky)

Comment 6

3 years ago
Seems like ~nsSegmentBuffer could just iterate from mFirstIndexBuffer to mLastIndexBuffer instead of the entire mSegmentArray.
> Do you have a Cleopatra link?

No, this was an Instruments profile.
Flags: needinfo?(bzbarsky)

Updated

3 years ago
Assignee: nobody → bkelly
Status: NEW → ASSIGNED

Comment 8

3 years ago
Created attachment 8586183 [details] [diff] [review]
P1 Add a test to benchmark pipe destruction.

Here is the benchmark I wrote to try to figure out what works or not.

Boris, for reference, what does your machine get on this test?

  ./mach gtest Pipes.DestroyPerf

Thanks.
Attachment #8586183 - Flags: feedback?(bzbarsky)
Comment on attachment 8586183 [details] [diff] [review]
P1 Add a test to benchmark pipe destruction.

Pipes.DestroyPerf took total:172786.342000us mean:1.727863us
Pipes.DestroyPerf took total:185193.900000us mean:1.851939us
Pipes.DestroyPerf took total:172961.337000us mean:1.729613us
Pipes.DestroyPerf took total:167264.762000us mean:1.672648us
Pipes.DestroyPerf took total:171641.188000us mean:1.716412us
Attachment #8586183 - Flags: feedback?(bzbarsky) → feedback+

Comment 10

3 years ago
These results are pretty different from 10us referenced in comment 0.  This seems to suggest there is something else slowing down the deallocations.  Perhaps the other test fragments the heap quite a bit.

Comment 11

2 years ago
I'm not actively working this.  I'm going to upload the patches wip patch I had to get it out of my local queue.
Assignee: bkelly → nobody
Status: ASSIGNED → NEW

Comment 12

2 years ago
Created attachment 8704678 [details] [diff] [review]
P1 Add a test to benchmark pipe destruction.
Attachment #8586183 - Attachment is obsolete: true
Whiteboard: [necko-backlog]
You need to log in before you can comment on or make changes to this bug.