Open Bug 1110596 Opened 10 years ago Updated 2 years ago

All data passing through HttpChannelParent::OnDataAvailable() is copied

Categories

(Core :: Networking: HTTP, defect, P3)

defect

Tracking

()

Tracking Status
e10s + ---

People

(Reporter: n.nethercote, Unassigned)

References

Details

(Whiteboard: [necko-backlog])

With unimportant stuff stripped away, HttpChannelParent::OnDataAvailable()
looks like this:

> NS_IMETHODIMP
> HttpChannelParent::OnDataAvailable(nsIRequest *aRequest,
>                                    nsISupports *aContext,
>                                    nsIInputStream *aInputStream,
>                                    uint64_t aOffset,
>                                    uint32_t aCount)
> {
>   nsCString data;
>   nsresult rv = NS_ReadInputStreamToString(aInputStream, data, aCount);
>   if (NS_FAILED(rv))
>     return rv;
> 
>   nsresult channelStatus = NS_OK;
>   mChannel->GetStatus(&channelStatus);
> 
>   if (mIPCClosed || !SendOnTransportAndData(channelStatus, mStoredStatus,
>                                             mStoredProgress, mStoredProgressMax,
>                                             data, aOffset, aCount)) {
>     return NS_ERROR_UNEXPECTED;
>   }
>   return NS_OK;
> }

|aCount| varies a lot, from hundreds to millions. Here are the top 10 values
(ranked by frequency) after running Massive (the asm.js benchmark suite):

> (  1)       68 (26.2%, 26.2%): OnDataAvailable: 524288
> (  2)       30 (11.5%, 37.7%): OnDataAvailable: 3022
> (  3)       27 (10.4%, 48.1%): OnDataAvailable: 172166
> (  4)       21 ( 8.1%, 56.2%): OnDataAvailable: 426487
> (  5)       15 ( 5.8%, 61.9%): OnDataAvailable: 722
> (  6)       15 ( 5.8%, 67.7%): OnDataAvailable: 786432
> (  7)       13 ( 5.0%, 72.7%): OnDataAvailable: 1065
> (  8)        7 ( 2.7%, 75.4%): OnDataAvailable: 239431
> (  9)        7 ( 2.7%, 78.1%): OnDataAvailable: 1048576
> ( 10)        6 ( 2.3%, 80.4%): OnDataAvailable: 738

The call to NS_ReadInputStreamToString() creates a new string with a buffer
whose capacity is |aCount|, and then puts the stream data into that buffer.

The call to SendOnTransportAndData() then copies that data from the string
buffer into a Pickle buffer -- possibly requiring the Pickle buffer to be
resized -- along with |channelStatus| et al.

It would be nice if we could avoid this copying, e.g. by resizing the Pickle
buffer to the right size and then streaming the new data directly into it.
IPDL's code generation makes this non-trivial, though.

Here's the cumulative heap profiling data that pointed me at this. This is all
in the parent process, and is from running Massive.

> Cumulative {
>   ~136 blocks in heap block record 1 of 6,907
>   ~113,598,461 bytes (~113,096,829 requested / ~501,632 slop)
>   Individual block sizes: 2,625,536 x 4; 2,101,248 x 3; 1,576,960 x 11; 1,052,672 x 51; 528,384 x 48; 110,592; 69,632; 53,248; 20,480 x 2; 12,288 x 3; 8,192 x 10; ~4,093
>   11.26% of the heap (11.26% cumulative)
>   Allocated at {
>     #01: Pickle::Resize(unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:635)
>     #02: Pickle::BeginWrite(unsigned int, unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:513)
>     #03: Pickle::WriteBytes(void const*, int, unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:558)
>     #04: Pickle::WriteUInt64(unsigned long) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.h:139)
>     #05: mozilla::net::PHttpChannelParent::SendOnTransportAndData(tag_nsresult const&, tag_nsresult const&, unsigned long const&, unsigned long const&, nsCString const&, unsigned long const&, unsigned int const&) (/home/njn/moz/mi4/o64dmd/ipc/ipdl/./PHttpChannelParent.cpp:112)
>     #06: mozilla::net::HttpChannelParent::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int) (/home/njn/moz/mi4/o64dmd/netwerk/protocol/http/../../../../netwerk/protocol/http/HttpChannelParent.cpp:765)
>   }
> }
> 
> Cumulative {
>   ~1,827 blocks in heap block record 2 of 6,907
>   ~95,780,229 bytes (~94,957,030 requested / ~823,199 slop)
>   Individual block sizes: 1,314,816 x 4; 1,290,240 x 4; 1,052,672 x 3; 1,028,096; 954,368 x 2; 790,528 x 11; 765,952 x 3; 700,416 x 2; 692,224 x 3; 528,384 x 51; 503,808 x 6; 438,272 x 6; 430,080 x 13; 266,240 x 48; 241,664 x 2; 204,800 x 3; 176,128 x 23; 167,936 x 7; 135,168; 126,976 x 4; 57,344; 36,864; 28,672; 20,480 x 2; 12,288 x 3; 8,192 x 4; 4,096 x 41; ~4,093 x 1,577
>   9.50% of the heap (20.76% cumulative)
>   Allocated at {
>     #01: nsStringBuffer::Alloc(unsigned long) (/home/njn/moz/mi4/o64dmd/xpcom/string/../../../xpcom/string/nsSubstring.cpp:219)
>     #02: nsACString_internal::MutatePrep(unsigned int, char**, unsigned int*) (/home/njn/moz/mi4/o64dmd/xpcom/string/../../../xpcom/string/nsTSubstring.cpp:133)
>     #03: nsACString_internal::SetCapacity(unsigned int, mozilla::fallible_t const&) (/home/njn/moz/mi4/o64dmd/xpcom/string/../../../xpcom/string/nsTSubstring.cpp:626)
>     #04: nsACString_internal::SetLength(unsigned int, mozilla::fallible_t const&) (/home/njn/moz/mi4/o64dmd/xpcom/string/../../../xpcom/string/nsTSubstring.cpp:664)
>     #05: NS_ReadInputStreamToString(nsIInputStream*, nsACString_internal&, unsigned int) (/home/njn/moz/mi4/o64dmd/netwerk/protocol/ftp/../../../dist/include/nsNetUtil.h:1630)
>     #06: mozilla::net::HttpChannelParent::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int) (/home/njn/moz/mi4/o64dmd/netwerk/protocol/http/../../../../netwerk/protocol/http/HttpChannelParent.cpp:754)
>   }
> }
> 
> Cumulative {
>   ~233 blocks in heap block record 3 of 6,907
>   ~88,956,907 bytes (~88,200,939 requested / ~755,968 slop)
>   Individual block sizes: 1,314,816 x 4; 1,290,240 x 4; 1,052,672 x 3; 1,028,096; 954,368 x 2; 790,528 x 11; 765,952 x 3; 700,416 x 2; 692,224 x 3; 528,384 x 51; 503,808 x 6; 438,272 x 6; 430,080 x 13; 266,240 x 48; 241,664 x 2; 204,800 x 3; 176,128 x 23; 167,936 x 7; 135,168; 126,976 x 3; 12,288; 4,096 x 29; ~4,093 x 7
>   8.82% of the heap (29.58% cumulative)
>   Allocated at {
>     #01: Pickle::Resize(unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:635)
>     #02: Pickle::BeginWrite(unsigned int, unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:513)
>     #03: Pickle::WriteBytes(void const*, int, unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:558)
>     #04: mozilla::net::PHttpChannelParent::SendOnTransportAndData(tag_nsresult const&, tag_nsresult const&, unsigned long const&, unsigned long const&, nsCString const&, unsigned long const&, unsigned int const&) (/home/njn/moz/mi4/o64dmd/ipc/ipdl/./PHttpChannelParent.cpp:111)
>     #05: mozilla::net::HttpChannelParent::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int) (/home/njn/moz/mi4/o64dmd/netwerk/protocol/http/../../../../netwerk/protocol/http/HttpChannelParent.cpp:765)
>     #06: mozilla::net::nsHttpChannel::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int) (/home/njn/moz/mi4/o64dmd/netwerk/protocol/http/../../../../netwerk/protocol/http/nsHttpChannel.cpp:5657)
>   }
> }
> 
> Cumulative {
>   ~27 blocks in heap block record 6 of 6,907
>   ~21,409,786 bytes (~21,336,186 requested / ~73,600 slop)
>   Individual block sizes: 1,904,640 x 2; 1,380,352 x 3; 856,064 x 13; 331,776 x 7; ~4,093 x 2
>   2.12% of the heap (39.12% cumulative)
>   Allocated at {
>     #01: Pickle::Resize(unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:635)
>     #02: Pickle::BeginWrite(unsigned int, unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:513)
>     #03: Pickle::WriteBytes(void const*, int, unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.cc:558)
>     #04: Pickle::WriteUInt32(unsigned int) (/home/njn/moz/mi4/o64dmd/ipc/chromium/../../../ipc/chromium/src/base/pickle.h:133)
>     #05: mozilla::net::PHttpChannelParent::SendOnTransportAndData(tag_nsresult const&, tag_nsresult const&, unsigned long const&, unsigned long const&, nsCString const&, unsigned long const&, unsigned int const&) (/home/njn/moz/mi4/o64dmd/ipc/ipdl/./PHttpChannelParent.cpp:120)
>     #06: mozilla::net::HttpChannelParent::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int) (/home/njn/moz/mi4/o64dmd/netwerk/protocol/http/../../../../netwerk/protocol/http/HttpChannelParent.cpp:765)
>   }
> }
Whiteboard: [necko-backlog]
Avoiding this intermediate string in HttpChannelParent::OnDataAvailable() could reduce peak memory usage. (I'm not making this blocking bug 1262918 because it would not reduce the size of the message.) This looks like it would be easy to do on the caller side. The tricky part would be dealing with the message on the receiving side, which bounces around the raw data to a few places.
Blocks: e10s-oom
See Also: → 1263028
I think the patch in bug 1263028 would mostly mitigate the memory impact of this copy, by capping the amount of data that gets sent at a time.
Flags: needinfo?(kchen)
The slickest way to fix this would be to add a ParamTraits<> for nsIInputStream.

This could then be used in all of the places where nsIInputStream is converted to a string to send to the other process (many of these places): http://mxr.mozilla.org/mozilla-central/ident?i=NS_ReadInputStreamToString

The problem is that reading from an nsIInputStream is fallible, judging by the definition of NS_ReadInputStreamToString(), and the ParamTraits::Write() method is infallible. Assuming that can't be ignored, some IPDL codegen work would be needed to handle a fallible Write() method.

I don't think this is a big deal for overall stability, but it would be nice.
(In reply to Andrew McCreight [:mccr8] from comment #3)
> The problem is that reading from an nsIInputStream is fallible, judging by
> the definition of NS_ReadInputStreamToString(), and the ParamTraits::Write()
> method is infallible. Assuming that can't be ignored, some IPDL codegen work
> would be needed to handle a fallible Write() method.

FTPChannelParent::OnDataAvailable(),
HttpChannelParent::OnDataAvailable(),
WyciwygChannelParent::OnDataAvailable(), and
ExternalHelperAppChild::OnDataAvailable()

return just the error code from NS_ReadInputStreamToString() if it fails. So the FatalError() for !Read() is not needed for !Write(), am I right?
Assignee: nobody → tchou
Flags: needinfo?(kchen)
andrew, is this something we still need?
Flags: needinfo?(continuation)
(In reply to Jim Mathies [:jimm] from comment #5)
> andrew, is this something we still need?

It would be a nice improvement, though less critical now that Http only sends 128kb at a time.
Flags: needinfo?(continuation)
Priority: -- → P1
Assignee: tchou → janus926
Which type should be used in the receiver if add a ParamTraits<> for nsIInputStream?

If it's nsIInputStream, then the copy is just moved from the sender to the receiver.
If it's nsCString, then SendOnTransportAndData() and RecvOnTransportAndData() will have different parameter types even ipdl specifies nsIInputStream, would that be ok?
:mccr8, what do you think about comment 7? Thanks.
Flags: needinfo?(continuation)
It seems like nsIInputStream would be better. This code is already making a copy of the string, so it wouldn't be making anything worse, as far as I can tell.

That said, it might be better to spend that time helping out Ben Kelly make it so that bug 1093357 can be used to send nsIInputStream data from the parent to the child, as he talked about in his recent dev.platform post. Then all of the networking protocols could be improved to use that instead. Assuming that is something that makes sense.

(In reply to Ting-Yu Chou [:ting] from comment #4)
> return just the error code from NS_ReadInputStreamToString() if it fails. So
> the FatalError() for !Read() is not needed for !Write(), am I right?

It doesn't need to be fatal, but I'd think there needs to be some way to propagate this error back to the caller.

Also, I should say that I don't know anything about networking code specifically. Hopefully I am not giving you any bad advice.
Flags: needinfo?(continuation)
(In reply to Andrew McCreight [:mccr8] from comment #9)
> It seems like nsIInputStream would be better. This code is already making a
> copy of the string, so it wouldn't be making anything worse, as far as I can
> tell.

But if the goal is to remove the copying from calling NS_ReadInputStreamToString()...

> That said, it might be better to spend that time helping out Ben Kelly make
> it so that bug 1093357 can be used to send nsIInputStream data from the
> parent to the child, as he talked about in his recent dev.platform post.

I'll check that see if it could get rid of the copying.
(In reply to Ting-Yu Chou [:ting] from comment #10)
> > That said, it might be better to spend that time helping out Ben Kelly make
> > it so that bug 1093357 can be used to send nsIInputStream data from the
> > parent to the child, as he talked about in his recent dev.platform post.
> 
> I'll check that see if it could get rid of the copying.

No, it still has the copying in sender.

In SendStreamChildImpl::DoRead() [1], it calls Read() to copy the stream's data into a nsCString buffer and passes it to SendBuffer() which calls Write() to copy it to the Pickle buffer.

And if we don't change the receiver to take a nsIInputStream instead of nsCString, we will need another copying. So I don't think bug 1093357 can help here.

However it sends the data in 32k chunk at maximum, so it brings benefits like bug 1263028.

[1] https://dxr.mozilla.org/mozilla-central/rev/c4449eab07d39e20ea315603f1b1863eeed7dcfe/ipc/glue/SendStreamChild.cpp#276-282
If we go ParamTraits<nsIInputStream>, how many bytes need to read from the input stream in Write()? HttpChannelParent::OnDataAvailable sends chunks of 128k and SendStreamChildImpl::DoRead sends chunks of 32k, so I assume the size need to be client dependent, which somehow Write() needs to take the "count" parameter from Send*() and I think doing this in code generator probably won't be straightforward.

So I came up two other ideas to avoid the copying:

1) Add a "customize" keyword for IPDL, which the function won't be generated automatically. Will need to write the function manually instead, so we can do more complicated things like:

   auto PHttpChannelParent::SendOnTransportAndData(
           ...
           const uint32_t& count,
           const nsIInputStream& stream) -> bool
   {
       ...
       msg__->BeginWrite(count, ...);  // Resize to store the stream's data.
       char* buf = msg__->BeginWriteData(count);
       nsresult rv = NS_ReadInputStreamToString(stream, buf, count);
       if (NS_VAILED(rv) {
           return rv;
       }
       ...
   }

2) Modify IPC so when the parameter is a dependent or an adopt string, don't copy it to the pickle buffer. But based on Channel::ChannelImpl::ProcessOutgoingMessages I think this is impossible as WriteFile [1] and sendmsg [2] both take a single buffer.

[1] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/ipc/chromium/src/chrome/common/ipc_channel_win.cc#487-492
[2] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/ipc/chromium/src/chrome/common/ipc_channel_posix.cc#563,615-622
Ben, this bug is for eliminating one copying when send input stream through IPC, and seems you also wanted to avoid it in bug 1093357 [1].

Now I have three options: 

a. ParamTraits<nsIInputStream> from comment 3, and in comment 12 I found it's not straightforward.
b. Non-auto generated IPC Send*() implementation ((1) of comment 12), but this needs to be done for each sending function.
c. Modify IPC not to copy dependent or adopt string, I learnt this from your comment [1]. I guess this could be done by keeping an extra string in Message class, and call WriteFile/sendmsg two times to send the message and the string, not sure if I understand this correctly.

Do you have any other ideas or suggestions? Thanks.

[1] https://dxr.mozilla.org/mozilla-central/rev/c4449eab07d39e20ea315603f1b1863eeed7dcfe/ipc/glue/SendStreamChild.cpp#253-255
Flags: needinfo?(bkelly)
(In reply to Ting-Yu Chou [:ting] from comment #13)
> Ben, this bug is for eliminating one copying when send input stream through
> IPC, and seems you also wanted to avoid it in bug 1093357 [1].

Well, bug 1093357 would eliminate a copy that IPCStream does on the receiving side.  This bug seems to be about the sending side.

> Do you have any other ideas or suggestions? Thanks.

If the concern is peak memory usage then I think we should just plan to use IPCStream.  We can make it use the PSendStream actor with 32KB buffers.  The copy will still happen, but peak memory will not be kept at a more manageable level.

That being said, it would be nice if ipdl accepted a `const nsACString&` instead of requiring `const nsCString&`.  This would allow us to pass dependent strings to ipdl Send methods.  This would then let us convert this Read():

https://dxr.mozilla.org/mozilla-central/source/ipc/glue/SendStreamChild.cpp#276

To a ReadSegments() that passes a nsDependentCString() to PSendStream::SendBuffer().  We would probably need some amount extract logic to handle pathologically small or large amounts of Available() data in the stream, though.

Personally I think just keeping the copy and moving to 32KB buffers would be enough for now.  We can optimize that out later.
Flags: needinfo?(bkelly)
Another short term option would be to call SendOnTransportAndData() multiple times here.  Something like:

NS_IMETHODIMP
HttpChannelParent::OnDataAvailable(nsIRequest *aRequest,
                                   nsISupports *aContext,
                                   nsIInputStream *aInputStream,
                                   uint64_t aOffset,
                                   uint32_t aCount)
{
  while (aCount > 0) {
    static const uint32_t kMaxToRead = 32 * 1024;
    uint32_t numToSend = std::min<uint32_t>(aCount, kMaxToRead);

    nsCString data;
    nsresult rv = NS_ReadInputStreamToString(aInputStream, data, numToSend);
    if (NS_FAILED(rv)) {
      return rv;
    }
    MOZ_ASSERT(data.Length() == numToSend);

    nsresult channelStatus = NS_OK;
    mChannel->GetStatus(&channelStatus);

    if (mIPCClosed || !SendOnTransportAndData(channelStatus, mStoredStatus,
                                              mStoredProgress, mStoredProgressMax,
                                              data, aOffset, numToSend)) {
      return NS_ERROR_UNEXPECTED;
    }
    aCount -= numToSend;
  }

  return NS_OK;
}
I guess you might need to update aOffset as well and possibly the progress values.
(In reply to Ben Kelly [:bkelly] from comment #15)
> Another short term option would be to call SendOnTransportAndData() multiple
> times here.  Something like:

Bug 1263028 did this.
(In reply to Ben Kelly [:bkelly] from comment #14)
> That being said, it would be nice if ipdl accepted a `const nsACString&`
> instead of requiring `const nsCString&`.  This would allow us to pass
> dependent strings to ipdl Send methods.  This would then let us convert this
> Read():
> 
> https://dxr.mozilla.org/mozilla-central/source/ipc/glue/SendStreamChild.
> cpp#276
> 
> To a ReadSegments() that passes a nsDependentCString() to
> PSendStream::SendBuffer().  We would probably need some amount extract logic
> to handle pathologically small or large amounts of Available() data in the
> stream, though.

I am not sure I understand this part.

nsDependentCString is a subclass of nsCString, why do we need to change the ipdl to accept a `const nsACString&` instead? BTW, ReadSegments() is not implemented for all the input stream types, for instance nsSocketInputStream::ReadSegments().
Flags: needinfo?(bkelly)
Per comment 9 and comment 14, it'd be more reasonable to switch to IPCStream at first and then optimize the copy out.
Depends on: 1274343
(In reply to Ting-Yu Chou [:ting] from comment #18)
> nsDependentCString is a subclass of nsCString, why do we need to change the
> ipdl to accept a `const nsACString&` instead?

Well, nsDependentCString must be null terminated.  I don't believe ReadSegments() provides null terminated buffers. So we would really need to pass nsDependentCSubstring which requires nsACString.

> BTW, ReadSegments() is not
> implemented for all the input stream types, for instance
> nsSocketInputStream::ReadSegments().

True, we could fall back to our current Read() approach in that case.
Flags: needinfo?(bkelly)
Whiteboard: [necko-backlog] → [necko-backlog], e10st?
According to comment 6, comment 9, and comment 14, I don't think this is still a P1, but a nice to have.
Assignee: janus926 → nobody
Priority: P1 → P4
Whiteboard: [necko-backlog], e10st? → [necko-backlog]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P4 → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.