Closed Bug 1277681 Opened 8 years ago Closed 8 years ago

Large IPC message with PNeckoChild::SendPHttpChannelConstructor called from ContinueAsyncOpen()

Categories

(Core :: Networking: HTTP, defect, P1)


Tracking


RESOLVED FIXED
mozilla52
Tracking Status
e10s + ---
firefox49 --- affected
firefox50 --- affected
firefox51 --- fixed
firefox52 --- fixed

People

(Reporter: mccr8, Assigned: jdm)

References

Details

(Whiteboard: [necko-active])

Crash Data

Attachments

(1 file, 3 obsolete files)

At least for the last few days, this looks like the top source of bug 1271102 crashes, after all of the various drag-and-drop crashes. These ContinueAsyncOpen() calls all seem to come from HttpChannelChild::ResetInterception(). The crashes look like they might all be from a single user who is trying to use some kind of mail.ru cloud storage site.

I'm not really sure what could be so large in this message, outside of the URLs, but these messages are really big: one is over 900MB.

The call to SerializeInputStream() in there does jump out at me as something that could end up producing quite a bit of data, but maybe not.

Example crashes:
https://crash-stats.mozilla.com/report/index/c2293fa7-fb10-4ce3-9a44-7b7f02160602
https://crash-stats.mozilla.com/report/index/1fcb692a-41fb-4ac4-afb5-99fcd2160602
https://crash-stats.mozilla.com/report/index/0527df79-19a4-4cef-bd17-824042160530
https://crash-stats.mozilla.com/report/index/e4dc064b-1920-4f26-b7a0-c4d442160527
So someone is doing large uploads on a site with a service worker registered?  The ResetInterception implies service workers are in play.
If you want to poke around, the URLs I saw were from cloud.mail.ru specifically.
Probably not related, but you might want to have a quick glance at https://bugzilla.mozilla.org/show_bug.cgi?id=1277582

3 of the 4 stacks I looked at in that bug had ResetInterception() on the stack.
Josh: Honza suggested you're the best person to look at this.  Can you take it?
Flags: needinfo?(josh)
I can reproduce this on cloud.mail.ru by uploading a 150MB file.
Assignee: nobody → josh
Flags: needinfo?(josh)
Is this something Ben Kelly's IPC pipe could fix?
It's actually the original case I broke IPCStream out of dom/cache for. I discovered, however, that necko expects a fixed-length upload stream in order to set the Content-Length via an Available() call. AFAIK we don't support chunked-encoding upload streams.

So to use IPCStream here would require either:

1) an extra async step on the parent side to accumulate to a fixed-size buffer
2) implementing chunked uploads (which I think Patrick said is only really feasible in H2)
To record what I've dug up so far:
* the original upload stream that the XHR is being provided is a multiplex stream containing a string input stream, a buffered input stream, and another string input stream
* the buffered input stream wraps a multiplex stream that contains a single stream
* HttpBaseChannel::EnsureUploadStreamIsCloneable calls NS_InputStreamIsCloneable, which calls nsMultiplexInputStream::GetCloneable, which fails to obtain a cloneable input stream for the buffered input stream
* HttpBaseChannel::EnsureUploadStreamIsCloneable falls back to creating a storage stream that contains the combined contents of the three streams in the original upload stream (which is 159MB in my test case)
* The storage stream is serialized into an nsCString via SerializeInputStream, which causes us to blow through the IPC message limit

Ben, given that the buffered input stream (presumably) wraps the file I'm loading, is there anything we can do here to not serialize the contents of the file over IPC?
Whiteboard: [necko-active]
Priority: -- → P1
We're going to increase the release assert limit in the parent. However this seems bad regardless and should get fixed.
I think the clearest path forward for fixing this is to find a way to split the body across an arbitrary number of messages and accumulate it in the parent before opening the channel.
Whiteboard: [necko-active] → [necko-active][e10st?]
jdm: are you going to have time to work on this soon?  Should I find a new owner?
Flags: needinfo?(josh)
I totally forgot about this. If there is someone else who could own it, that would be great.
Flags: needinfo?(josh)
I'm going to try writing a patch today.
Whiteboard: [necko-active][e10st?] → [necko-active]
WIP
Comment on attachment 8779433 [details] [diff] [review]
Send HTTP request bodies in chunks

This patch makes uploading a 384MB PDF to cloud.mail.ru work; in Nightly the browser goes completely unresponsive and shows the e10s spinner forever. Ben, could you do a sanity check here? I'm still not totally confident about how to determine whether an async copy is necessary on the deserialized IPCStream, so right now we always do one. That seems like it could be more inefficient than necessary.
Attachment #8779433 - Flags: feedback?(bkelly)
Re-nomming for e10s triage. It's starting to sound like we have a ceiling on uploads with e10s enabled that we don't have with e10s disabled, and that we're likely to crash if we hit that limit.
Comment on attachment 8779433 [details] [diff] [review]
Send HTTP request bodies in chunks

Review of attachment 8779433 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is the right idea.  I agree we have room for more optimization in the future.

I'm going to file a few follow-up bugs for stream improvements that would help here.

::: netwerk/ipc/NeckoChannelParams.ipdlh
@@ +96,5 @@
>    OptionalURIParams           topWindowURI;
>    uint32_t                    loadFlags;
>    RequestHeaderTuples         requestHeaders;
>    nsCString                   requestMethod;
> +  OptionalIPCStream           uploadStream;

You should be able to remove this as well:

OptionalFileDescriptorSet   fds;

::: netwerk/protocol/http/HttpBaseChannel.cpp
@@ +697,5 @@
>    nsCOMPtr<nsIInputStream> source;
>    if (NS_InputStreamIsBuffered(mUploadStream)) {
>      source = mUploadStream;
>    } else {
> +    nsresult rv = NS_NewBufferedInputStream(getter_AddRefs(source), mUploadStream, 4096);

Since PSendStream uses 32kb transfers it might be a slight optimization to use 32kb here, in the pipe, and in NS_AsyncCopy() below.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +1874,5 @@
>    mRequestHead.Method(openArgs.requestMethod());
>  
> +  AutoIPCStream autoStream(openArgs.uploadStream());
> +  if (mUploadStream) {
> +    autoStream.Serialize(mUploadStream, gNeckoChild->Manager());

Call autoStream.TakeOptionalValue() here.

@@ +1898,2 @@
>      optionalFDs = mozilla::void_t();
> +    /*} else if (fds.Length() <= kMaxFileDescriptorsPerMessage) {

You should be able to remove all of this file descriptor related code since it's handled by AutoIPCStream.

@@ +1987,5 @@
>                                                  openArgs)) {
>      return NS_ERROR_FAILURE;
>    }
>  
> +  autoStream.TakeOptionalValue();

This should be called above where the autoStream is actually used.

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +236,5 @@
> +    Unused << SendFailedAsyncOpen(rv);
> +    return;
> +  }
> +
> +  if (mEnforceSecurity) {

Instead of a new boolean flag, I think you can just get the loadInfo out of the channel to call GetEnforceSecurity().  I think that would be slightly better since we don't have state that can get out of sync.

@@ +269,5 @@
> +};
> +
> +void
> +UploadCopyComplete(void* aClosure, nsresult aStatus) {
> +  // Called on the STS thread by NS_AsyncCopy

MOZ_ASSERT() that it's not the main thread?

@@ +274,5 @@
> +  RefPtr<UploadStreamClosure> closure = static_cast<UploadStreamClosure*>(aClosure);
> +  closure->mStatus = aStatus;
> +  NS_DispatchToMainThread(closure);
> +}
> +}

nit: // anonymous namespace

@@ +437,5 @@
>    if (stream) {
> +    delayAsyncOpen = true;
> +
> +    nsCOMPtr<nsIStorageStream> storageStream;
> +    nsresult rv = NS_NewStorageStream(4096, UINT32_MAX,

nit: It might be a slight optimization to use 32kb for the storage stream segment size and the NS_AsyncCopy() below.  That is the amount of data PSendStream tries to send across the IPC layer in each call.

@@ +456,5 @@
> +      return SendFailedAsyncOpen(rv);
> +    }
> +
> +    nsCOMPtr<nsIEventTarget> target =
> +        do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);

I think this can fail during shutdown.  We should probably verify the target exists. 

Or maybe it can't?  I now see we don't check in the child either.  Probably worth researching, though.

@@ +462,5 @@
> +    RefPtr<UploadStreamClosure> closure(new UploadStreamClosure(this));
> +
> +    rv = NS_AsyncCopy(stream, sink, target, NS_ASYNCCOPY_VIA_READSEGMENTS,
> +                      4096, // copy segment size
> +                      UploadCopyComplete, closure.forget().take());

I agree it would be nice if we could avoid this, but I don't think our stream primitives expose the information necessary.  Really we want to determine if the source stream fits one of:

1) Stream is already complete.  This could be a fixed length string stream or it could be a variable length stream where we know no more data will be written.  For example a storage or pipe where the nsIOutputStream has been closed.  File streams might also fall into this category since we know their total length.
2) Stream is incomplete and is expected to complete even if it's never drained.  This would be a storage stream or a pipe with infinite capacity where the nsIOutputStream has not been closed yet.
3) Stream is incomplete, but it may not complete unless bytes are read out of it.  This is a pipe with limited capacity.

For cases (1) and (2) we should have a simple "on stream complete" callback.  This lets you know when all the data is available, but you don't have to read the data out yet.  Then you could pass the stream straight to InternalSetUploadStream() and call UploadCopyComplete() when the stream completes.

For case (3) we probably still need to perform the NS_AsyncCopy().  Otherwise the input stream would stall.

I'll file a bug to implement something like this.
Attachment #8779433 - Flags: feedback?(bkelly) → feedback+
Also note that by buffering this in the parent we are risking taking down the entire browser instead of just crashing one content process.

We either need to figure out a way to really stream directly to the network socket or we should be writing large uploads to a temp file in order to accumulate them.
Comment on attachment 8779433 [details] [diff] [review]
Send HTTP request bodies in chunks

Review of attachment 8779433 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +462,5 @@
> +    RefPtr<UploadStreamClosure> closure(new UploadStreamClosure(this));
> +
> +    rv = NS_AsyncCopy(stream, sink, target, NS_ASYNCCOPY_VIA_READSEGMENTS,
> +                      4096, // copy segment size
> +                      UploadCopyComplete, closure.forget().take());

In the meantime you can probably infer that the stream is fixed length or complete if it QIs to nsIIPCSerializableInputStream.
Also, I'm fairly certain you can completely remove EnsureUploadStreamIsCloneable() if you make this use the 3-argument form of NS_CloneInputStream():

https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/HttpBaseChannel.cpp#772

Your modified EnsureUploadStreamIsCloneable() basically does the same thing as that now anyway.
(In reply to Ben Kelly [:bkelly] from comment #20)
> Also, I'm fairly certain you can completely remove
> EnsureUploadStreamIsCloneable() if you make this use the 3-argument form of
> NS_CloneInputStream():
> 
> https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/
> HttpBaseChannel.cpp#772
> 
> Your modified EnsureUploadStreamIsCloneable() basically does the same thing
> as that now anyway.

EnsureUploadStreamIsCloneable has a runnable that executes either synchronously or once the stream has finished being cloned, so I'm going to defer that to a followup.
Rebased and comments addressed.
Attachment #8779433 - Attachment is obsolete: true
Ben, the test_file_blob_upload.html shows a case where we redirect a channel that is now using a pipe as the upload stream. SetupRedirectChannel rewinds the upload stream using Seek, which nsPipeInputStream doesn't support. Any ideas what we can do here?
Flags: needinfo?(bkelly)
I think we discussed in IRC possibly keeping the original seekable stream and changing AutoIPCStream to still use PSendStream for large streams (bug 1294450).
Flags: needinfo?(bkelly)
Attachment #8792615 - Attachment is obsolete: true
Attachment #8795047 - Attachment is obsolete: true
Comment on attachment 8795919 [details] [diff] [review]
Send HTTP request bodies in chunks

Instead of restricting HTTP upload channels to channel types that are serializable (thus forcing the full stream contents to be serialized as part of the AsyncOpen IPDL message), this patch uses the OptionalIPCStream which allows including PSendStream actors which can break the stream into separate serialized pieces. In conjunction with bug 1294450, this means that HTTP upload streams over 1MB will be streamed to the parent in pieces, avoiding the IPDL message limit crash.

The difficulty here is that the HTTP code requires the full upload stream contents to be available, so we need to delay the rest of AsyncOpen in the parent until we have received the full stream. This patch chooses to accumulate the pieces in a storage stream in the parent, so the full upload stream will have to reside in memory in the parent process before the request can continue.
Attachment #8795919 - Flags: review?(honzab.moz)
Crash Signature: [@ mozilla::ipc::ProcessLink::SendMessageW ]
See Also: → 1289001
(In reply to Josh Matthews [:jdm] from comment #31)
> Comment on attachment 8795919 [details] [diff] [review]
> Send HTTP request bodies in chunks
> 
> Instead of restricting HTTP upload channels to channel types that are

Did you want to say "HTTP upload channels to *stream* types" ?

> serializable (thus forcing the full stream contents to be serialized as part
> of the AsyncOpen IPDL message), this patch uses the OptionalIPCStream which
> allows including PSendStream actors which can break the stream into separate
> serialized pieces. In conjunction with bug 1294450, this means that HTTP
> upload streams over 1MB will be streamed to the parent in pieces, avoiding
> the IPDL message limit crash.
> 
> The difficulty here is that the HTTP code requires the full upload stream
> contents to be available, so we need to delay the rest of AsyncOpen in the
> parent until we have received the full stream. 

Hmm... we create an nsIMultiplexInputStream from the headers and the content wrapped in a BufferedInputStream.  None of them are async, but they could be turned async with some effort.  The transaction, when sending the data out, is capable of waiting for the request stream if it implements nsIAsyncInputStream.

> This patch chooses to
> accumulate the pieces in a storage stream in the parent, so the full upload
> stream will have to reside in memory in the parent process before the
> request can continue.

Bad... but better than before.
I'll review the patch as is, the previous comment is more for FYI.
Crash volume for signature 'mozilla::ipc::ProcessLink::SendMessageW':
 - nightly (version 52): 96 crashes from 2016-09-19.
 - aurora  (version 51): 194 crashes from 2016-09-19.
 - beta    (version 50): 184 crashes from 2016-09-20.
 - release (version 49): 353 crashes from 2016-09-05.
 - esr     (version 45): 0 crashes from 2016-06-01.

Crash volume on the last weeks (Week N is from 10-03 to 10-09):
            W. N-1  W. N-2
 - nightly      68      28
 - aurora      165      29
 - beta        152      32
 - release     236     117
 - esr           0       0

Affected platform: Windows

Crash rank on the last 7 days:
           Browser   Content     Plugin
 - nightly #1115     #24
 - aurora  #1684     #11
 - beta    #11956    #18
 - release #498      #98
 - esr
Comment on attachment 8795919 [details] [diff] [review]
Send HTTP request bodies in chunks

Review of attachment 8795919 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry it took so long.  I was at a meeting for a week.

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +251,5 @@
> +    rv = mChannel->AsyncOpen(mParentListener, nullptr);
> +  }
> +  if (NS_FAILED(rv)) {
> +    Unused << SendFailedAsyncOpen(rv);
> +    return;

no need for return

@@ +287,5 @@
> +};
> +
> +void
> +UploadCopyComplete(void* aClosure, nsresult aStatus) {
> +  // Called on the STS thread by NS_AsyncCopy

please expand STS as Stream Transport Service (it can be ambiguous with socket transport thread).
Attachment #8795919 - Flags: review?(honzab.moz) → review+
Crash Signature: [@ mozilla::ipc::ProcessLink::SendMessageW ] → [@ mozilla::ipc::ProcessLink::SendMessageW ] [@ mozilla::ipc::ProcessLink::SendMessageW | IPC_Message_Name=PNecko::Msg_PHttpChannelConstructor ]
https://hg.mozilla.org/mozilla-central/rev/39d9931967d4
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Hi :jdm, could you please nominate uplift to Aurora51?
Flags: needinfo?(josh)
Comment on attachment 8795919 [details] [diff] [review]
Send HTTP request bodies in chunks

Approval Request Comment
[Feature/regressing bug #]: e10s
[User impact if declined]: If this patch or the patch in bug 1294450 is not uplifted, it will remain impossible to upload files larger than 128MB because the content process will crash.
[Describe test coverage new/current, TreeHerder]: There's a lot of prior test coverage for HTTP request bodies that did not regress.
[Risks and why]: The biggest risk is that large uploads now need to be buffered entirely in memory in the parent process before an HTTP request can be initiated. This means that if we run out of memory the entire browser could crash, in contrast to the current situation where the content process always crashes after hitting the 128MB limit for a single HTTP upload.
[String/UUID change made/needed]: None.
Flags: needinfo?(josh)
Attachment #8795919 - Flags: approval-mozilla-aurora?
Comment on attachment 8795919 [details] [diff] [review]
Send HTTP request bodies in chunks

Avoid large IPC crash. Take the patch together with the patch of bug 1294450 in 51 aurora.
Attachment #8795919 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+