Closed Bug 1595578 Opened 11 months ago Closed 10 months ago

Multipart POSTs with non-binary data > ~1MB crash the current tab

Categories

(Core :: Networking: HTTP, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla72
Fission Milestone M5
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- unaffected
firefox72 --- fixed

People

(Reporter: mao, Assigned: mattwoodrow)

References

(Regression, )

Details

(Keywords: regression, Whiteboard: [necko-triaged])

Crash Data

Attachments

(3 files)

+++ This bug was initially created as a clone of Bug #1595513 +++

Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
Version: 72.0a1
Build ID: 20191111100226

Steps to reproduce:

  1. Attempt a multipart form submission whose non-binary (non file) data amounts to more than 1MB (e.g. by running the attached server.js NodeJS test and browsing it at https://localhost:5000 )

Actual results:

The tab usually crashes, see https://crash-stats.mozilla.org/report/index/9675a718-8e79-4937-8959-db1780191111.

Expected results:

The submission should complete normally.

mozregression result

7:42.18 INFO: No more inbound revisions, bisection finished.
7:42.18 INFO: Last good revision: f748a3d2cdf108e9443fd15332efe477c7c398a9
7:42.18 INFO: First bad revision: 25b533bff4051e6a8177bbc83eed49ac460e109e
7:42.18 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=f748a3d2cdf108e9443fd15332efe477c7c398a9&tochange=25b533bff4051e6a8177bbc83eed49ac460e109e

which indicates document channel

Flags: needinfo?(matt.woodrow)
Assignee: nobody → matt.woodrow
Flags: needinfo?(matt.woodrow)

Trying to reproduce this, I get basically the same crash, but in the parent process instead :(

Baku, this looks like we crash trying to deserialize an nsIInputStream member from IPDL.

The linked crash is deserializing the one in ReplacementChannelConfigInit, my one is from DocShellLoadStateInit.

What is the expected handling when these streams are greater than 1mb?

Flags: needinfo?(amarchesini)

It looks like IPCStreamSource::DoRead calls SendBuffer 33 times (32kb per buffer, seems roughly right), and then we get IPCStreamSource::OnEnd(NS_BASE_STREAM_CLOSED) which calls SendClose(NS_OK).

IPCStreamDestination::CloseReceived calls Send__delete__ which deletes the actor, and then our IPDL message referencing this stream fails to deserialize the actor reference.

Talking about inputStreams, in the content process, the test creates a stream like this:
nsMIMEInputStream ( nsMultiplexInputStream ( nsStringInputStream[data=1mb] ))

This stream is serialized and sent to the parent process. Because the steam length is greater than 1mb, the deserialized stream is:
InputStreamLengthWrapper ( nsPipeInputStream )
The InputStreamLengthWrapper is a helper that exposes the total length of the stream without reading it. The pipe is used to receive data from the content process to the parent one.

Then, because of your ReplacementChannelConfigInit, the stream is sent back to the content process. First question: why? Can you tell me more about what ReplacementChannelConfigInit is used for?
The serialized inputStream is: PIPCRemoteStreamParams( PParentToChildStreamParent )
Now we use a Parent-to-child stream to send a stream, already owned by the content process. Even if it works, we have a double pipe:

  1. the parent reading data from the content
  2. that data is sent back to the content process.

The crash happens because the deserialization finds a null PParentToChildStream actor in the content process. I still don't know why this happens. But I hope this first comment helps.

Flags: needinfo?(amarchesini)

(In reply to Andrea Marchesini [:baku] from comment #4)

Then, because of your ReplacementChannelConfigInit, the stream is sent back to the content process. First question: why? Can you tell me more about what ReplacementChannelConfigInit is used for?
The serialized inputStream is: PIPCRemoteStreamParams( PParentToChildStreamParent )
Now we use a Parent-to-child stream to send a stream, already owned by the content process. Even if it works, we have a double pipe:

  1. the parent reading data from the content
  2. that data is sent back to the content process.

The data is first sent to the parent as part of the docshell load state, where we use it to create and configure an nsHttpChannel.

Once we get a response (and have finished handling any redirects), we pick which process should load the final channel, and setup a HttpChannelParent/HttpChannelChild to forward to the right content proccess.

ReplacementChannelConfigInit is a copy of all the data needed from the final 'real' nsHttpChannel in the parent to the new HttpChannelChild in the content process (but it's also used for nsHttpChannel -> nsHttpChannel redirects, so that we can use the same code for all 'configure a replacement channel' operations).

In the case where the initiating docshell, and the docshell that will actually load the document are different (very common once fission is enabled), then this isn't really a double pipe, it'll be required.

We do have the option to optimize the case where we are sending back to the same process, by adding code to configure the new HttpChannelChild by reading known values from the DocumentChannelChild or DocShellLoadState. That's a nice to have, but any issues it fixes will likely just regress again once we enable fission.

Anyway, for my locally I get the crash just on the first pipe, so fixing the double pipe won't be sufficient.

So, It think this is a bug in IPDLParamTraits<nsIInputStream*>::Write.

There's an AutoIPCStream object there to create the IPDL stream actors, but the object's scope is just the local function.

The AutoIPCStream destructor calls ActivateAndCleanupIPCStream, which serializes all the data and then calls SendClose.

Looking at the IPDL traffic, I can see all the messages for the PChildToParentStream being sent (including SendClose) before we send the message that references the stream (PNecko::Msg_PDocumentChannelConstructor).

Callers that use AutoIPCStream manually ensure that the lifetime of this object is long enough so that the destructor is called after the message that uses the stream, but this isn't the case for IPDLParamTraits.

Maybe we should defer ActivateAndCleanupIPCStream to the end of the current IPDL 'task' somehow?

Whiteboard: [necko-triaged]
Fission Milestone: --- → M5
Attached patch Very WIP fixSplinter Review

How would you feel abut something like this (very much WIP at this point).

Attachment #9109873 - Flags: feedback?(amarchesini)
Duplicate of this bug: 1596282
Duplicate of this bug: 1596789
Attachment #9109873 - Flags: feedback?(jed+bmo)
Attachment #9109873 - Flags: feedback?(jed+bmo) → feedback?(jld)
Blocks: 1596682
No longer blocks: document-channel

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/36219303ac01
Don't release the AutoIPCStream immediately when called from IPDLParamTraits, since we need it to outlive the message currently being serialized. r=baku,kmag
Duplicate of this bug: 1599861
Status: NEW → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
Comment on attachment 9109873 [details] [diff] [review]
Very WIP fix

Review of attachment 9109873 [details] [diff] [review]:
-----------------------------------------------------------------

We don't need this patch anymore.
Attachment #9109873 - Flags: feedback?(amarchesini)
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.