Closed Bug 1333899 Opened 3 years ago Closed 2 years ago

nsDataChannel::OpenContentStream uses about 3.5X transient memory

Categories

(Core :: Networking, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: erahm, Assigned: erahm)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2][necko-triaged])

Attachments

(3 files, 9 obsolete files)

3.46 KB, patch
erahm
: review+
Details | Diff | Splinter Review
9.57 KB, patch
erahm
: review+
Details | Diff | Splinter Review
4.55 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
|nsDataChannel::OpenContentStream| uses about 3.5X the memory of the channel's URI  
when setting up the input stream.

#1 - We get the ascii spec which does a full copy of URI's data
#2 - We parse the URI, which again does essentially a fully copy of the data portion of the URI
#3 - We base64 decode the data into a new buffer so that's another 3/4 of the URI's data size
#4 - We fully copy that data into the output stream's buffer

So we have sizeof(URI) + sizeof(URI) + 3/4 sizeof(URI) + 3/4 sizeof(URI)

This can be rather large when we're dealing with data URIs (think photo upload sites, etc). We've seen this function show up in OOM large reports.

Possible mitigations:
  - Don't get the full spec, just the data portion of the URI (maybe that's path?)
  - Don't require a full copy from the URI, we only want read-only acccess
  - Again don't do a full copy of the data buffer in ParseURI, just return a range?
  - Perhaps provide a "streaming" base64 decoder that just wraps the URI rather than a pipe

[1] http://searchfox.org/mozilla-central/rev/bf98cd4315b5efa1b28831001ad27d54df7bbb68/netwerk/protocol/data/nsDataChannel.cpp#21
So a few things:

1) Just getting the data portion of the URI may not help that much.  The reason is that we do still need the escaping behavior if there's non-ASCII in there, so need to copy.  That said, maybe we could scan for non-ascii and skip copying if there isn't any.  I don't see a GetAsciiPath offhand....
2) It's a bit silly that we first concatenate mPath and mQuery and mRef, and then parse out the mRef.  Ideally we'd have a thing that returned just the mPath+mQuery.  How often that needs to copy depends on how often '?' appears in data: URIs.  For the base64-encoded case, that's never, of course.
3) Unfortunately, we don't just want read-only access, because of the escaping thing.  Of course then nsDataChannel::OpenContentStream goes and unescapes.  So maybe we don't need the escaping after all?
4) We can't rely on the URI not changing under us while the channel is being read from.  So we do need to give the channel/stream its own copy of the data in some way.  It could be shared and copied on write, of course.
5) It's not obvious to me why we're using a pipe, apart maybe from the ability to drop data as it's partially-read.  If we didn't care about that, we could probably use a string input stream that either adopts the data or shares it (leaving it owned by the channel).  That would eliminate the need to do yet another copy to set up the stream.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #1)
> 5) It's not obvious to me why we're using a pipe, apart maybe from the
> ability to drop data as it's partially-read.  If we didn't care about that,
> we could probably use a string input stream that either adopts the data or
> shares it (leaving it owned by the channel).  That would eliminate the need
> to do yet another copy to set up the stream.

Pipes use segmented buffers internally.  Switching this to a string would lead to a potentially large contiguous allocation vs many smaller allocations.  Maybe that's not too relevant given what else is going on here, but its generally not good to hold things that can be arbitrarily large in a single contiguous allocation.
> Switching this to a string would lead to a potentially large contiguous allocation vs many smaller allocations.

Well, right, unless we just had the stream share or adopt an allocation we've _already_ made.  Without doing that, switching away from pipe doesn't make sense for sure.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #1)
> So a few things:
> 
> 1) Just getting the data portion of the URI may not help that much.  The
> reason is that we do still need the escaping behavior if there's non-ASCII
> in there, so need to copy.  That said, maybe we could scan for non-ascii and
> skip copying if there isn't any.  I don't see a GetAsciiPath offhand....
> 2) It's a bit silly that we first concatenate mPath and mQuery and mRef, and
> then parse out the mRef.  Ideally we'd have a thing that returned just the
> mPath+mQuery.  How often that needs to copy depends on how often '?' appears
> in data: URIs.  For the base64-encoded case, that's never, of course.
> 3) Unfortunately, we don't just want read-only access, because of the
> escaping thing.  Of course then nsDataChannel::OpenContentStream goes and
> unescapes.  So maybe we don't need the escaping after all?

Okay so really we don't want |GetAsciiSpec| at all, we want |GetPathIgnoringRef| which doesn't exist, but could similar to |GetSpecIgnoringRef|. I agree given that we're escaping in |GetAsciiSpec| and unescaping in |OpenContentStream| it seems silly that we're doing this round-trip. In the base64 case we'd just get a ref to the |mPath| var instead of a copy.

We could update/split up |DataHandler::ParseURI| to handle just the path portion (it's only used in 2 places), so again we could avoid copies by just dealing with ranges for the data buffer portion.

We could then pump that into the base64 decoder and decode into a new buffer, in the non-base64 case we'd just have the string from |GetSpecIgnoringRef| which is possibly just a ref to the URI's |mPath|.

> 4) We can't rely on the URI not changing under us while the channel is being
> read from.  So we do need to give the channel/stream its own copy of the
> data in some way.  It could be shared and copied on write, of course.
> 5) It's not obvious to me why we're using a pipe, apart maybe from the
> ability to drop data as it's partially-read.  If we didn't care about that,
> we could probably use a string input stream that either adopts the data or
> shares it (leaving it owned by the channel).  That would eliminate the need
> to do yet another copy to set up the stream.

The string input stream adopting our decoded buffer seems like the best way to go. So now we're looking at 3/4 sizeof(URI) rather than 3.5. Pretty nice!
I think I have enough to go on, I'll put together some WIP patches.
Assignee: nobody → erahm
> I agree given that we're escaping in |GetAsciiSpec| and unescaping in |OpenContentStream| it seems silly that we're doing this round-trip.

Just to be clear, we _have_ to keep unescaping in OpenContentStream, because the URI may contain already-escaped things to start with.  The only thing we can skip is the escaping, because it only escapes non-ascii and hence can't affect the parsing of the URL (because all the things the parsing cares about are ASCII).
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #6)
> > I agree given that we're escaping in |GetAsciiSpec| and unescaping in |OpenContentStream| it seems silly that we're doing this round-trip.
> 
> Just to be clear, we _have_ to keep unescaping in OpenContentStream, because
> the URI may contain already-escaped things to start with.  The only thing we
> can skip is the escaping, because it only escapes non-ascii and hence can't
> affect the parsing of the URL (because all the things the parsing cares
> about are ASCII).

But if we don't use |GetAsciiSpec| and just |GetSpec| (or whatever replacement) instead it won't be escaped to start with right?
> it won't be escaped to start with right?

It can be.  It's clearly documented so in nsIURI.idl:

     * Some characters may be escaped.

Here's a simple testcase:

  data:text/html,<a%20href="http://example.com">link</a>

in which I'm quite sure the actual thing stored in the URI impl will have the "%20", not a space character.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #8)
> > it won't be escaped to start with right?
> 
> It can be.  It's clearly documented so in nsIURI.idl:
> 
>      * Some characters may be escaped.
> 
> Here's a simple testcase:
> 
>   data:text/html,<a%20href="http://example.com">link</a>
> 
> in which I'm quite sure the actual thing stored in the URI impl will have
> the "%20", not a space character.

I see I assumed the % was escaped when |NS_URLEscape| was called when originally setting the schem, but I guess not [1].

[1] http://searchfox.org/mozilla-central/source/xpcom/io/nsEscape.cpp#412-416
This adds a new attribute that can be used to avoid copying the path portion
of a URI when the ref is not needed.

There's still a TODO for the rust implementation, I'm not sure if really need
to implement that or not.

MozReview-Commit-ID: 5s2uizyLPQe
Attachment #8831004 - Flags: review?(bzbarsky)
This adds a version of |ParseURI| that only deals with the path portion of the
URI and expects the ref not to be present. It's mainly copy and pasted from
|ParseURI| but updated to use nsString methods rather than NSPR.

Additionaly it only returns a substring to the path passed in for the data
buffer if requested.

|ParseURI| is updated to advance past the scheme and trim the ref, then it
just uses |ParseURIWithoutRef|.

MozReview-Commit-ID: F5h6bcSLd8P
Attachment #8831005 - Flags: review?(valentin.gosu)
This switches over to using the new |GetPathIgnoringRef| which will most likely
avoid a copy. The new |ParsePathWithoutRef| is used, again to avoid needing a
copy of the path substring.

A pipe is no longer used for the input stream, instead we use a string stream
which adopts the data buffer rather than copying it.

For the base64 case in the best case we only allocate the decoded buffer, worst
case there's another buffer for the URL decoded data. So either 3/4 transient
or 1 3/4 transient, both of which are better than 3.5.

There's still a TODO about whether or not we want to use an infallible version
of NS_URLEscape. It's an interesting tradeoff. If we preallocate the string
fallibly we can OOM gracefully but have more transient memory. If use the
infallible version we can avoid the transient memory but might OOM. Given
we've seen OOMs in here before it might make sense to preallocate.

MozReview-Commit-ID: 4tYI3iyxMCl
Attachment #8831006 - Flags: review?(bzbarsky)
Whiteboard: [MemShrink] → [MemShrink][necko-active]
Comment on attachment 8831004 [details] [diff] [review]
Part 1: Add |pathIgnoringRef| attribute to nsIURI

This could probably use review from a necko peer, not least because they might know something about the Rust situation, whereas I don't.

That said, it does look like RustURL::GetPathIgnoringRef is unlikely to have the efficiency properties we want...

>+++ b/netwerk/base/nsIURI.idl
>+    [must_use] readonly attribute AUTF8String pathIgnoringRef;

It might be worth documenting that this exists because it can be more efficient, in terms of string sharing, than doing .path and then truncating at the '#' position.  That will get people in the right mood when implementing it.

>+nsStandardURL::GetPathIgnoringRef(nsACString &result)

I didn't spend the time to dig into the URLSegment and whatnot bits here.  I _can_ do that if we don't find someone who actually knows that code to review, but it would be better to just have someone who knows it review this.

So r+ as far as it goes, but please do get review from a necko peer here!
Attachment #8831004 - Flags: review?(bzbarsky) → review+
Comment on attachment 8831004 [details] [diff] [review]
Part 1: Add |pathIgnoringRef| attribute to nsIURI

Valentin, can you look at this as well?
Attachment #8831004 - Flags: review+
Comment on attachment 8831005 [details] [diff] [review]
Part 2: Add nsDataHandler::ParsePathWithoutRef

>+        // TODO(ER): Currently can't just use a dependent substring because
>+        // they don't seem to have access to Find

See FindInReadable.

>    const char* roBuffer = (const char*) PL_strcasestr(spec.get(), "data:");

It might be worth it to just do everything here on ACStrings as opposed to going to const char* and then back to an nsDependentCSubstring.  Especially because this bit:

>+    nsDependentCSubstring pathWithoutRef(roBuffer,
>+                                         roHash ? roHash - roBuffer : -1);

end up having to scan the whole string to determine length, in the common !roHash case.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #13)
> Comment on attachment 8831004 [details] [diff] [review]
> Part 1: Add |pathIgnoringRef| attribute to nsIURI
> 
> This could probably use review from a necko peer, not least because they
> might know something about the Rust situation, whereas I don't.

I tagged Valentin as he's r? on part 2 as well.

> That said, it does look like RustURL::GetPathIgnoringRef is unlikely to have
> the efficiency properties we want...

Yeah, although data URIs use the |nsSimpleURI| implementation so I'm not super concerned. Hopefully we can just add a follow up to add this to the rust side.

> >+++ b/netwerk/base/nsIURI.idl
> >+    [must_use] readonly attribute AUTF8String pathIgnoringRef;
> 
> It might be worth documenting that this exists because it can be more
> efficient, in terms of string sharing, than doing .path and then truncating
> at the '#' position.  That will get people in the right mood when
> implementing it.

I'll add a note.

> >+nsStandardURL::GetPathIgnoringRef(nsACString &result)
> 
> I didn't spend the time to dig into the URLSegment and whatnot bits here.  I
> _can_ do that if we don't find someone who actually knows that code to
> review, but it would be better to just have someone who knows it review this.
> 
> So r+ as far as it goes, but please do get review from a necko peer here!

Lets see what Valentin thinks.
Attachment #8831004 - Flags: review+ → review?(valentin.gosu)
Comment on attachment 8831006 [details] [diff] [review]
Part 3: Avoid extra copies in nsDataChannel::OpenContentStream

>+    rv = nsDataHandler::ParsePathWithoutRef(path, contentType, &contentCharset,
>+                                 lBase64, &dataRange);

Please fix the indentation.

>+    // This will avoid a copy if nothing needs to to escaped.

You mean unescaped, right?

>+    // TODO(ER): But this is infallibe which is unfortunate, so the question

Yeah, that's not great.  That said, is this worse than what we have right now, which seems to do a bunch of copies, and kinda treats them all as infallible?

Worth a followup in any case.

>+    // is how often do we expect to escape large buffers.

"unescape large buffers".

>+    nsAutoCString escapedBuffer;

unescapedBuffer.

>+    bool escaped = NS_UnescapeURL(dataRange.Data(), dataRange.Length(),

"bool hadEscapes" perhaps?

>+    nsACString& data = dataRange;
>+    if (escaped) {
>+      data = escapedBuffer;
>+    }

I believe this is a complicated way of writing:

  if (escaped) {
    dataRange = escapedBuffer;
  }

which does in fact compile and work, I think, though morally speaking it really shouldn't.... and it's not clear to me whether you really meant to do that.

A possibly-better way to write this is like so:

  nsAutoCString unescapedBuffer;
  nsSubstring& data = NS_UnescapeURL(dataRange, /* flags = */ 0, unescapedBuffer);
  if (lBase64 && &data == &unescapedBuffer) {
    // We actually used unescapedBuffer, comments about spaces and whatnot here.
    // Stripping whitespace on either "data" or "unescapedBuffer" should have the
    // same effect inside this block, since they're the same object.
  }

>-        if (dataLen >= 1 && dataBuffer[dataLen-1] == '=') {

That's hilarious.  We left all this in when we switched to the nsACString version of Base64Decode?  :(

>+        rv = Base64Decode(data, decodedData);

It might be nice if we had a version of Base64Decode that decoded in-place.  That said, it wouldn't help here in the case when lBase64 and we didn't end up doing any unescaping (in that we'd need to make a copy in that case anyway), and it would involve a buffer that is only 3/4 full once we're done...  So it's probably not worth worrying about too much.

>+        rv = NS_NewCStringInputStream(getter_AddRefs(bufInStream), decodedData);
>+        rv = NS_NewCStringInputStream(getter_AddRefs(bufInStream), data);

Just so we're clear, these will _copy_ the data (at least conceptually), not adopt it.  Mostly this affects the commit message...

Now there are some cases in which these will be able to share the data instead.  Those are cases in which "data" or "decodedData" has a stringbuffer; in those cases they will do a copy-on-write share.  "decodedData" should have a stringbuffer, assuming it's at all long.  But for "data" the situation is more complicated.  If we ended up unescaping, it has a stringbuffer.  Otherwise, it comes from the ParsePathWithoutRef call, and is set up via a Rebind(), so doesn't have a stringbuffer.  So we'll end up making a copy here on the !lbase64 && !unescaped path.

That's probably OK, in the grand scheme of things.  It means we have one copy if !lbase64 && !unescaped, one copy (when unescaping) if !lbase64 && unescaped, one copy (when base64-decoding) if lbase64 && !unescaped, and two copies only when lbase64 && unescaped, which I kinda hope is the rare case anyway.  And even then, avoiding a copy would entail having a 4/3 too bit buffer hanging around...

Anyway, r=me with that above nits addressed and the commit message fixed to make it clear what the story is with the string stream.
Attachment #8831006 - Flags: review?(bzbarsky) → review+
You may want to file a followup to switch other consumers of GetPath that don't want the ref (e.g. nsUrlClassifierUtils::GetKeyForURI) to your new API.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #15)
> Comment on attachment 8831005 [details] [diff] [review]
> Part 2: Add nsDataHandler::ParsePathWithoutRef
> 
> >+        // TODO(ER): Currently can't just use a dependent substring because
> >+        // they don't seem to have access to Find
> 
> See FindInReadable.

I'll switch over to that, although it's a bit more verbose.

> 
> >    const char* roBuffer = (const char*) PL_strcasestr(spec.get(), "data:");
> 
> It might be worth it to just do everything here on ACStrings as opposed to
> going to const char* and then back to an nsDependentCSubstring.  Especially
> because this bit:
> 
> >+    nsDependentCSubstring pathWithoutRef(roBuffer,
> >+                                         roHash ? roHash - roBuffer : -1);
> 
> end up having to scan the whole string to determine length, in the common
> !roHash case.

Good point! I guess I'll take a bit more blame.
Updated per bz's suggestions.
Attachment #8831385 - Flags: review?(valentin.gosu)
Attachment #8831005 - Attachment is obsolete: true
Attachment #8831005 - Flags: review?(valentin.gosu)
In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #17)
> Comment on attachment 8831006 [details] [diff] [review]
> Part 3: Avoid extra copies in nsDataChannel::OpenContentStream
> 
> >+    rv = nsDataHandler::ParsePathWithoutRef(path, contentType, &contentCharset,
> >+                                 lBase64, &dataRange);
> 
> Please fix the indentation.

Will update.
 
> >+    // This will avoid a copy if nothing needs to to escaped.
> 
> You mean unescaped, right?

Yes, I'll update the rest.

> >+    // TODO(ER): But this is infallibe which is unfortunate, so the question
> 
> Yeah, that's not great.  That said, is this worse than what we have right
> now, which seems to do a bunch of copies, and kinda treats them all as
> infallible?

I think everything used to be fallible: the copy in |GetAsciiSpec| was recently made fallible, the copy in |ParseURI| is fallible, the |NS_UnescapeUrl(nsCString)| version is in-place so no copy, |StripWhitespace| is in-place, |Base64Decode| is fallible, the pipe write is fallible.

So strictly speaking it's worse in the needs unescape case, but we're also not using as much memory so...

> Worth a followup in any case.
> 
> >+    // is how often do we expect to escape large buffers.
> 
> "unescape large buffers".
> 
> >+    nsAutoCString escapedBuffer;
> 
> unescapedBuffer.
> 
> >+    bool escaped = NS_UnescapeURL(dataRange.Data(), dataRange.Length(),
> 
> "bool hadEscapes" perhaps?
> 
> >+    nsACString& data = dataRange;
> >+    if (escaped) {
> >+      data = escapedBuffer;
> >+    }
> 
> I believe this is a complicated way of writing:
> 
>   if (escaped) {
>     dataRange = escapedBuffer;
>   }
> 
> which does in fact compile and work, I think, though morally speaking it
> really shouldn't.... and it's not clear to me whether you really meant to do
> that.

Whoa, that's pretty sketchy! Is that going to make a copy of the string buffer or will it end up shared?

> A possibly-better way to write this is like so:
> 
>   nsAutoCString unescapedBuffer;
>   nsSubstring& data = NS_UnescapeURL(dataRange, /* flags = */ 0,
> unescapedBuffer);
>   if (lBase64 && &data == &unescapedBuffer) {
>     // We actually used unescapedBuffer, comments about spaces and whatnot
> here.
>     // Stripping whitespace on either "data" or "unescapedBuffer" should
> have the
>     // same effect inside this block, since they're the same object.
>   }

Ah yes, comparing the address of the refs will work and makes things cleaner. The problem here is that |NS_UnescapeURL| returns a |const nsCSubstring&| which won't work since we want to modify it. I guess I could const_cast it b/c I know what's going on but that feels a bit off. 

> >-        if (dataLen >= 1 && dataBuffer[dataLen-1] == '=') {
> 
> That's hilarious.  We left all this in when we switched to the nsACString
> version of Base64Decode?  :(
> 
> >+        rv = Base64Decode(data, decodedData);
> 
> It might be nice if we had a version of Base64Decode that decoded in-place. 
> That said, it wouldn't help here in the case when lBase64 and we didn't end
> up doing any unescaping (in that we'd need to make a copy in that case
> anyway), and it would involve a buffer that is only 3/4 full once we're
> done...  So it's probably not worth worrying about too much.

Yeah I thought this over as well. There are some benefits though:
  - We could just bite the bullet and do a fallible data copy after |ParsePathWithoutRef|
  - Go back to the in-place version of |NS_UnescapeURL| (so no weirdness about which buffer has what)
  - Determine if the whitespace trim is necessary by comparing before and after sizes of the path
  - The base64 decode would be in-place so no allocation
  - We could truncate the decoded string to the new size which ideally would free up some memory

> >+        rv = NS_NewCStringInputStream(getter_AddRefs(bufInStream), decodedData);
> >+        rv = NS_NewCStringInputStream(getter_AddRefs(bufInStream), data);
> 
> Just so we're clear, these will _copy_ the data (at least conceptually), not
> adopt it.  Mostly this affects the commit message...

Yeah I guess I mean that it's going to share the string buffer (and when the function goes out of scope it will be the sole owner) except in the no-unescape-not-base64-case in which case it'll allocate, but that'll be the only allocation.

> Now there are some cases in which these will be able to share the data
> instead.  Those are cases in which "data" or "decodedData" has a
> stringbuffer; in those cases they will do a copy-on-write share. 
> "decodedData" should have a stringbuffer, assuming it's at all long.  But
> for "data" the situation is more complicated.  If we ended up unescaping, it
> has a stringbuffer.  Otherwise, it comes from the ParsePathWithoutRef call,
> and is set up via a Rebind(), so doesn't have a stringbuffer.  So we'll end
> up making a copy here on the !lbase64 && !unescaped path.

Yeah I see what your saying at a strings-are-more-complicated-than-you-think level :) I guess saying something like:

> ... will generally be able to share buffers rather than copying ...

would be a bit more honest.

> That's probably OK, in the grand scheme of things.  It means we have one
> copy if !lbase64 && !unescaped, one copy (when unescaping) if !lbase64 &&
> unescaped, one copy (when base64-decoding) if lbase64 && !unescaped, and two
> copies only when lbase64 && unescaped, which I kinda hope is the rare case
> anyway.  And even then, avoiding a copy would entail having a 4/3 too bit
> buffer hanging around...
> 
> Anyway, r=me with that above nits addressed and the commit message fixed to
> make it clear what the story is with the string stream.

Thanks for the thorough feedback, I'll get things updated.
(In reply to Eric Rahm [:erahm] from comment #21)
> Ah yes, comparing the address of the refs will work and makes things
> cleaner. The problem here is that |NS_UnescapeURL| returns a |const
> nsCSubstring&| which won't work since we want to modify it. I guess I could
> const_cast it b/c I know what's going on but that feels a bit off. 

Oh right, and |nsCSubstring| doesn't have a |StripWhitespace|.
> I think everything used to be fallible

Ah.  We should try to preserve that, then, because I'm pretty sure we used to be hitting OOMs here somewhat often in the past.  Maybe we should just add a fallible version of unescape.

> Whoa, that's pretty sketchy!

Yes, our whole string setup where we nominally have static types but under the hood they're all the same class (but with different flags passed from constructors), so really dynamically typed, is pretty sketchy.

> Is that going to make a copy of the string buffer or will it end up shared?

Should end up sharing, iirc; it's just like any other string assignment.

> The problem here is that |NS_UnescapeURL| returns a |const nsCSubstring&| which
> won't work since we want to modify it.

Hmm.  We want to modify it in the &data == &unescapedBuffer case, right?

We could modify unescapedBuffer, right?  I agree it's still kinda sketchy, and not sure how it plays with C++ aliasing rules, but maybe less sketchy than const_cast?

Or we could introduce a version of NS_UnescapeURL that takes a non-const ref for the string to unescape and returns a non-const ref, right?  That might be the least-sketchy thing...

> There are some benefits though:

That's true.  If we're willing to pay the price of possibly over-allocating when escaped or base64-encoded, it _does_ help simplify various things.

> We could truncate the decoded string to the new size which ideally would free up some memory

String truncation never reallocs to a smaller size, unfortunately.  I guess we could try to change that...

> would be a bit more honest.

Yes.

One other thing I just thought of.  The old setup, with the pipe, would allow the pipe to empty out as it was read, and when the data URI was done loading we would no longer have any data in the pipe.  The new setup sets up the string stream, then hands it out.  What happens after that?  When done reading, do we drop its data?

I sort of traced through the code a bit (basechannel, input stream transport, etc), and I _think_ we should end up calling Close() on the string stream, which should drop the data.  But it would be good to verify this explicitly.

Also of note, I think the upshot of using a string stream here is that we will in fact go via the input stream transport, because string stream is not an nsIAsyncInputStream.  And then we will end up in nsInputStreamTransport::OpenInputStream which will go ahead and set up a pipe anyway.  Now it will do the writes to the pipe's output stream end via NS_AsyncCopy, and I _think_ it will limit the pipe size to 24 4-kilobyte pages.  But it's also worth double-checking what the memory usage is in practice (or again asking someone more familiar with necko to double-check).
Updated per bz's review. I've left the TODO in and will file a follow up to
either add a fallible NS_URLEscape or just preallocate the buffer.

For the was unescaped case we can just trim the whitespace fo the non-const
|unescapedBuffer| which is what |data| is pointing at.
Attachment #8831006 - Attachment is obsolete: true
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #23)
> > I think everything used to be fallible
> 
> Ah.  We should try to preserve that, then, because I'm pretty sure we used
> to be hitting OOMs here somewhat often in the past.  Maybe we should just
> add a fallible version of unescape.

That's what made me file this bug (fixing an OOM) so I guess I should do the right thing :)

> > Whoa, that's pretty sketchy!
> 
> Yes, our whole string setup where we nominally have static types but under
> the hood they're all the same class (but with different flags passed from
> constructors), so really dynamically typed, is pretty sketchy.
> 
> > Is that going to make a copy of the string buffer or will it end up shared?
> 
> Should end up sharing, iirc; it's just like any other string assignment.
> 
> > The problem here is that |NS_UnescapeURL| returns a |const nsCSubstring&| which
> > won't work since we want to modify it.
> 
> Hmm.  We want to modify it in the &data == &unescapedBuffer case, right?
> 
> We could modify unescapedBuffer, right?  I agree it's still kinda sketchy,
> and not sure how it plays with C++ aliasing rules, but maybe less sketchy
> than const_cast?

That's what I went with.

> Or we could introduce a version of NS_UnescapeURL that takes a non-const ref
> for the string to unescape and returns a non-const ref, right?  That might
> be the least-sketchy thing...

The work around seems okay, but I'd rather add a fallible Unescape.

> > There are some benefits though:
> 
> That's true.  If we're willing to pay the price of possibly over-allocating
> when escaped or base64-encoded, it _does_ help simplify various things.

I think it's a good follow up unless you feel like it's better to simplify things now. I'm familiar with that code so it's not a big issue to add it.

> > We could truncate the decoded string to the new size which ideally would free up some memory
> 
> String truncation never reallocs to a smaller size, unfortunately.  I guess
> we could try to change that...

Oh dear. I think it's still better than having both allocations live. I guess it really just depends on what sizes we're talking about here. Of maybe we could decode into an array instead, which I think will resize. I wonder if there's an ArrayStream type thing?

> One other thing I just thought of.  The old setup, with the pipe, would
> allow the pipe to empty out as it was read, and when the data URI was done
> loading we would no longer have any data in the pipe.  The new setup sets up
> the string stream, then hands it out.  What happens after that?  When done
> reading, do we drop its data?
> 
> I sort of traced through the code a bit (basechannel, input stream
> transport, etc), and I _think_ we should end up calling Close() on the
> string stream, which should drop the data.  But it would be good to verify
> this explicitly.

Good point, I'll trace through further.

> Also of note, I think the upshot of using a string stream here is that we
> will in fact go via the input stream transport, because string stream is not
> an nsIAsyncInputStream.  And then we will end up in
> nsInputStreamTransport::OpenInputStream which will go ahead and set up a
> pipe anyway.  Now it will do the writes to the pipe's output stream end via
> NS_AsyncCopy, and I _think_ it will limit the pipe size to 24 4-kilobyte
> pages.  But it's also worth double-checking what the memory usage is in
> practice (or again asking someone more familiar with necko to double-check).

Maybe we can tag Valentin on this as well. I'll take a look next week though as I should probably try to understand the life-cycle.
> The work around seems okay, but I'd rather add a fallible Unescape.

I was talking about just in terms of the things it takes and returns, with identical behavior to the existing one.  This is orthogonal to adding fallibility.

> I wonder if there's an ArrayStream type thing?

Not really.  Or rather, it's called "stringstream".

Well, I guess there's ArrayBufferInputStream, but we do _not_ want to go there.  ;)
Comment on attachment 8831004 [details] [diff] [review]
Part 1: Add |pathIgnoringRef| attribute to nsIURI

Review of attachment 8831004 [details] [diff] [review]:
-----------------------------------------------------------------

I prefer to avoid adding new methods to the nsIURI interface. Is this one really needed?
There are several ways of achieving the same, such as cloneIgnoringRef, or FindChar('#') and Truncate.
If this is the only way to avoid an OOM, then it might be OK to add it.

::: netwerk/base/nsStandardURL.cpp
@@ +1301,5 @@
> +    } else {
> +        result = Path();
> +    }
> +    // TODO(ER): coordinate with rust?
> +    //CALL_RUST_GETTER_STR(result, GetPathIgnoringRef, result);

You can uncomment this. During normal builds it has no effects. You can flip the network.standard-url.enable-rust pref, then check the logs for nsStandardURL:5 to make sure they work properly.

Also, please add unit tests to both test_standardurl.js and test_rusturl.js
> There are several ways of achieving the same

Hmm.  cloneIgnoringRef followed by GetPath() might work, I guess, as long as we clearly comment it.  It's a pretty nasty hack just to avoid having the scheme prepended to the string we care about.

If necko guaranteed that we'd have an nsSimpleURI here, we could just try to cast directly to that and use internal methods, but necko of course doesn't guarantee any such thing.

We could, similarly, add a way to QI to nsSimpleURI and use it plus internal methods...

> or FindChar('#') and Truncate.

This obviously won't do the right (no-copy) thing, in that it will make a copy at that point if there is a ref.
(In reply to Valentin Gosu [:valentin] from comment #27)
> Comment on attachment 8831004 [details] [diff] [review]
> Part 1: Add |pathIgnoringRef| attribute to nsIURI
> 
> Review of attachment 8831004 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I prefer to avoid adding new methods to the nsIURI interface. Is this one
> really needed?

Yes to avoid the transient memory spike (and associated OOMs) we need to avoid copying string buffers.

> There are several ways of achieving the same, such as cloneIgnoringRef, or
> FindChar('#') and Truncate.

|cloneIgnoringRef| would work, but that's only acceptable if we know for sure we're working with a |nsSimpleURI|. For example |nsStandardURL| is heavy handed [1], makes unnecessary copies of the cache for the ignoring ref case [2], and will cause a copy of the buffer if there was a ref [3]. 

> If this is the only way to avoid an OOM, then it might be OK to add it.
> 
> ::: netwerk/base/nsStandardURL.cpp
> @@ +1301,5 @@
> > +    } else {
> > +        result = Path();
> > +    }
> > +    // TODO(ER): coordinate with rust?
> > +    //CALL_RUST_GETTER_STR(result, GetPathIgnoringRef, result);
> 
> You can uncomment this. During normal builds it has no effects. You can flip
> the network.standard-url.enable-rust pref, then check the logs for
> nsStandardURL:5 to make sure they work properly.

I haven't added this to the rust url library, it seems like that would cause failures right? Or is that just calling into nsRustURL?

> Also, please add unit tests to both test_standardurl.js and test_rusturl.js

Will do (assuming we don't go the clone route).

[1] http://searchfox.org/mozilla-central/rev/f31f538622d0dfb2d869d4babc4951f6001e9be2/netwerk/base/nsStandardURL.cpp#2320-2366
[2] http://searchfox.org/mozilla-central/rev/f31f538622d0dfb2d869d4babc4951f6001e9be2/netwerk/base/nsStandardURL.cpp#2348-2350
[3] http://searchfox.org/mozilla-central/rev/f31f538622d0dfb2d869d4babc4951f6001e9be2/netwerk/base/nsStandardURL.cpp#2935-2941
Flags: needinfo?(valentin.gosu)
> but that's only acceptable if we know for sure we're working with a |nsSimpleURI|

In normal operation, you will be.  The only way to get a non-nsSimpleURI through here is if someone creates an nsStandardURL (e.g. http://something) and then changes its protocol to "data:" and opens a channel for the resulting thing.  That's not something that unprivileged code can really do, ever since we fixed bug 1245264.

So I don't think we can rely on having an nsSimpleURI here enough to just cast to it, but you can rely enough to not worry about inefficiency, or even not opening the channel at all, if it happens.
Comment on attachment 8831004 [details] [diff] [review]
Part 1: Add |pathIgnoringRef| attribute to nsIURI

Considering it will almost always be a nsSimpleURI lets just go the simpler CloneIgnoringRef route.
Attachment #8831004 - Attachment is obsolete: true
Flags: needinfo?(valentin.gosu)
Attachment #8831004 - Flags: review?(valentin.gosu)
This adds a fallible version of |NS_UnescapeURL| that can be used to
gracefully handle allocation failures when the string needs to be unescaped.

MozReview-Commit-ID: 1eXPzeB2yrI
Attachment #8831873 - Flags: review?(nfroyd)
This has changed enough it could probably use a re-review. It's updated to use
a new fallible version of |NS_UnescapeURL| along with a local wrapper that
helps with returning the string that was used.

This is also updated to clone the URI rather than use the now abandoned
|pathIgnoringRef|.
Attachment #8831877 - Flags: review?(bzbarsky)
Attachment #8831873 - Attachment is obsolete: true
Attachment #8831873 - Flags: review?(nfroyd)
This switches over to using the new |GetPathIgnoringRef| which will most likely
avoid a copy. The new |ParsePathWithoutRef| is used, again to avoid needing a
copy of the path substring.

A pipe is no longer used for the input stream, instead we use a string stream
which in most cases will be able to share the string data buffer rather than
copying it.

For the base64 case in the best case we only allocate the decoded buffer, worst
case there's another buffer for the URL decoded data. So either 3/4 transient
or 1 3/4 transient, both of which are better than 3.5.

MozReview-Commit-ID: 4tYI3iyxMCl
Attachment #8831878 - Flags: review?(bzbarsky)
Attachment #8831389 - Attachment is obsolete: true
Comment on attachment 8831877 [details] [diff] [review]
Part 1: Add a fallible NS_UnescapeURL

Sorry forgot to qpush, just looking for review from Nathan.
Attachment #8831877 - Flags: review?(bzbarsky) → review?(nfroyd)
Comment on attachment 8831878 [details] [diff] [review]
Part 3: Avoid extra copies in nsDataChannel::OpenContentStream

>This switches over to using the new |GetPathIgnoringRef|

Not anymore.  Please fix.

>+++ b/netwerk/protocol/data/nsDataChannel.cpp
>+const nsACString& Unescape(const nsACString& aStr, nsACString& aBuffer,
>+                           nsresult& rv)

I would really prefer it if the outparam were "nsresult*", to make it clearer that it's an outparam.

r=me with that
Attachment #8831878 - Flags: review?(bzbarsky) → review+
Comment on attachment 8831877 [details] [diff] [review]
Part 1: Add a fallible NS_UnescapeURL

Review of attachment 8831877 [details] [diff] [review]:
-----------------------------------------------------------------

This is such an awkward API, but I don't know if there are better options.  r=me with a fallible_t argument added.

::: xpcom/io/nsEscape.h
@@ +148,5 @@
> +nsresult NS_UnescapeURL(const char* aStr,
> +                        int32_t aLen,
> +                        uint32_t aFlags,
> +                        nsACString& aResult,
> +                        bool& aAppended);

We should make this accept a |const mozilla::fallible_t&| argument, as per many other fallible functions we have.
Attachment #8831877 - Flags: review?(nfroyd) → review+
Comment on attachment 8831385 [details] [diff] [review]
Part 2: Add nsDataHandler::ParsePathWithoutRef

Review of attachment 8831385 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/data/nsDataHandler.cpp
@@ +201,5 @@
> +    } else {
> +        auto mediaType = Substring(aPath, 0, commaIdx);
> +
> +        // Determine if the data is base64 encoded.
> +        nsACString::size_type base64;

I think this needs to be initialized to 0 to avoid warnings if FindOffsetOf fails.
Attachment #8831385 - Flags: review?(valentin.gosu) → review+
(In reply to Nathan Froyd [:froydnj] from comment #38)
> Comment on attachment 8831877 [details] [diff] [review]
> Part 1: Add a fallible NS_UnescapeURL
> 
> Review of attachment 8831877 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is such an awkward API, but I don't know if there are better options. 
> r=me with a fallible_t argument added.
> 
> ::: xpcom/io/nsEscape.h
> @@ +148,5 @@
> > +nsresult NS_UnescapeURL(const char* aStr,
> > +                        int32_t aLen,
> > +                        uint32_t aFlags,
> > +                        nsACString& aResult,
> > +                        bool& aAppended);
> 
> We should make this accept a |const mozilla::fallible_t&| argument, as per
> many other fallible functions we have.

Updated locally.
(In reply to Valentin Gosu [:valentin] from comment #39)
> Comment on attachment 8831385 [details] [diff] [review]
> Part 2: Add nsDataHandler::ParsePathWithoutRef
> 
> Review of attachment 8831385 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: netwerk/protocol/data/nsDataHandler.cpp
> @@ +201,5 @@
> > +    } else {
> > +        auto mediaType = Substring(aPath, 0, commaIdx);
> > +
> > +        // Determine if the data is base64 encoded.
> > +        nsACString::size_type base64;
> 
> I think this needs to be initialized to 0 to avoid warnings if FindOffsetOf
> fails.

Good catch! Looks like valgrind was unhappy as well. Will update locally.
(In reply to Eric Rahm [:erahm] from comment #41)
> (In reply to Valentin Gosu [:valentin] from comment #39)
> > Comment on attachment 8831385 [details] [diff] [review]
> > Part 2: Add nsDataHandler::ParsePathWithoutRef
> > 
> > Review of attachment 8831385 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: netwerk/protocol/data/nsDataHandler.cpp
> > @@ +201,5 @@
> > > +    } else {
> > > +        auto mediaType = Substring(aPath, 0, commaIdx);
> > > +
> > > +        // Determine if the data is base64 encoded.
> > > +        nsACString::size_type base64;
> > 
> > I think this needs to be initialized to 0 to avoid warnings if FindOffsetOf
> > fails.
> 
> Good catch! Looks like valgrind was unhappy as well. Will update locally.

Although 0 isn't quite right, that would indicate a match was found at the beginning of the string (before nullptr indicated no match), I'll go with replacing |base64 == 0| with |mediaType.IsEmpty| which is what we were really checking for.
Whiteboard: [MemShrink][necko-active] → [MemShrink:P2][necko-active]
Is this just blocked on that reftest failure?
Blocks: 1369317
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #47)
> Is this just blocked on that reftest failure?

Yep.
Flags: needinfo?(erahm)
OK, so I did a try run with these patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=93e68cb464abea0f2ad84e94216c39b3c32fb2ca&selectedJob=104353133

Summary:
1) all 20 runs on Linux opt passed, e10s and non-10s.
2) There was one failure on Linux64-opt, e10s mode.
3) There were two failures on "Linux x64 QuantumRender opt", e10s mode.

So it's a lovely intermittent failure...  it's possible that this is happening because the patch ends up with undefined behavior in some way or something...
OK, I did a try run with more logging at https://treeherder.mozilla.org/#/jobs?repo=try&revision=c586ab39aa1306909adffdc9a42de881052a94ff

That shows that we're getting exactly the same data right before the NS_NewCStringInputStream call in the passing and failing cases.  Also, some of the failures are in layout/reftests/svg/as-image/svg-stylesheet-datauri-1.html, not in layout/reftests/svg/as-image/svg-image-datauri-1.html ("stylesheet" vs "image").

If you actually look at the svg-stylesheet-datauri-1.html failures, the test file shows the rect with no fill at all, not a blue fill.  That's pretty odd.

OK, so what actually changed when we switched from the pipe to the string stream?  We used to pass "async" to the pipe for the aNonBlockingInput.  This should have been "true" for all the cases here, which are all using AsyncOpen.  So the pipe stream should have returned true from IsNonBlocking().  Same thing for string streams.  But pipe streams implement nsIAsyncInputStream and string streams do not.  So in nsInputStreamPump::AsyncRead we end up taking different codepaths for them: the pipe case just sets mAsyncStream to the given pipe input stream, while the string stream case kicks off an input stream transport to read things on a different thread.  Is it possible that this changes timing of things so that already-racy testcases (e.g. if they're not waiting for this load to actually complete before firing onload for some reason?) end up racing slightly differently?
If it helps we could probably make StringInputStream implement nsIAsyncInputStream.  Its been something I've wanted a few times in the past.
I'm doing some more try pushes with more logging to see whether I can figure out what's going on here, because really, the off-thread stuff should not cause failures if the tests are not already racy.

But yes, implementing nsIAsyncInputStream on stringstream might be a good idea anyway.
OK, https://treeherder.mozilla.org/#/jobs?repo=try&revision=86cd53893b38d7071e6f908037116d5b01fbb018&selectedJob=104633007 shows there are no weird flags like LOAD_BACKGROUND going on here.

What's not clear to me is whether onload works correctly at all with SVG images with subresources (which must be data: because we block other protocols in there).  In particular, we create a separate loadgroup for the SVG image, and that loadgroup is where the subresource loads go.  That loadgroup has its loadgroup set to the loadgroup of the nsIRequest loading the SVG image (see SVGDocumentWrapper::SetupViewer).  But that loadgroup may not match the loadgroup of the imgIRequestProxy, as far as I know....  I guess I'll add some more logging around that stuff.
Has the more logging, but I'm a bit lost in all the stuff.  I could keep debugging it, but I suspect that the basic analysis above is what's going on: either an image cache hit or an image cache miss (not sure which one) ends up not blocking onload on the subresources of the image, effectively...

Timothy, do you recall how the loadgroup bits work in imagelib?

In the short term, I expect that implementing nsIAsyncInputStream will solve the orange.
Flags: needinfo?(tnikkel)
(In reply to Boris Zbarsky [:bz] (if a patch has no decent message, automatic r-) from comment #54)
> In the short term, I expect that implementing nsIAsyncInputStream will solve
> the orange.

Hopefully that doesn't trigger orange in a different area.  (I still think we should do it, though.)
(In reply to Boris Zbarsky [:bz] (if a patch has no decent message, automatic r-) from comment #54)
> Has the more logging, but I'm a bit lost in all the stuff.  I could keep
> debugging it, but I suspect that the basic analysis above is what's going
> on: either an image cache hit or an image cache miss (not sure which one)
> ends up not blocking onload on the subresources of the image, effectively...
> 
> Timothy, do you recall how the loadgroup bits work in imagelib?

I haven't had the pleasure of working with that code but I just took a look at it now. Based on what you said I'm guessing this is what happens:

doc 1 loads the svg image, the imgRequest is created and added to the loadgroup of doc 1. The svg document uses a child loadgroup of the loadgroup of doc 1 (so doc 1 won't see a problem). The created imgRequestProxy also uses the loadgroup from doc 1.

Before the svg image is loaded, doc 2 also asks to load the image, it is handed back an imgRequestProxy (to the same imgRequest) which uses the loadgroup of doc 2. The imgRequest is still in the loadgroup of doc 1, that hasn't changed.

The imgRequestProxy for doc 2 removes itself from the loadgroup when it gets OnStopRequest for the svg document (imgRequest::OnStopRequest -> imgRequestProxy::OnLoadComplete). This happens when we have the svg document but not the subresources.

doc 1 doesn't see a problem because it waits for the subresources of the svg doc.
Flags: needinfo?(tnikkel)
OK, so...

I wrote some patches for bug 1371699.  They reduce the frequency of the imagelib orange a good bit: I got one orange run out of about 600.  But the intermittent orange is still there.

The reason for that is that compared to the pipe we take an extra turn through the event loop.  That happens because of this code at the end of nsInputStreamPump::OnStateTransfer: http://searchfox.org/mozilla-central/rev/a798ee4fc323f9387b7576dbed177859d29d09b7/netwerk/base/nsInputStreamPump.cpp#660-667

In the pipe case, we've dropped our ref to the output, which closed the output.  So as soon as we finish reading all the data from the input, Available() starts returning NS_BASE_STREAM_CLOSED (see http://searchfox.org/mozilla-central/rev/12afa8cc9cea4a141494f767541b411845eef7a0/xpcom/io/nsPipe3.cpp#1387-1389 for details).  So that first call to nsInputStreamPump::OnStateTransfer reads all the data, then immediately goes to STATE_STOP.

In the string stream case, we get NS_OK (and 0) from Available(), stay in STATE_TRANSFER, and queue another AsyncWait event  When it fires, we land in nsInputStreamPump::OnStateTransfer, get a 0 return from Available(), the "if (avail)" test on line 660 there tests false, and we fall through to STATE_STOP finally.

I tried to change stringstream to return NS_BASE_STREAM_CLOSED from Available() when 0 bytes are available, but that breaks all sorts of tests (filereader, CSP, fetch, etc, etc).  I guess we should consider ourselves lucky those codepaths don't get run on pipes.  I expect the problem is people treating NS_BASE_STREAM_CLOSED as a fatal error, which is not really right for Available()...

Anyway, my temptation is to just do the bits in bug 1371699 and deal with the fairly low intermittent rate, and file a followup on the fundamental imagelib raciness here.  The other option is to try to hunt down and fix all the Available() callsites involved.
I filed bug 1372410 on maybe implementing the "close at EOF" behavior that would save us an event trip loop here.
Bulk priority update: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Priority: P1 → P2
Whiteboard: [MemShrink:P2][necko-active] → [MemShrink:P2]
Whiteboard: [MemShrink:P2] → [MemShrink:P2[necko-triaged]
Whiteboard: [MemShrink:P2[necko-triaged] → [MemShrink:P2][necko-triaged]
Eric, if were going to do bug 1392241, especially with a Rust impl, it would be good to have this landed first, so we can have the right API there....  At least in terms of the parsing bits, even if we don't change the stream parts.
Flags: needinfo?(erahm)
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #61)
> Eric, if were going to do bug 1392241, especially with a Rust impl, it would
> be good to have this landed first, so we can have the right API there.... 
> At least in terms of the parsing bits, even if we don't change the stream
> parts.

Good point, I'll look into rebasing and splitting up part 3.
This adds a fallible version of |NS_UnescapeURL| that can be used to
gracefully handle allocation failures when the string needs to be unescaped.
This adds a version of |ParseURI| that only deals with the path portion of the
URI and expects the ref not to be present. It's mainly copy and pasted from
|ParseURI| but updated to use nsString methods rather than NSPR.

Additionaly it only returns a substring to the path passed in for the data
buffer if requested.

|ParseURI| is updated to advance past the scheme and trim the ref, then it
just uses |ParseURIWithoutRef|.
This switches over to cloning the URI without it's ref which will most likely
avoid a copy. The new |ParsePathWithoutRef| is used, again to avoid needing a
copy of the path substring.

bz, this changed enough that you might want to give it a once over.
Attachment #8947991 - Flags: review?(bzbarsky)
A pipe is no longer used for the input stream, instead we use a string stream
which in most cases will be able to share the string data buffer rather than
copying it.
Attachment #8831878 - Attachment is obsolete: true
Flags: needinfo?(erahm)
Attachment #8831877 - Attachment is obsolete: true
Attachment #8831385 - Attachment is obsolete: true
Comment on attachment 8947989 [details] [diff] [review]
Part 1: Add a fallible NS_UnescapeURL

Carrying forward r+
Attachment #8947989 - Flags: review+
Comment on attachment 8947990 [details] [diff] [review]
Part 2: Add nsDataHandler::ParsePathWithoutRef

Carrying forward r+
Attachment #8947990 - Flags: review+
Comment on attachment 8947991 [details] [diff] [review]
Part 3: Avoid extra copies in nsDataChannel::OpenContentStream

We should document in nsEscape what the "appended" argument to this overload of NS_UnescapeURL means.  Right now it's quite mysterious...

>+        rv = bufOutStream->Write(dataBuffer.get(), data.Length(), &contentLen);

First arg should be data.get() for this to compile.

r=me with that.
Attachment #8947991 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #69)
> Comment on attachment 8947991 [details] [diff] [review]
> Part 3: Avoid extra copies in nsDataChannel::OpenContentStream
> 
> We should document in nsEscape what the "appended" argument to this overload
> of NS_UnescapeURL means.  Right now it's quite mysterious...

I'll file a follow up for that.

> >+        rv = bufOutStream->Write(dataBuffer.get(), data.Length(), &contentLen);
> 
> First arg should be data.get() for this to compile.
> 
> r=me with that.

Bah, goofed that when splitting / rebasing. Will fix.
Pushed by erahm@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/33865671a67b
Part 1: Add a fallible NS_UnescapeURL. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/fc9b9ea20e7c
Part 2: Add nsDataHandler::ParsePathWithoutRef. r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/b68e9ed93b9f
Part 3: Avoid extra copies in nsDataChannel::OpenContentStream. r=bz
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #70)
> (In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from
> comment #69)
> > Comment on attachment 8947991 [details] [diff] [review]
> > Part 3: Avoid extra copies in nsDataChannel::OpenContentStream
> > 
> > We should document in nsEscape what the "appended" argument to this overload
> > of NS_UnescapeURL means.  Right now it's quite mysterious...
> 
> I'll file a follow up for that.

Realized only the new method needed docs so I just updated before pushing.
Blocks: 1435899
I filed bug 1435899 as a follow up for the string stream portion.
No longer depends on: 1371699
Attachment #8947992 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.