SourceBuffer::Compact reallocates on nearly every image

RESOLVED FIXED in Firefox 56

Status

()

Core
ImageLib
P3
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: aosmond, Assigned: aosmond)

Tracking

({perf})

unspecified
mozilla56
Points:
---

Firefox Tracking Flags

(firefox56 fixed)

Details

(Whiteboard: [gfx-noted])

Attachments

(5 attachments, 3 obsolete attachments)

5.76 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
4.81 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
1.65 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
1.17 KB, patch
mak
: review+
Details | Diff | Splinter Review
6.40 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
(Assignee)

Description

a year ago
While profiling an unrelated issue, I found we spent an unusual amount of time in SourceBuffer::Compact on the image decoder threads. SourceBuffer::ExpectLength preallocates a buffer given the expected size, but the SourceBuffer::CreateChunk uses the default aRoundUp = true. This causes us to allocate a buffer with a size rounded up to MIN_CHUNK_CAPACITY (4096). Later on, when we complete decoding, SourceBuffer::Compact is called, it will the find capacity is too big and so it copies the data into a properly sized buffer.

Because our predicted size (if given) is typically correct, this means we almost always are allocating something too big, and then reallocating afterwards, wasting precious cycles and memory bandwidth.

We should change the allocation to 1) use realloc to potentially avoid a copy of the first chunk if we are able to simply grow the current buffer, and 2) trust the expected size given to us as a good first guess. Even if the chunk size is wrong and we need to allocate more room, we aren't spending more time reallocating and copying than we do now, because we do it every time (unless the image size is a factor of 4096 ;)).
(Assignee)

Updated

a year ago
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Priority: -- → P3
Whiteboard: [gfx-noted]
(Assignee)

Comment 1

a year ago
Aside from the network lying, there is a case in Gecko where the given expected length was wrong. Bug 1315554 fixes how the ICO decoder would give the wrong predicted size for BMP resources (e.g. if the AND mask was used, we only gave the decoder half the predicted data, and this would trigger a Compact as well) -- while we still give the same bigger-than-expected size, it creates a new iterator for the ICO's buffer (limited to the resource size) which won't trigger a compact when it is freed (and even if it did, it would be for the ICO buffer, not the previous copied resource buffer).
(Assignee)

Comment 2

a year ago
Created attachment 8889058 [details] [diff] [review]
Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to try to avoid a copy., v1
(Assignee)

Comment 3

a year ago
Created attachment 8889059 [details] [diff] [review]
Part 2. When SourceBuffer::ExpectLength creates the initial buffer, it should not round up., v1

Note that this should cause the IMAGE_DECODE_CHUNKS to rise a little in the telemetry. That is not a bad thing as it means we are more efficient in general. (But we shouldn't see a giant spike.)
(Assignee)

Comment 4

a year ago
Created attachment 8889086 [details] [diff] [review]
Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to try to avoid a copy., v2

Fix a theoretical memory leak (as I don't believe we ever do a move into a non-empty SourceBuffer::Chunk). Improve commit message.
Attachment #8889058 - Attachment is obsolete: true
(Assignee)

Comment 5

a year ago
Created attachment 8889087 [details] [diff] [review]
Part 2. When SourceBuffer::ExpectLength creates the initial buffer, it should not round up., v2

Update gtests due to new behaviour. Improve commit message.

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5930dc6fe430b17ec92b0f3956293c0a9794db9b
Attachment #8889059 - Attachment is obsolete: true
(Assignee)

Updated

a year ago
Keywords: perf
(Assignee)

Updated

a year ago
Attachment #8889086 - Flags: review?(tnikkel)
(Assignee)

Updated

a year ago
Attachment #8889087 - Flags: review?(tnikkel)
(Assignee)

Comment 6

a year ago
Created attachment 8889111 [details] [diff] [review]
Part 3. SourceBuffer::mChunks should be an AutoTArray instead of FallibleTArray., v1

This one is really just a micro-optimization, but telemetry and reasoning about the code suggests that the vast majority of the time, we have a single chunk, so we should avoid the alloc for the common case. Even if there are multiple elements it will get compacted after decode, and appending chunks remains fallible.
Attachment #8889111 - Flags: review?(tnikkel)
(Assignee)

Comment 7

a year ago
Created attachment 8889125 [details] [diff] [review]
Part 4. imgTools::DecodeImage should set the source size hint to optimize the allocation., v1

Another cause of reallocations is we don't even set the source hint with imgTools::DecodeImage. Make it do that.
Attachment #8889125 - Flags: review?(tnikkel)
(Assignee)

Comment 8

a year ago
Created attachment 8889126 [details] [diff] [review]
Part 5. PageIconProtocolHandler should set the content length when creating a channel., v1

PageIconProtocolHandler has the content length but it doesn't set the hint on the channel. With the previous parts, this means it will still need to do a realloc, even though we really do know the true size.

At this point, it seems to be only bad web servers giving us incorrect sizes :).

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c213d3d680a572307a9376936c7781a6304d222f
Attachment #8889126 - Flags: review?(mak77)
(Assignee)

Comment 9

a year ago
Created attachment 8889129 [details] [diff] [review]
Part 4. imgTools::DecodeImage should set the source size hint to optimize the allocation., v2

I realized the way I did this was inconsistent with how it was already done in ImageFactory::CreateRasterImage. I reused that functionality instead of exposing the RasterImage::SetSourceSizeHint detail to higher levels.
Attachment #8889125 - Attachment is obsolete: true
Attachment #8889125 - Flags: review?(tnikkel)
Attachment #8889129 - Flags: review?(tnikkel)
(Assignee)

Comment 10

a year ago
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d9f23fc8cac506d17623befaba7092e8ca6e8bf7

I thought I was done, but it seems it isn't always the servers fault necessarily. If we go through nsHTTPCompressConv, we seem to also not provide a size hint. Not sure if it is possible to determine a hint in this case, will explore that later.
(Assignee)

Comment 11

a year ago
(In reply to Andrew Osmond [:aosmond] from comment #10)
> I thought I was done, but it seems it isn't always the servers fault
> necessarily. If we go through nsHTTPCompressConv, we seem to also not
> provide a size hint. Not sure if it is possible to determine a hint in this
> case, will explore that later.

Nope, no more to do. It doesn't know the size in advance.
Comment on attachment 8889126 [details] [diff] [review]
Part 5. PageIconProtocolHandler should set the content length when creating a channel., v1

Review of attachment 8889126 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you!
Attachment #8889126 - Flags: review?(mak77) → review+
Comment on attachment 8889086 [details] [diff] [review]
Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to try to avoid a copy., v2

Do we use the move constructor and move assignment operator on Chunks? If so can we just clear the mData pointer on the aOther Chunk? That should preserve move semantics, no?

Also, doesn't this mean new chunks now contain uninitialized memory instead of zeroed memory?
Attachment #8889087 - Flags: review?(tnikkel) → review+
Attachment #8889111 - Flags: review?(tnikkel) → review+
Attachment #8889129 - Flags: review?(tnikkel) → review+
(Assignee)

Comment 14

a year ago
(In reply to Timothy Nikkel (:tnikkel) from comment #13)
> Comment on attachment 8889086 [details] [diff] [review]
> Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to
> try to avoid a copy., v2
> 
> Do we use the move constructor and move assignment operator on Chunks?

We use the move assignment operator:

http://searchfox.org/mozilla-central/rev/09c065976fd4f18d4ad764d7cb4bbc684bf56714/image/SourceBuffer.cpp#160

The move constructor isn't used by us directly, but it may be in the nsTArray implementation.

> If so
> can we just clear the mData pointer on the aOther Chunk? That should
> preserve move semantics, no?
> 

Isn't this what the patch does? In both Chunk(Chunk&& aOther) and Chunk& operator=(Chunk&& aOther) it clears mData. I didn't need to add it because it was already there, I took a double take when changing the code myself :).

> Also, doesn't this mean new chunks now contain uninitialized memory instead
> of zeroed memory?

The original "new (fallible) char[mCapacity]" doesn't initialize the memory to zero, although had it been "new (fallible) char[mCapacity]()", it would have.
(Assignee)

Comment 15

a year ago
(In reply to Andrew Osmond [:aosmond] from comment #14)
> (In reply to Timothy Nikkel (:tnikkel) from comment #13)
> > Comment on attachment 8889086 [details] [diff] [review]
> > Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to
> > try to avoid a copy., v2
> >
> > Also, doesn't this mean new chunks now contain uninitialized memory instead
> > of zeroed memory?
> 
> The original "new (fallible) char[mCapacity]" doesn't initialize the memory
> to zero, although had it been "new (fallible) char[mCapacity]()", it would
> have.

Hm, belay that, I think you are indeed correct. Is it a problem?
(Assignee)

Comment 16

a year ago
(In reply to Andrew Osmond [:aosmond] from comment #15)
> (In reply to Andrew Osmond [:aosmond] from comment #14)
> > (In reply to Timothy Nikkel (:tnikkel) from comment #13)
> > > Comment on attachment 8889086 [details] [diff] [review]
> > > Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to
> > > try to avoid a copy., v2
> > >
> > > Also, doesn't this mean new chunks now contain uninitialized memory instead
> > > of zeroed memory?
> > 
> > The original "new (fallible) char[mCapacity]" doesn't initialize the memory
> > to zero, although had it been "new (fallible) char[mCapacity]()", it would
> > have.
> 
> Hm, belay that, I think you are indeed correct. Is it a problem?

Also, it should be poisoned anyways (bug 1365460 removed disabling that as well), so technically it will be initialized to 0xe5.
Comment on attachment 8889086 [details] [diff] [review]
Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to try to avoid a copy., v2

Okay
Attachment #8889086 - Flags: review?(tnikkel) → review+

Comment 18

a year ago
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4f8e0cb21016
Part 1. SourceBuffer::Compact should use realloc to grow the first chunk to try to avoid a copy. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8223b2142a9
Part 2. When SourceBuffer::ExpectLength creates the initial buffer, it should not round up. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d5e41b9498b
Part 3. SourceBuffer::mChunks should be an AutoTArray instead of FallibleTArray. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/729621d1ffac
Part 4. imgTools::DecodeImage should set the source size hint to optimize the allocation. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/41faadcda947
Part 5. PageIconProtocolHandler should set the content length when creating a channel. r=mak
You need to log in before you can comment on or make changes to this bug.