Closed Bug 1023585 Opened 10 years ago Closed 10 years ago

Leak of array buffer in JSStructuredCloneWriter::transferOwnership()

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: mccr8, Assigned: mccr8)

References

(Blocks 1 open bug)

Details

(Keywords: memory-leak, Whiteboard: [MemShrink:P2])

Attachments

(2 files)

This is happening from some test in M1 that landed recently.  I'll try to figure out which test it is.
This is probably not the fault of the JSStructuredCloneWriter code.
Waldo, any ideas?
Flags: needinfo?(jwalden+bmo)
Whiteboard: [MemShrink] → [MemShrink:P2]
Well, odds are something further up the stack is doing something bad that is leaking the buffer.  If I can recreate this locally, I'll post a better stack.
What about just running M1 with a a conditional breakpoint looking for that particular ArrayBuffer contents length?  Probably 40152 is relatively unique in the right, being large and not actually a page multiple.  (I checked MXR, don't see that number in anything suspicious-looking, nor /2 /4 /8 versions of it.)  JS stack for that allocation is probably enough to figure it out, or at least to investigate much more simply.
Flags: needinfo?(jwalden+bmo)
Looks like some kind of worker XHR issue.  I'll try poking around for arrays of that particular size, as Waldo suggested.
Two of these four leaks happen with content/base/test/test_bug1008126.html
Assignee: nobody → continuation
This looks like some kind of array buffer bug.

We're in ArrayBufferObject::stealContents(), we allocate newData, then go into the |buffer->hasStealableContents()| case, which sets that the buffer doesn't own the data.  Then we call neuter with buffer and newData.  In neuter, newData != buffer->dataPointer(), so we call setNewOwnedData, but ownsData() is false, so we don't free anything.  Maybe because the buffer doesn't think it owns newData it doesn't free it when it dies?

Calling free on newData instead of neuter fixes the leak, but presumably that is not the right behavior. :)

If I immediately free newData after neuter, then I get a double free on some other test case.

Calling |buffer->setOwnsData(OwnsData);| after neuter does not fix the leak.

Does somebody know what is going on here?
> Does somebody know what is going on here?

Waldo and/or sfink, hopefully.
Flags: needinfo?(sphink)
Flags: needinfo?(jwalden+bmo)
(In reply to Andrew McCreight [:mccr8] from comment #7)
> This looks like some kind of array buffer bug.
> 
> We're in ArrayBufferObject::stealContents(), we allocate newData, then go
> into the |buffer->hasStealableContents()| case, which sets that the buffer
> doesn't own the data.  Then we call neuter with buffer and newData.  In
> neuter, newData != buffer->dataPointer(), so we call setNewOwnedData, but
> ownsData() is false, so we don't free anything.  Maybe because the buffer
> doesn't think it owns newData it doesn't free it when it dies?

So far, this sounds like the correct behavior. You're stealing the contents, so the caller is taking ownership of the data in the buffer at the beginning. We allocate a crashpad safety buffer and trade it with the ArrayBuffer for its real data, and make the ArrayBuffer own the crashpad buffer. ownsData() is false because that data is being handed over to the caller -- we'd better not free it! But when the AB is eventually finalized, it should free the crashpad data.

> Calling free on newData instead of neuter fixes the leak, but presumably
> that is not the right behavior. :)

Urk? You allocate a crashpad, then immediately free it?

> 
> If I immediately free newData after neuter, then I get a double free on some
> other test case.

Yes, there would be a double-free when the AB is finalized.

> Calling |buffer->setOwnsData(OwnsData);| after neuter does not fix the leak.
> 
> Does somebody know what is going on here?

The caller isn't freeing the data, it sounds like. What makes you think that it's newData that is leaking? I would expect it to be oldData.
Flags: needinfo?(sphink)
(In reply to Steve Fink [:sfink] from comment #9)
> (In reply to Andrew McCreight [:mccr8] from comment #7)
> The caller isn't freeing the data, it sounds like. What makes you think that
> it's newData that is leaking? I would expect it to be oldData.

Ah! Just looked at the stack. That's why you think it's newData.

I wonder if this is newData that gets stolen again. Which would be weird, since the buffer should be neutered at that point. Seems like we're missing an assert that you don't call stealContents on a neutered buffer.
(In reply to Steve Fink [:sfink] from comment #10)
> (In reply to Steve Fink [:sfink] from comment #9)
> > (In reply to Andrew McCreight [:mccr8] from comment #7)
> > The caller isn't freeing the data, it sounds like. What makes you think that
> > it's newData that is leaking? I would expect it to be oldData.
> 
> Ah! Just looked at the stack. That's why you think it's newData.
> 
> I wonder if this is newData that gets stolen again. Which would be weird,
> since the buffer should be neutered at that point. Seems like we're missing
> an assert that you don't call stealContents on a neutered buffer.

But that wouldn't match these symptoms anyway. A neutered buffer would never return true for hasStealableContents().
To reproduce this, you can go into content/base/test/file_bug1008126_worker.js and comment everything from the "tests" list except test_sync_xhr_data1 or test_async_xhr_data1.

Then do
./mach mochitest-plain --start-at content/base/test/test_bug1008126.html --end-at content/base/test/test_bug1008126.html
Ok, so what would happen in a double-steal case?

The first time through, stealContents() would hand back a pointer to the real data and leave the buffer with an OwnsData pointer to a crashpad.

The second time through, stealContents() would make crashpad2, and not call setNewOwnedData at all. Then it would return crashpad2, leaving the buffer with its OwnsData pointer to crashpad1.

That all seems fine. But say the caller leaks crashpad2. You'll end up with a leak of a pointer allocated in AllocateArrayBufferData, exactly what you have in the attached stack.

I'm suspicious of how structured clone handles duplicate transferable arraybuffers.
It looks like stealContents is only being called on the buffer once.
Bug 1037358 has a testcase patch that'll leak and leak and leak and leak and so on, that's probably easier to debug than the steps in comment 12.  (Or so it appears.)
The leaked size in this bug is 40152, which is the multiple of data_1.txt size in the test case.

Archive:  file_bug945152.jar
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
     281  Defl:N      178  37% 1970-01-17 12:16 b88d9a68  manifest.webapp
    2133  Defl:N      844  60% 1970-01-17 12:16 02655c5f  index.html
   10038  Stored    10038   0% 1970-01-17 12:16 16a9a4b8  data_1.txt
   10022  Defl:N       52 100% 1970-01-17 12:16 3a215594  data_2.txt
40000000  Defl:N    77601 100% 2014-06-10 16:00 2ed1e617  data_big.txt
--------          -------  ---                            -------
40022474            88713 100%                            5 files

Since data_1.txt in the JAR file is uncompressed and page aligned, when "dom.mapped_arraybuffer.enabled" is true, array buffer object from XHR will be created as memory-mapped.

So this bug seems to be leakage of memory-mapped array buffer, and the patch in bug 1037358 ensures to free the safety buffer of a neutered mapped array buffer.  Could you try if it helps?

BTW, the leak allocation stack looks cool and useful, is there any MDN documents about it?
Flags: needinfo?(continuation)
> Could you try if it helps?
Sure, I can do that.

> BTW, the leak allocation stack looks cool and useful, is there any MDN documents about it?
This is from LeakSanitizer, which is a special execution mode of AddressSanitizer.  When you run ASan mochitests, it automatically enables LeakSanitizer, or there are some special flags you can set in an ASan build to enable it.  There's more info here: https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Firefox_and_Address_Sanitizer#LeakSanitizer
Flags: needinfo?(continuation)
Flags: needinfo?(continuation)
Great!  That fixes it for at least the test in comment 12.  You can either remove the whitelisting for this in build/sanitizers/lsan_suppressions.txt like so:

-# Bug 1023585 - Leak of array buffer in JSStructuredCloneWriter::transferOwnership(). m1
-leak:AllocateArrayBufferContents
-

...or we can just do it separately in a followup, once we've confirmed all the leaks are actually fixed.  If you remove it from the whitelist, make sure that you do an ASan try run, because we've had a bunch of different leaks with this suppression, so something else may have snuck.  I checked today, and this suppression is only being used in M1.  I tried to do a local run with your fix, but it was hitting some presumably unrelated crash.
Depends on: 1037358
Flags: needinfo?(jwalden+bmo)
Flags: needinfo?(continuation)
Patches to fix this leak landed in bug 1037358, so I landed a patch to remove the suppression with r=khuey over IRC.

https://hg.mozilla.org/integration/mozilla-inbound/rev/84845cf0f398
https://hg.mozilla.org/mozilla-central/rev/84845cf0f398
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: