Closed Bug 1742540 Opened 3 years ago Closed 3 years ago

Memory leak on nsPipe when building ClonedMessageData of Blob.

Categories

(Core :: DOM: File, defect, P3)

Firefox 96
defect

Tracking

()

RESOLVED FIXED
98 Branch
Tracking Status
firefox-esr91 --- wontfix
firefox96 --- wontfix
firefox97 --- wontfix
firefox98 --- fixed

People

(Reporter: vul3vm06, Assigned: shelly)

Details

(Keywords: memory-leak)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/94.0

Steps to reproduce:

Post a message with local file (> 100KB) to service worker, and do not reference the file on service worker side at all.

test page: https://vul3vm06.github.io/post-file-to-sw/

Actual results:

The following unreported blocks accumulates steadily every time the file is posted to service worker.

Unreported {
18 blocks in heap block record 60 of 15,643
73,728 bytes (73,728 requested / 0 slop)
Individual block sizes: 4,096 x 18
0.06% of the heap (28.91% cumulative)
0.15% of unreported (76.41% cumulative)
Allocated at {
#01: mozilla::dmd::StackTrace::Get(mozilla::dmd::Thread*) (/mnt/md0/gecko-dev/memory/replace/dmd/DMD.cpp:663)
#02: mozilla::dmd::AllocCallback(void*, unsigned long, mozilla::dmd::Thread*) (/mnt/md0/gecko-dev/memory/replace/dmd/DMD.cpp:1024)
#03: replace_malloc(unsigned long) (/mnt/md0/gecko-dev/memory/replace/dmd/DMD.cpp:1092)
#04: nsSegmentedBuffer::AppendNewSegment() (/mnt/md0/gecko-dev/xpcom/io/nsSegmentedBuffer.cpp:60)
#05: nsPipe::GetWriteSegment(char*&, unsigned int&) (/mnt/md0/gecko-dev/xpcom/io/nsPipe3.cpp:822)
#06: nsPipeOutputStream::WriteSegments(nsresult ()(nsIOutputStream, void*, char*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (/mnt/md0/gecko-dev/xpcom/io/nsPipe3.cpp:1665)
#07: nsStreamCopierOB::DoCopy(nsresult*, nsresult*) (/mnt/md0/gecko-dev/xpcom/io/nsStreamUtils.cpp:560)
#08: nsAStreamCopier::Process() (/mnt/md0/gecko-dev/xpcom/io/nsStreamUtils.cpp:302)
}
}

Actually, there is a nsPipe object whose refcnt remains 3, and therefore is never deleted.

Expected results:

The nsPipe should be deleted, and no accumulated blocks on dmd dump.

See the attachment for details: https://drive.google.com/file/d/1JuHHQt8_-wVSANXS_iu7vajAMmRl0iKt/view?usp=sharing

Steps:

  1. Open firefox, browse the test page: https://vul3vm06.github.io/post-file-to-sw/.
    (Mimimize memory usage and colloect gc/cc/dmd, save to about-memory-0 in the attachment)
  2. Upload a file, and click the button to post message to service worker.
    (Mimimize memory usage and colloect gc/cc/dmd, save to about-memory-1 in the attachment)
  3. Repeat step 2, upload a file and click send.
    (Mimimize memory usage and colloect gc/cc/dmd, save to about-memory-2 in the attachment)
  4. Close the test page.
    (Mimimize memory usage and colloect gc/cc/dmd, save to about-memory-3 in the attachment)

In step 2, there is a nsPipe object whose refcnt remains 3, and therefore is never deleted.

The backtrace of the nsPipe::AddRef and nsPipe::Release are captured and in the the attachment too, see nsPipe-addref-release folder.

The gecko-dev git HEAD on this experiment is
commit 702199bca53babc925e47fd8f86ed56487d42492
Date: Sun Nov 21 13:50:24 2021 +0000

This was originally found by b2g device MTBF test.

Component: General → DOM: Service Workers
Keywords: memory-leak
Component: DOM: Service Workers → DOM: postMessage
Severity: -- → S2
Priority: -- → P3
Flags: needinfo?(bugs)
Assignee: nobody → shellylin
Attachment #9255176 - Attachment description: WIP: Bug 1742540 - Close input stream when destructing StreamBlobImpl. → Bug 1742540 - Close input stream when destructing StreamBlobImpl.

Patch seems simple but the root cause is complicated, I try to describe it as brief as possible.

When building ClonedMessageData of a Blob, gecko will clone the inputStream [1] and wrap it as a RemoteLazyStream in serialize phase.

If the inputStream is not cloneable (i.e. FileStream), NS_CloneInputStream [2] creates a new pipe and return two cloned stream (reader and writer). Then first, clone reader stream, wrap this cloned inputStream as RemoteLazyStream, serialize it with IPC. Second, copy the source data to writer stream.

To start the copy phase, copier holds references of writer stream (this adds two extra refcnt of pipe). And process data copy iff there is read request from child side. Then close the writer stream and release them when done with copy. When child has done with RemoteLazyStream, RemoteLazyInputStreamParent will try to release the cloned inputStream, closing itself explicitly [3].

Logic in nsPipe::OnInputStreamException [3] shows that gecko only shutdown the entire pipe iff there is one stream left and it is the given stream. So here we will not shutdown the pipe because there are two inputStreams left, one is the original inputStream, the other is the cloned inputStream. The cloned inputStream will be closed by ActorDestroy of RemoteLazyInputStreamParent and decrease one refcnt of this pipe.

In nsPipe::Release [4], shows that gecko only release the original inputStream iff the refcnt of pipe decrease to one. In the scenario where no read request from child, copier never finishes, two extra refcnt of pipe hold by copier, original inputStream will not be release. Thus, this is why I try to close the inputStream (a.k.a. the original inputStream) in advance when destructing StreamBlobImpl.

[1] https://searchfox.org/mozilla-central/source/dom/file/StreamBlobImpl.cpp#73
[2] https://searchfox.org/mozilla-central/source/xpcom/io/nsStreamUtils.cpp#859
[3] https://searchfox.org/mozilla-central/source/xpcom/io/nsPipe3.cpp#905-910
[4] https://searchfox.org/mozilla-central/source/xpcom/io/nsPipe3.cpp#516-519

Attachment #9255176 - Attachment description: Bug 1742540 - Close input stream when destructing StreamBlobImpl. → Bug 1742540 - Close input stream when destructing StreamBlobImpl. r=smaug

So in normal case when someone does read the data, what calls close on the original input stream?
[3] doesn't seem to link to any closing step.

The patch is probably fine, but why aren't we closing this all in ActorDestroy of RemoteLazyInputStream?

Flags: needinfo?(bugs)
Flags: needinfo?(shellylin)

(In reply to Olli Pettay [:smaug] from comment #4)

So in normal case when someone does read the data, what calls close on the original input stream?
[3] doesn't seem to link to any closing step.

Normally if child side request reading data, we will process and finish with the copier, close the sink and null out holding references.
https://searchfox.org/mozilla-central/source/xpcom/io/nsStreamUtils.cpp#366-367

The patch is probably fine, but why aren't we closing this all in ActorDestroy of RemoteLazyInputStream?

Currently the inputStream(the cloned one) is closed by ActorDestroy of RemoteLazyInputStreamParent already (via RemoteLazyInputStreamStorage::ForgetStream). I'm not sure why the "original" inputStream(owned by StreamBlobImpl) is not closed when destructing StreamBlobImpl (the destructing happens way earlier then ActorDestroy), I can expose that original inputStream to RemoteLazyInputStreamParent, just that it will involve some amount of interface changes to serialization related functions [1] though.
[1] https://searchfox.org/mozilla-central/source/dom/file/ipc/IPCBlobUtils.cpp#133-134

I also tried using NS_CancelAsyncCopy in ActorDestroy of RemoteLazyInputStreamParent (plus lots of interface changing just to expose copier context), end up no help since nsAStreamCopier::Cancel won't lead to process copier or pipe shutdown....

Flags: needinfo?(shellylin)

(In reply to Shelly Lin [:shelly] from comment #5)

(In reply to Olli Pettay [:smaug] from comment #4)

Currently the inputStream(the cloned one) is closed by ActorDestroy of RemoteLazyInputStreamParent already (via RemoteLazyInputStreamStorage::ForgetStream). I'm not sure why the "original" inputStream(owned by StreamBlobImpl) is not closed when destructing StreamBlobImpl (the destructing happens way earlier then ActorDestroy),

Answering myself, reference of original inputStream is still hold by the pipe.

Flags: needinfo?(bugs)

Depends on D133717

Flags: needinfo?(bugs)
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3ec889d687e4 Close input stream when destructing StreamBlobImpl. r=smaug https://hg.mozilla.org/integration/autoland/rev/14ca6fe48a2b Add test case to reproduce leaks. r=smaug

Backed out for causing mochitest failures on test_bug1742540.html.

Push with failures

Failure log

Backout link

[task 2022-01-11T11:08:28.582Z] 11:08:28     INFO -  TEST-START | dom/file/tests/test_bug1742540.html
[task 2022-01-11T11:13:51.919Z] 11:13:51     INFO -  Buffered messages logged at 11:08:24
[task 2022-01-11T11:13:51.919Z] 11:13:51     INFO -  add_task | Entering test setupPrefs
[task 2022-01-11T11:13:51.919Z] 11:13:51     INFO -  add_task | Leaving test setupPrefs
[task 2022-01-11T11:13:51.920Z] 11:13:51     INFO -  add_task | Entering test send_file_to_serviceworker
[task 2022-01-11T11:13:51.920Z] 11:13:51     INFO -  TEST-PASS | dom/file/tests/test_bug1742540.html | ServiceWorker is activated
[task 2022-01-11T11:13:51.922Z] 11:13:51     INFO -  Buffered messages finished
[task 2022-01-11T11:13:51.922Z] 11:13:51  WARNING -  TEST-UNEXPECTED-FAIL | dom/file/tests/test_bug1742540.html | Test timed out. -
[task 2022-01-11T11:13:51.922Z] 11:13:51  WARNING -  TEST-UNEXPECTED-FAIL | dom/file/tests/test_bug1742540.html | This test left a service worker registered without cleaning it up
[task 2022-01-11T11:13:51.922Z] 11:13:51     INFO -      SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2022-01-11T11:13:51.922Z] 11:13:51     INFO -      afterCleanup@SimpleTest/SimpleTest.js:1436:18
[task 2022-01-11T11:13:51.922Z] 11:13:51  WARNING -  TEST-UNEXPECTED-FAIL | dom/file/tests/test_bug1742540.html | Left over worker: http://mochi.test:8888/tests/dom/file/tests/worker_bug1742540.js (scope: http://mochi.test:8888/tests/dom/file/tests/)
[task 2022-01-11T11:13:51.922Z] 11:13:51     INFO -      SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
[task 2022-01-11T11:13:51.922Z] 11:13:51     INFO -      afterCleanup@SimpleTest/SimpleTest.js:1441:20
[task 2022-01-11T11:13:51.922Z] 11:13:51     INFO -  TEST-OK | dom/file/tests/test_bug1742540.html | took 323501ms
[task 2022-01-11T11:13:51.922Z] 11:13:51  WARNING -  TEST-UNEXPECTED-FAIL | /tests/dom/file/tests/test_bug1742540.html logged result after SimpleTest.finish(): This test left a service worker registered without cleaning it up
[task 2022-01-11T11:13:51.923Z] 11:13:51  WARNING -  TEST-UNEXPECTED-FAIL | /tests/dom/file/tests/test_bug1742540.html logged result after SimpleTest.finish(): Left over worker: http://mochi.test:8888/tests/dom/file/tests/worker_bug1742540.js (scope: http://mochi.test:8888/tests/dom/file/tests/)
[task 2022-01-11T11:13:51.923Z] 11:13:51     INFO -  TEST-START | dom/file/tests/test_file_from_blob.html
Flags: needinfo?(shellylin)
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0d9949fa97e2 Close input stream when destructing StreamBlobImpl. r=smaug https://hg.mozilla.org/integration/autoland/rev/84cae67e60f6 Add test case to reproduce leaks. r=smaug
Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 98 Branch

Seems like we can let this fix ride the trains, but feel free to nominate for uplift if you feel otherwise.

Flags: needinfo?(shellylin) → in-testsuite+
Component: DOM: postMessage → DOM: File
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: