Closed
Bug 1329104
Opened 8 years ago
Closed 8 years ago
Thread leak
Categories
(Core :: Audio/Video: Playback, defect)
Tracking
()
RESOLVED
FIXED
mozilla53
 | Tracking | Status |
---|---|---|
firefox50 | --- | unaffected |
firefox51 | --- | unaffected |
firefox52 | --- | unaffected |
firefox53 | + | fixed |
People
(Reporter: bugzilla.mozilla.org, Assigned: kkoorts)
References
Details
(Keywords: regression, Whiteboard: [MemShrink])
Attachments
(2 files)
Build ID 20170105030229
Nightly seems to be leaking threads in content processes under some circumstances. After 3 hours of browsing with 3 content processes it has accumulated about 3000-4000 threads in each (growing over time).
This leads to large amounts of wasted memory in the form of thread stacks (see attached VMMap screenshot):
4,970.67 MB (100.0%) -- explicit
├──4,291.42 MB (86.33%) ── heap-unclassified
├────265.90 MB (05.35%) -- window-objects
The number of JS compartments is far lower, so those threads probably do not represent workers:
227 (100.0%) -- js-main-runtime-compartments
├──169 (74.45%) -- system
│ ├──156 (68.72%) ++ (156 tiny)
│ ├────9 (03.96%) ── [System Principal], outOfProcessTabChildGlobal [9]
│ └────4 (01.76%) ── [System Principal], Addon-SDK (from: resource://gre/modules/commonjs/toolkit/loader.js:414) [4]
└───58 (25.55%) ++ user
Another curious number in the parent process is
├────218.93 MB (06.87%) -- dom
│ ├──218.11 MB (06.84%) -- memory-file-data
│ │ ├──217.08 MB (06.81%) ── stream [6543]
The number of streams does not add up to the total amount of leaked threads, but it's still suspiciously large. Could the threads belong to some background IO?
Updated•8 years ago
Component: General → DOM
Product: Firefox → Core
I left a restarted browser open for several hours with just 1 tab loaded and threads did not accumulate.
After some regular browsing and addon feature use threads started to accumulate in 2 of 3 content processes again. So maybe a small leak in web pages or an addon is entraining a larger one in the form of threads.
This leak remains even after closing all tabs but a blank one.
6,309.29 MB (100.0%) -- explicit
├──6,180.52 MB (97.96%) ── heap-unclassified
├─────87.45 MB (01.39%) ++ heap-overhead
└─────41.32 MB (00.65%) ++ (18 tiny)
The unclassified amount seems to be consistent with the leaked thread stacks. Would that show up in a DMD build? If so, I'd need a 64-bit Windows DMD build.
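A rough sanity check of that consistency claim, as a sketch only: dividing the unclassified heap by the number of leaked threads gives the average amount of memory each thread would have to account for. The thread counts used here are an assumption carried over from the 3000-4000 range in the original report, not a measurement from this session, and `MegabytesPerThread` is an illustrative name.

```cpp
#include <cassert>

// Average memory attributed to each leaked thread, in MB.
// unclassifiedMB: the heap-unclassified figure from about:memory.
// threadCount:    number of leaked threads (an assumed value here).
double MegabytesPerThread(double unclassifiedMB, int threadCount) {
  assert(threadCount > 0);
  return unclassifiedMB / threadCount;
}
```

With 6,180.52 MB and an assumed 3000 to 4000 threads, this works out to roughly 1.5 to 2.1 MB per thread, which is the per-thread figure any explanation of the leak would need to match.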
Comment 4•8 years ago
Does this reproduce in safe mode? If not, can you try selectively re-enabling add-ons?
Flags: needinfo?(bugzilla.mozilla.org)
Whiteboard: [MemShrink]
I copied my tabs over to a profile without add-ons and it still happens. Is that good enough?
Flags: needinfo?(bugzilla.mozilla.org)
Comment 6•8 years ago
(In reply to The 8472 from comment #5)
> I copied my tabs over to a profile without add-ons and it still happens. Is
> that good enough?
Safe mode will also disable some prefs (graphics mostly I believe), so it's possible something is going on there. Also, if you feel comfortable, it would be *really* helpful to get a list of sites that reproduce this issue.
Flags: needinfo?(bugzilla.mozilla.org)
The issue persists in safe-mode. And no, I don't want to share my tabs.
Flags: needinfo?(bugzilla.mozilla.org)
Comment 8•8 years ago
(In reply to The 8472 from comment #7)
> The issue persists in safe-mode. And no, I don't want to share my tabs.
Can you elaborate on the technologies used in the tabs, e.g. HTML5 video, web workers, etc.?
- no audio
- webm, mp4 and gif are used
- about:debugging shows no workers
- about:memory shows no wasm guard pages, thus no wasm
- none of the opt-in features (camera, screen capture, webrtc) are used
Comment 10•8 years ago
Any chance you have the Visual Studio debugger (or another debugger) available and can run Firefox with the debugger attached so you can view the list of threads with names? Unfortunately, the way thread names are reported on Windows is to raise a specially constructed exception that the debugger listens for and uses to annotate the thread names. Attaching to Firefox after the threads have already spawned is too late for the existing threads, but for an actively growing leak, the new threads should have names. (See http://searchfox.org/mozilla-central/source/nsprpub/pr/src/md/windows/ntthread.c#292 for the implementation likely in use.)
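For readers unfamiliar with the mechanism, here is a minimal sketch of the exception-based naming convention that Microsoft documents for its debuggers. It is not copied from NSPR's ntthread.c, which may differ in detail; `AnnounceThreadName`, `ThreadNameInfo` and `kHasNamingException` are illustrative names, and the code only does real work when built with MSVC.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER)
#include <windows.h>

// Layout expected by the debugger for the naming exception.
#pragma pack(push, 8)
struct ThreadNameInfo {
  DWORD dwType;      // must be 0x1000
  LPCSTR szName;     // thread name, read by the debugger
  DWORD dwThreadID;  // thread ID, or -1 for the calling thread
  DWORD dwFlags;     // reserved, must be zero
};
#pragma pack(pop)
constexpr bool kHasNamingException = true;
#else
constexpr bool kHasNamingException = false;
#endif

// Exception code that debuggers interpret as "here is a thread name".
constexpr std::uint32_t kThreadNameException = 0x406D1388;

// Raises the naming exception for the calling thread. Returns false on
// builds where the mechanism is unavailable. A debugger only learns the
// name if it is attached at the moment the exception is raised.
bool AnnounceThreadName(const char* name) {
  assert(name != nullptr);
#if defined(_MSC_VER)
  ThreadNameInfo info = {0x1000, name, static_cast<DWORD>(-1), 0};
  __try {
    RaiseException(kThreadNameException, 0,
                   sizeof(info) / sizeof(ULONG_PTR),
                   reinterpret_cast<const ULONG_PTR*>(&info));
  } __except (EXCEPTION_EXECUTE_HANDLER) {
    // No debugger consumed the exception; the name is simply lost.
  }
  return true;
#else
  (void)name;
  return false;
#endif
}
```

Because the name travels in a one-shot exception rather than being stored with the thread, a debugger attached later has nothing to read, which is why only threads created after attaching can show names.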
Reporter
Comment 11•8 years ago
I have windbg installed
Comment 12•8 years ago
Okay, so I don't want to pretend I'm a Windows super-debugging expert, but I was able to do this and it seems promising:
- Firefox can already be running!
- Run WinDbg. I used the x86 version because I've got a 32-bit Nightly, but I assume/hope x64 works fine.
- Use about:memory to determine the PID of the content process I'm interested in.
- Use "File... Attach to process (F6)" to locate the PID and attach.
- This suspends the process.
- Type "g" and hit return to cause the process to resume execution.
- Wait a bit for stuff to happen/leak.
- Use "Debug... Break" or press Ctrl-Break in the debugger window to suspend the debuggee.
- Type "~" and hit return. I see names for the new threads. In my case, "EncodingRunnable #1" and "DOM Worker". I've also seen stream transport threads.
- Type "g" to resume afterwards until you want to Ctrl-break again.
From https://developer.mozilla.org/en-US/docs/Mozilla/How_to_get_a_stacktrace_with_WinDbg it looks like there are commands to log the debugger output to disk, but I would hope/presume it should be fairly obvious from scrolling through the list of threads what went wrong.
Reporter
Comment 13•8 years ago
It does not show thread names for me, but after fetching symbols I was able to get thread stacks:
2560 Id: 1b290.1feb4 Suspend: 1 Teb: 000000e0`18148000 Unfrozen
Child-SP RetAddr Call Site
000000e0`29dff6b8 00007ff9`146a75ff ntdll!NtWaitForSingleObject+0x14
000000e0`29dff6c0 00007ff8`c8bf8d60 KERNELBASE!WaitForSingleObjectEx+0x8f
000000e0`29dff760 00007ff9`14b2cab0 xul!thread_decoding_proc(void * p_data = <Value unavailable error>)+0x44 [c:\builds\moz2_slave\m-cen-w64-ntly-000000000000000\build\src\media\libvpx\vp8\decoder\threading.c @ 637]
000000e0`29dff7b0 00007ff9`17498364 ucrtbase!o__realloc_base+0x60
000000e0`29dff7e0 00007ff9`177070d1 KERNEL32!BaseThreadInitThunk+0x14
000000e0`29dff810 00000000`00000000 ntdll!RtlUserThreadStart+0x21
2566 Id: 1b290.1a7c0 Suspend: 1 Teb: 000000e0`180d0000 Unfrozen
Child-SP RetAddr Call Site
000000e0`225ff928 00007ff9`146a75ff ntdll!NtWaitForSingleObject+0x14
000000e0`225ff930 00007ff8`c8bf8d60 KERNELBASE!WaitForSingleObjectEx+0x8f
000000e0`225ff9d0 00007ff9`14b2cab0 xul!thread_decoding_proc(void * p_data = <Value unavailable error>)+0x44 [c:\builds\moz2_slave\m-cen-w64-ntly-000000000000000\build\src\media\libvpx\vp8\decoder\threading.c @ 637]
000000e0`225ffa20 00007ff9`17498364 ucrtbase!o__realloc_base+0x60
000000e0`225ffa50 00007ff9`177070d1 KERNEL32!BaseThreadInitThunk+0x14
000000e0`225ffa80 00000000`00000000 ntdll!RtlUserThreadStart+0x21
2567 Id: 1b290.1dfe4 Suspend: 1 Teb: 000000e0`180d2000 Unfrozen
Child-SP RetAddr Call Site
000000e0`227ffc08 00007ff9`146a75ff ntdll!NtWaitForSingleObject+0x14
000000e0`227ffc10 00007ff8`c8bf8d60 KERNELBASE!WaitForSingleObjectEx+0x8f
000000e0`227ffcb0 00007ff9`14b2cab0 xul!thread_decoding_proc(void * p_data = <Value unavailable error>)+0x44 [c:\builds\moz2_slave\m-cen-w64-ntly-000000000000000\build\src\media\libvpx\vp8\decoder\threading.c @ 637]
000000e0`227ffd00 00007ff9`17498364 ucrtbase!o__realloc_base+0x60
000000e0`227ffd30 00007ff9`177070d1 KERNEL32!BaseThreadInitThunk+0x14
000000e0`227ffd60 00000000`00000000 ntdll!RtlUserThreadStart+0x21
2581 Id: 1b290.1f870 Suspend: 1 Teb: 000000e0`18134000 Unfrozen
Child-SP RetAddr Call Site
000000e0`289ffc68 00007ff9`146a75ff ntdll!NtWaitForSingleObject+0x14
000000e0`289ffc70 00007ff8`c8bf8d60 KERNELBASE!WaitForSingleObjectEx+0x8f
000000e0`289ffd10 00007ff9`14b2cab0 xul!thread_decoding_proc(void * p_data = <Value unavailable error>)+0x44 [c:\builds\moz2_slave\m-cen-w64-ntly-000000000000000\build\src\media\libvpx\vp8\decoder\threading.c @ 637]
000000e0`289ffd60 00007ff9`17498364 ucrtbase!o__realloc_base+0x60
000000e0`289ffd90 00007ff9`177070d1 KERNEL32!BaseThreadInitThunk+0x14
000000e0`289ffdc0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
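With thousands of threads, scrolling through stacks by hand gets tedious. As a sketch of one way to summarize a saved debugger log, the helper below tallies every symbol that starts with a given module prefix. `CountFramesWithPrefix` is a hypothetical helper written for this illustration, not a WinDbg feature, and it assumes the log uses the frame format shown above.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Tallies symbols that start with `prefix` (for example "xul!") across all
// lines of a debugger log. A symbol ends at the first '(', '+' or space.
std::map<std::string, int> CountFramesWithPrefix(const std::string& log,
                                                 const std::string& prefix) {
  assert(!prefix.empty());
  std::map<std::string, int> counts;
  std::istringstream input(log);
  std::string line;
  while (std::getline(input, line)) {
    const std::size_t start = line.find(prefix);
    if (start == std::string::npos) continue;  // no matching frame here
    const std::size_t end = line.find_first_of("(+ ", start);
    const std::size_t length =
        end == std::string::npos ? std::string::npos : end - start;
    ++counts[line.substr(start, length)];
  }
  return counts;
}
```

Run over the four stacks above with the prefix "xul!", the only key would be `xul!thread_decoding_proc`, which points at the libvpx VP8 decoder worker threads.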
Comment 14•8 years ago
How long does it take for this to reproduce for you? Any chance you could use mozregression to narrow down when the problem started? I believe it has options to use an existing profile, etc.
http://mozilla.github.io/mozregression/
Reporter
Comment 15•8 years ago
2017-01-14T02:35:44: INFO : Narrowed inbound regression window from [7ce2094b, ce672399] (4 revisions) to [073d993c, ce672399] (2 revisions) (~1 steps left)
2017-01-14T02:35:44: DEBUG : Starting merge handling...
2017-01-14T02:35:44: DEBUG : Using url: https://hg.mozilla.org/integration/autoland/json-pushes?changeset=ce67239948a0319df63deef8fccaf023731f6a29&full=1
2017-01-14T02:35:45: DEBUG : Found commit message:
Bug 1321076 - In the case of alpha, VPXDecoder uses overloaded CreateAndCopy that takes alpha plane. r=jya
MozReview-Commit-ID: AIJxPRjGvrg
2017-01-14T02:35:45: INFO : The bisection is done.
2017-01-14T02:35:45: INFO : Stopped
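The narrowing from 4 revisions to 2 in the log reflects an ordinary bisection. The following is a generic sketch of that idea, not mozregression's actual implementation; `FirstBadRevision` and the `isBad` callback are illustrative, and the sketch assumes revision 0 is known good, the last revision is known bad, and the bug never disappears once introduced.

```cpp
#include <cassert>
#include <functional>

// Returns the index of the first bad revision in [0, count), given that
// revision 0 is known good and revision count-1 is known bad.
int FirstBadRevision(int count, const std::function<bool(int)>& isBad) {
  assert(count >= 2);
  int good = 0;          // newest revision known to be good
  int bad = count - 1;   // oldest revision known to be bad
  while (bad - good > 1) {
    const int mid = good + (bad - good) / 2;
    if (isBad(mid)) {
      bad = mid;         // regression is at mid or earlier
    } else {
      good = mid;        // regression is after mid
    }
  }
  return bad;
}
```

Each test halves the window, so even a range of thousands of pushes needs only a dozen or so builds to be tried by hand.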
Comment 16•8 years ago
It would appear bug 1321076 added a decoder, |mVPXAlpha|, but doesn't clean it up [1].
[1] http://searchfox.org/mozilla-central/rev/0aed9484bd3e97206fd1949ee4a4992ef300a81f/dom/media/platforms/agnostic/VPXDecoder.cpp#87-91
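To make the failure mode concrete, here is a self-contained sketch of a decoder that owns two codec contexts but only tears down one of them. It deliberately does not use libvpx: `FakeCodecContext`, `FakeCodecInit`, `FakeCodecDestroy`, `SketchDecoder` and `gLiveContexts` are all stand-ins invented for this illustration, with a counter playing the role of the resources (such as worker threads) that a real context would hold.

```cpp
#include <cassert>

// Stand-in for a codec context that holds resources once initialized.
struct FakeCodecContext {
  bool initialized = false;
};

// Number of contexts currently holding resources.
int gLiveContexts = 0;

void FakeCodecInit(FakeCodecContext& ctx) {
  assert(!ctx.initialized);
  ctx.initialized = true;
  ++gLiveContexts;
}

// Releases the context. Returns false and does nothing if the context was
// never initialized, so callers may invoke it unconditionally.
bool FakeCodecDestroy(FakeCodecContext& ctx) {
  if (!ctx.initialized) return false;
  ctx.initialized = false;
  --gLiveContexts;
  return true;
}

class SketchDecoder {
 public:
  explicit SketchDecoder(bool hasAlpha) {
    FakeCodecInit(mColor);
    if (hasAlpha) FakeCodecInit(mAlpha);
  }

  // Leaky variant: the alpha context is forgotten.
  void ShutdownColorOnly() { FakeCodecDestroy(mColor); }

  // Fixed variant: both contexts are released; the return values are
  // ignored, so no guard on the alpha context is needed in this sketch.
  void ShutdownAll() {
    FakeCodecDestroy(mColor);
    FakeCodecDestroy(mAlpha);
  }

 private:
  FakeCodecContext mColor;
  FakeCodecContext mAlpha;
};
```

In this model, every alpha video shut down through the leaky path leaves one live context behind, which matches the pattern of resources accumulating as more WebM content is played.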
Blocks: 1321076
Component: DOM → Audio/Video: Playback
Flags: needinfo?(amarchesini) → needinfo?(kkoorts)
Comment 17•8 years ago
[Tracking Requested - why for this release]:
Memory leak regression impacting UX.
tracking-firefox53:
--- → ?
Keywords: regression
Comment 18•8 years ago
By the way, thank you for your help and persistence tracking this down!
Reporter
Comment 19•8 years ago
I think another issue is that thread stacks only show up as heap-unclassified. If about:memory had provided more information this would have been easier to track down.
Comment hidden (mozreview-request)
Comment 21•8 years ago
mozreview-review
Comment on attachment 8827282 [details]
Bug 1329104 - Shutdown context used for WebM alpha decoding.
https://reviewboard.mozilla.org/r/104996/#review105786
::: dom/media/platforms/agnostic/VPXDecoder.cpp:91
(Diff revision 1)
>
> void
> VPXDecoder::Shutdown()
> {
> vpx_codec_destroy(&mVPX);
> + if (mInfo.HasAlpha()) {
Seeing that we don't test the return values, the test appears unnecessary.
Attachment #8827282 -
Flags: review?(jyavenard) → review+
Comment hidden (mozreview-request)
Assignee
Comment 24•8 years ago
mozreview-review
Comment on attachment 8827282 [details]
Bug 1329104 - Shutdown context used for WebM alpha decoding.
https://reviewboard.mozilla.org/r/104996/#review106018
Flags: needinfo?(kkoorts)
Keywords: checkin-needed
Updated•8 years ago
status-firefox50:
--- → unaffected
status-firefox51:
--- → unaffected
status-firefox52:
--- → unaffected
Comment 25•8 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/5d4f34a2196c
Shutdown context used for WebM alpha decoding. r=jya
Keywords: checkin-needed
Comment 26•8 years ago
bugherder
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Updated•8 years ago
Assignee: nobody → kkoorts