Closed
Bug 1420608
Opened 7 years ago
Closed 7 years ago
very high CPU / GPU / power use on page with mp4 videos (all playing at the same time, autoplay)
Categories
(Core :: Audio/Video: Playback, defect, P3)
Core
Audio/Video: Playback
Tracking
()
RESOLVED
FIXED
mozilla59
People
(Reporter: jkleverson77, Assigned: jwwang)
References
Details
(Keywords: power)
Attachments
(13 files)
189.82 KB,
image/jpeg
|
Details | |
2.87 MB,
application/x-7z-compressed
|
Details | |
770.84 KB,
text/csv
|
Details | |
125.65 KB,
image/jpeg
|
Details | |
219.62 KB,
image/jpeg
|
Details | |
96.22 KB,
image/jpeg
|
Details | |
233.54 KB,
image/jpeg
|
Details | |
100.48 KB,
image/jpeg
|
Details | |
93.51 KB,
image/jpeg
|
Details | |
59 bytes,
text/x-review-board-request
|
mozbugz
:
review+
|
Details |
59 bytes,
text/x-review-board-request
|
alwu
:
review+
|
Details |
108.44 KB,
image/jpeg
|
Details | |
7.76 KB,
patch
|
jwwang
:
review+
gchang
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0 Build ID: 20171125100325 Steps to reproduce: - create a new profile, exit - start Firefox in the previous profile and visit https://www.jwz.org/blog/2017/10/anime-floppy-disks/ - scroll down to verify that all the videos are (automatically) playing - scroll to the end of the page, verify that the last video is playing - open task manager, observe high CPU - scroll up and down on the page to begin and end, wait in between - the CPU doubles again and stays at that use Actual results: - 45% total CPU use at 3.15 GHz on i7-3630QM CPU (4 cores / 8 threads) Windows 8.1 Expected results: - 40 times less CPU is needed! Proof: the same page on IE 11: 3% of total CPU use on 1.15 GHz speed on the same machine: that is 40 times less CPU constantly used! Attached picture shows both result: foreground: Firefox Nightly, background: IE 11, same page loaded, videos automatically playing in both cases
The Local copy of the https://www.jwz.org/blog/2017/10/anime-floppy-disks/ page, in case the page content changes on the site. The page was saved using Save as / Web page complete. I can reproduce the behavior described opening this local html.
Summary: 40 times very high CPU use on page with mp4 videos → 40 times very high CPU use on page with mp4 videos (all playing at the same time, autoplay)
(The picture attached to the main report is a Photoshop composite of two independent runs, only one browser was running for each of the measurements, no other pages were loaded during the measurement).
The difference between the period when the CPU is cca 25% and when the CPU is cca 50% is that in the former no new threads are constantly created, but once the CPU jumps the constant thread creation and destruction inside of a Firefox.exe process can be observed. Attached the log file from PROCMON where the thread creation is recorded (saved as a csv file). More than 8 threads are created and destroyed per second in that cca 50% phase.
Comment on attachment 8931864 [details]
composite-screenshot-FF-high-CPU-vs-IE-low.jpg
The composite file: foreground: the screenshot of Firefox nightly using 45% CPU in Turbo mode (3.15 GHz). Background: the screenshot of IE 11 using 3% in minimal mode (1.15 GHz). The measurements were made independently and then were combined in a single image for easy comparison.
Attachment #8931864 -
Attachment description: 40 times more constant CPU use than IE 11 on the same 4 core / 8 threads computer → composite-screenshot-FF-high-CPU-vs-IE-low.jpg
I've tried to figure out some stack traces. I don't know how it comes there, but maybe the thread creation happens because of this? Something creates: class SupportsConfigEvent : public Runnable { public: SupportsConfigEvent(DXVA2Manager* aDXVA2Manager, IMFMediaType* aMediaType, float aFramerate) : Runnable("SupportsConfigEvent") , mDXVA2Manager(aDXVA2Manager) , mMediaType(aMediaType) , mFramerate(aFramerate) , mSupportsConfig(false) { } NS_IMETHOD Run() { MOZ_ASSERT(NS_IsMainThread(), "Must be on main thread."); mSupportsConfig = mDXVA2Manager->SupportsConfig(mMediaType, mFramerate); return NS_OK; } which in turn runs: CanCreateDecoder(desc, aFramerate); which does: RefPtr<IDirectXVideoDecoder> decoder = CreateDecoder(aDesc); return decoder.get() != nullptr; It seems that it's that, as I see a lot of threads starting somewhere in msmpeg2vdec.dll which is probably the result of the CreateDecoder call. Note it's not the whole story about the slowdowns, only the second phase. Even if this specific scenario is fixed, the CPU is much, much higher than in IE in the first phase before that process creation in the loop starts (before the CPU jumps to 45% on my machine).
The number of threads Firefox created in every second, made by processing the high-cpu-procmon-log.CSV and roughly compared with the observed CPU use from other sampling. The graphs are not made during the same sampling, there were two runs, so 1 to 1 correspondence of the seconds doesn't exist, but the two distinct periods could be recognized (roughly 25% and roughly 50% CPU usage).
CPU uses as drawn by Process Explorer (clicking on the CPU graph) in two phases when Nightly shows https://www.jwz.org/blog/2017/10/anime-floppy-disks/ First phase: no new threads are created, Second: continuous creation of new threads
Finally, the same CPU graph (including context switches) as in nightly-ctx-switches.jpg, but for the reference case of IE 11. Note that the CPU speed there remains very low compared to both Nightly cases, and compare also the number of contexts switches and interrupts to the nightly-ctx-switches.jpg.
Even more statistics, quite telling, IMHO: CPU Power draw, measured using "Intel® Power Gadget 3.0": IE 11 reference: 6.3 W Nightly Phase 1: 17.6 W (almost 3 times than IE 11) Phase 2: 26.5 W (more than 4 times than IE 11) GPU Load, measured using TechPowerUp GPU-Z 2.5.0: IE 11 reference: 5-6% when on videos, 0% (!) on the bottom of the page Nightly: Phase 1: 15-18% when on videos, 7% on the bottom of the page Phase 2: (not measured) 10-11% (!) on the bottom of the page
Reporter | ||
Comment 10•7 years ago
|
||
Wall plug power draw of the notebook, while battery removed: Writing this with Nightly in https://bugzilla.mozilla.org/show_bug.cgi?id=1420608 : 20 W Showing https://www.jwz.org/blog/2017/10/anime-floppy-disks/ : IE 11 reference: 27 W 35% more than idle Nightly Phase 1: 40 W cca 50% more than IE 11 for the same page, or 2 X more than idle Nightly Phase 2: 53 W cca 2 X more than IE 11 for the same page, or 2.6 X more than idle
Updated•7 years ago
|
Reporter | ||
Comment 11•7 years ago
|
||
I've also tried to see a kind of maximum wall plug draw: I've opened 6 tabs in Nightly, all with the jwz page: the draw reached a maximum of 68 W but didn't keep it long, probably the power or temperature management scaled it down to 64 W. Then, to compare, I've opened 6 tabs of the same page in IE 11, it drew on average 33 W. So with more tabs open with the pages with more videos, Nightly still sucks up the battery twice as fast! Moreover, the power draw above the idle is even worse than what the Windows Task Manager displays as values. Specifically, if the maximum draw is around 68 W and idle draw is 20 W, then is Nightly Phase 1 (40W) is around 40 percent of the span between 20W and 68W. The phase 2 (53W) is then around 60% of the span between 20 and 68W. I've seen some bug report where somebody claimed that the task manager isn't correct, but in this example task manager is even optimistic, compared to the actual power draw: the corresponding values were around 23% and 45%, so definitely not too high, when comparing with the total power draw. But this also means that I shouldn't have used CPU speed from it to additionally scale the estimates, that is, now I know that it shows reasonable values without the need for the additional scaling.
Summary: 40 times very high CPU use on page with mp4 videos (all playing at the same time, autoplay) → very high CPU / power use on page with mp4 videos (all playing at the same time, autoplay)
Summary: very high CPU / power use on page with mp4 videos (all playing at the same time, autoplay) → very high CPU / GPU / power use on page with mp4 videos (all playing at the same time, autoplay)
The thread creation is probably due to creating a new decoder for each loop of each video. I tried a couple of experiments: 1. By setting media.wmf.dxva.max-videos=999 I got all the videos to be hardware decoded. The page still used a lot of CPU. 2. Pausing all of the videos by right-clicking. This reduces the CPU usage significantly.
Priority: -- → P3
Updated•7 years ago
|
Flags: needinfo?(ajones)
Reporter | ||
Comment 13•7 years ago
|
||
I've extracted the corresponding lengths of all videos in that page: 02.07 03.07 01.63 02.97 01.97 01.17 01.17 01.90 02.40 00.27 01.17 02.57 01.83 01.43 02.90 04.57 And all number of frames: 62 92 49 89 59 35 35 57 72 8 35 77 55 43 87 137 There are 16 of videos inside, the longest is less than 4.57 seconds. All are 30 fps effectively. There is no sound in any of them physically. Moreover I believe all are marked in HTML Here is a number of the threads created in every second in the example for which I've made a graph: atSecond, numbers of threads created: 009 58 threads created 010-068 -- no threads created 069 6 070 14 071 16 072 15 073 14 074 41 087 6 088 46 0 - 2 per sec 129-154 -- no threads created 154 4 155 38 156 36 157 38 158 38 159 41 160 41 161 42 162 41 163 42 164 43 165 43 166 39 167 33 168 42 169 36 170 34 171 36 172 41 173 39 174 43 175 34 176 40 177 45 178 45 and then continues around 45 per second. Why is creating 45 of threads *per second* needed to loop 16 same videos on the page on which nothing else changes? This computer on which I've measured is reasonably fast. I've observed a user scrolling his Facebook feed (with, of course, videos in the page, starting one after another) using Firefox on a less powerful computer and Firefox becoming not responsive after not too much scrolling. I assume the same effect happens as this one I've isolated here. The "constant thread creation" phase is obviously triggered at some point and it's obviously too much. But it's not the only problem, the amount of work (and the power used!) needed to play the videos even without that thread creation is still double compared with the old IE 11. That it at least twice as less time on the battery --- especially for people who open more tabs. A "normal" user doesn't have ad blockers or video blockers. He opens more pages in more tabs. If the videos are playing (ads and whatever), is not going to search them on every page and every tab to pause them. Just having open Firefox with the open tabs will drain his battery. I've observed the problems with videos "from the distance" for at least many months, and now after a lot of progress in development, and still seeing it... Anthony, did I understand correctly: you can reproduce what I described? Shouldn't the bug then be marked "New"? I see "needinfo"? Can I help more?
Reporter | ||
Comment 14•7 years ago
|
||
I've set media.wmf.dxva.max-videos=999 too. The results: Nightly, after opening the page (before the "constant thread creation phase"): 14% CPU use (2.4 GHz), 12% GPU load IE 11 is still: 6% CPU (1.15 Ghz), 5% CPU load
Reporter | ||
Comment 15•7 years ago
|
||
As I've posted before (with the code quoted, see above) it appears to me that there's some call which results in the creation of the decoder which is immediately discarded afterwards, because the call only asks "can it be created" and for that purpose the whole decoder is created, the bool value true is remembered to be returned, and then the decoder is discarded. Strangely, it seems something has to happen to trigger that phase, as there is a period without the creation of the new decoders. Interestingly around 45 threads per second created is close to three times the number of videos on the page (3 * 16 = 48).
Reporter | ||
Comment 16•7 years ago
|
||
The two big issues here are, as I see it: 1) there is some *trigger* which initiates the loop where 45 threads are created and destroyed *per second* (observable using Sysinternals Process Monitor), the CPU/GPU/electricity use of course jumps again and the number of CPU interrupts per second raises 7 times (as seen in nightly-ctx-switches.jpg). 2) even before the "constant creation phase" occurs, a lot of CPU/GPU/electricity is needed compared to IE. As I've said, both look similar to what I've observed "over the shoulder" by a normal user using Facebook in which autostart videos occur. It seems that when the 1) happens the Firefox gets even unresponsive, helped by a lot of processing needed for 2) anyway in the background tabs (or maybe are even other tabs already in that phase, I don't know). In short, I believe these two issues affect very common experience ("i.e. a normal user simply browsing the Facebook and opening a few more pages in tabs, everything with videos") and aren't some specially constructed case. The only thing that's special is is that I've managed to locate a single page where I can repeat it, without having to give somebody's personal Facebook login to demonstrate the issues.
(In reply to Johnny Kl from comment #13) > The "constant thread creation" phase is obviously triggered at some point > and it's obviously too much. Each video is decoded by Windows Media Foundation using 7 threads. Each loop of the video will restart all of those threads. That seems avoidable, especially given that we have seamless looping on audio. > But it's not the only problem, the amount of > work (and the power used!) needed to play the videos even without that > thread creation is still double compared with the old IE 11. That it at > least twice as less time on the battery --- especially for people who open > more tabs. The decoders shut down when you open new tabs. However with setting media.use-blank-decoder=true it still uses a lot of CPU. > A "normal" user doesn't have ad blockers or video blockers. He > opens more pages in more tabs. If the videos are playing (ads and whatever), > is not going to search them on every page and every tab to pause them. Just > having open Firefox with the open tabs will drain his battery. I've observed > the problems with videos "from the distance" for at least many months, and > now after a lot of progress in development, and still seeing it... In general you can't fully understand the impact of an issue until you know the cause. We don't know the cause yet. It is intersting to note that Chrome uses a similar amount of CPU.
Flags: needinfo?(ajones)
Updated•7 years ago
|
Flags: needinfo?(ajones)
Reporter | ||
Comment 18•7 years ago
|
||
(In reply to Anthony Jones from comment #17) > It is intersting to note that Chrome uses a similar amount of CPU. I don't agree! I've tried Chromium which can't play the videos at all, but I've installed the latest Opera which probably has the same video code as Chrome, and I can easily show you that where Nightly uses 53% CPU, Opera uses 12% and IE 11 only 6%. See the new attachment and its comment which will immediately follow this reply. Thank you.
Reporter | ||
Comment 19•7 years ago
|
||
The case, reduced to the most obvious procedure to demonstrate, only then check the other images for more details. It's how two *background* tabs with videos playing (the default behaviour) affect the CPU much worse than Opera (based on Chromium) or IE 11: - on Windows OS - copy the URL https://www.jwz.org/blog/2017/10/anime-floppy-disks/ to the clipboard - open Task Manager / Performance - open Nightly, paste the URL once, wait for all videos loaded and playing - keeping this tab open, open a New tab, paste the URL to in that new tan, wait for all videos loaded and playing - open a New empty tab, keeping both tabs with videos playing in the background - wait 15 seconds - check Task Manager / Performance: note CPU is 53% - In the Process Monitor, making sure that only "Process and Thread activity" is on, more than 40 threads are created and closed per second - the same but opening Opera (close Nightly before!): CPU is 12% - the same but opening IE 11: CPU is 6% - in the attached two-pages-backgr.jpg all three cases compared, each time starting from no application running. That is, once again: Firefox Nightly 53%, Opera 12%, IE11 6%, or if we normalize to IE 11, the factors are Firefox 8.8x Opera 2x, IE11 1. It's a huge difference. Anthony I understood you can also see the same behavior? Shouldn't the bug then be considered confirmed, if you can repeat it, especially in the form I describe now? Thank you.
Reporter | ||
Comment 20•7 years ago
|
||
(In reply to Anthony Jones from comment #17) > Each video is decoded by Windows Media Foundation using 7 threads. Each loop of the video will restart all of those threads. It doesn't appear to be what I observe. The longest video is less than 5 seconds long, the shortest just 0.27 seconds. If your claim would hold, at least every 5 seconds the new threads should be created. But it I just open a single tab with all the videos looping, and if I don't scroll down, I can observe for minutes no new threads being created. Only some additional action by me will move Firefox to the "thread creation phase" as I've described in all the comments before Comment 19. > The decoders shut down when you open new tabs. Can you please explain which action you believe would shut them down? See Comment 19: what I observe is: I open one tab with videos, leave it opened, then open a new tab with the videos, leave it opened, then open a new, third tab, keep it empty, wait just some seconds and then I can see the threads being created and closed, many of them per second. The tabs in the background (that is, not visible) produce 53% of CPU use, whereas Opera (Chromium based) in the same scenario produces 12% of CPU use and IE11 only 6%. From what I observe, Firefox at this moment handles videos in the background tabs many times worse than any competition. But note that once the "phase with the thread creation" starts (which doesn't happen immediately after the only tab with videos is opened) it's always visibly worse than the competition, even if it's only tab in the browser. Please use the Process Monitor https://docs.microsoft.com/en-us/sysinternals/downloads/procmon to confirm that these two different phases do exist (at best while keeping the Task Manager / Performance opened, to see the difference in the CPU use in the corresponding phases -- the one where the threads are created uses obviously more CPU, and in the phase where no threads are created (single tab) the videos are still playing and looping) and let me know if you need help observing what I've described.
Reporter | ||
Comment 21•7 years ago
|
||
(In reply to Anthony Jones from comment #17) > Each video is decoded by Windows Media Foundation using 7 threads. Each loop of the video will restart all of those threads Here is the video I've recorded. For minutes no new threads are created, I've uploaded one minute total: the last 30 "quiet" seconds before I do "something" (scroll the page down, move the mouse over something) when the "thread creation phase" starts, using more CPU (the next 30 seconds): https://youtu.be/07ZBfXliY6Y
Reporter | ||
Comment 22•7 years ago
|
||
Using windbg, I can see that in the "thread creation phase" (see how the phase starts from 30th second of the video from the Comment 21 ) the H264Converter::CreateDecoder is called all the time, whereas in the "no rapid thread creation" phase (as seen in the first 30 seconds of the video from the Comment 21 ) there are effectively no calls to that. I also see that these calls result in the new threads created. I hope that's a good enough hint for the developers to search for the cause.
Reporter | ||
Comment 23•7 years ago
|
||
In case it's not so easy for others to achieve this, this is the full call stack that creates new threads for new decoders in the "thread creation phase." I'm otherwise not familiar with Firefox design, so I have no idea why it comes in that "unstable" situation where all the creations repeat too much too often: Call Site xul!mozilla::MFTDecoder::SendMFTMessage+0x22 [z:\build\build\src\dom\media\platforms\wmf\mftdecoder.cpp @ 179] xul!mozilla::MFTDecoder::SetMediaTypes+0x98 [z:\build\build\src\dom\media\platforms\wmf\mftdecoder.cpp @ 105] xul!mozilla::WMFVideoMFTManager::SetDecoderMediaTypes+0x22d [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 836] xul!mozilla::WMFVideoMFTManager::InitInternal+0x342 [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 732] xul!mozilla::WMFVideoMFTManager::Init+0x43 [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 625] xul!mozilla::WMFDecoderModule::CreateVideoDecoder+0x8d [z:\build\build\src\dom\media\platforms\wmf\wmfdecodermodule.cpp @ 109] xul!mozilla::H264Converter::CreateDecoder+0x181 [z:\build\build\src\dom\media\platforms\wrappers\h264converter.cpp @ 273] xul!mozilla::H264Converter::H264Converter+0x1b0 [z:\build\build\src\dom\media\platforms\wrappers\h264converter.cpp @ 37] xul!mozilla::PDMFactory::CreateDecoderWithPDM+0x19c [z:\build\build\src\dom\media\platforms\pdmfactory.cpp @ 292] xul!mozilla::PDMFactory::CreateDecoder+0xbe [z:\build\build\src\dom\media\platforms\pdmfactory.cpp @ 235] xul!mozilla::MediaFormatReader::DecoderFactory::DoCreateDecoder+0x1c3 [z:\build\build\src\dom\media\mediaformatreader.cpp @ 774] xul!mozilla::MediaFormatReader::DecoderFactory::RunStage+0x42 [z:\build\build\src\dom\media\mediaformatreader.cpp @ 713] xul!<lambda_15decc4741528b208ebb37c365c9455f>::operator()+0x4e [z:\build\build\src\dom\media\mediaformatreader.cpp @ 692] xul!mozilla::MozPromise<RefPtr<mozilla::GlobalAllocPolicy::Token>,bool,1>::InvokeMethod+0x1a [z:\build\build\src\obj-firefox\dist\include\mozilla\mozpromise.h @ 520] xul!mozilla::MozPromise<RefPtr<mozilla::GlobalAllocPolicy::Token>,bool,1>::InvokeCallbackMethod+0x41 [z:\build\build\src\obj-firefox\dist\include\mozilla\mozpromise.h @ 565] xul!mozilla::MozPromise<RefPtr<mozilla::GlobalAllocPolicy::Token>,bool,1>::ThenValue<<lambda_15decc4741528b208ebb37c365c9455f>,<lambda_71d8e683077d8a4f5eb73812268d8b57> >::DoResolveOrRejectInternal+0x58 [z:\build\build\src\obj-firefox\dist\include\mozilla\mozpromise.h @ 770] xul!mozilla::MozPromise<mozilla::dom::U2FSignResult,enum nsresult,1>::ThenValueBase::DoResolveOrReject+0x80 [z:\build\build\src\obj-firefox\dist\include\mozilla\mozpromise.h @ 497] xul!mozilla::MozPromise<mozilla::dom::U2FRegisterResult,enum nsresult,1>::ThenValueBase::ResolveOrRejectRunnable::Run+0x51 [z:\build\build\src\obj-firefox\dist\include\mozilla\mozpromise.h @ 403] xul!mozilla::AutoTaskDispatcher::TaskGroupRunnable::Run+0x54 [z:\build\build\src\obj-firefox\dist\include\mozilla\taskdispatcher.h @ 217] xul!mozilla::TaskQueue::Runner::Run+0xea [z:\build\build\src\xpcom\threads\taskqueue.cpp @ 244] xul!nsThreadPool::Run+0x328 [z:\build\build\src\xpcom\threads\nsthreadpool.cpp @ 230] xul!nsThread::ProcessNextEvent+0xe16 [z:\build\build\src\xpcom\threads\nsthread.cpp @ 1034] xul!NS_ProcessNextEvent+0x26 [z:\build\build\src\xpcom\threads\nsthreadutils.cpp @ 508] xul!mozilla::ipc::MessagePumpForNonMainThreads::Run+0xdd [z:\build\build\src\ipc\glue\messagepump.cpp @ 365] xul!MessageLoop::RunHandler+0x1b [z:\build\build\src\ipc\chromium\src\base\message_loop.cc @ 320] xul!MessageLoop::Run+0x3e [z:\build\build\src\ipc\chromium\src\base\message_loop.cc @ 300] xul!nsThread::ThreadFunc+0xe4 [z:\build\build\src\xpcom\threads\nsthread.cpp @ 425] nss3!_PR_NativeRunThread+0xee [z:\build\build\src\nsprpub\pr\src\threads\combined\pruthr.c @ 419] nss3!pr_root+0xa [z:\build\build\src\nsprpub\pr\src\md\windows\w95thred.c @ 96] ucrtbase!thread_start<unsigned int (__cdecl*)(void * __ptr64)>+0x35 KERNEL32!BaseThreadInitThunk+0x22
(In reply to Johnny Kl from comment #19) > Anthony I understood you can also see the same behavior? Shouldn't the bug > then be considered confirmed, if you can repeat it, especially in the form I > describe now? The unconfirmed/new workflow isn't used in this component but if it makes you happy I'll do that. (In reply to Johnny Kl from comment #23) > xul!mozilla::WMFVideoMFTManager::Init+0x43 > [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 625] This is where the decoder is initialised, which supports my previous statement about WMF. The thread creation itself seems like a red herring given that the performance of the blank decoder is still poor.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Lets try that again.
Status: RESOLVED → REOPENED
Ever confirmed: true
Flags: needinfo?(ajones)
Resolution: FIXED → ---
Reporter | ||
Comment 26•7 years ago
|
||
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #24) > The thread creation itself seems like a red herring > given that the performance of the blank decoder is still poor. No. Firefox can obviously use the existing decoders for loops, and then something can trigger the creation of many of the decoders per second. If you please see my video, it's easy to see: there is a phase where no decoders are created all the time, before I scroll down the page and hoover over something to make that "rapid thread / decoder creation" start can last as much as you want. I know I've shown thread creations and not decoders in the video I've posted. But here is the log of the times at which the H264Converter::H264Converter is called, taken automatically with windbg. Since that's the only constructor of that class, I guarantee there were no creations of that object before I scroll the page, and the vidoes loop at that time just as in the video I've recorded. The format is a counter of H264Converter created, space, minute since the start of a program : second . fractionOfSecond: 01 2:53.365 02 2:53.372 03 2:53.381 04 2:53.395 05 2:53.486 06 2:53.494 07 2:53.506 08 2:53.518 09 2:53.564 10 2:53.575 11 2:53.589 12 2:53.603 13 2:53.635 14 2:53.646 15 2:53.656 16 2:53.669 <-- the 16th H264Converter is created, here I don't do nothing and wait almost 3 minutes 17 5:46.824 <-- after I trigger the "rapid" state H264Converter is created many times per second 18 5:46.834 19 5:46.843 20 5:46.957 21 5:47.034 22 5:47.089 23 5:47.124 24 5:47.151 25 5:47.194 26 5:47.227 27 5:47.258 28 5:47.290 29 5:47.323 30 5:47.350 31 5:47.361 32 5:47.399 33 5:47.421 34 5:47.440 35 5:47.473 36 5:47.506 37 5:47.536 38 5:47.569 39 5:47.604 40 5:47.638 41 5:47.669 42 5:47.702 43 5:47.735 44 5:47.769 45 5:47.799 46 5:47.812 47 5:47.854 48 5:47.860 49 5:47.880 50 5:47.910 51 5:47.938 52 5:47.966 53 5:47.985 54 5:48.014 55 5:48.050 56 5:48.084 57 5:48.105 58 5:48.124 59 5:48.161 60 5:48.194 61 5:48.232 62 5:48.272 63 5:48.307 64 5:48.342 65 5:48.379 66 5:48.419 67 5:48.455 68 5:48.493 69 5:48.523 70 5:48.538 71 5:48.575 72 5:48.608 73 5:48.641 74 5:48.673 75 5:48.708 76 5:48.741 77 5:48.776 78 5:48.811 79 5:48.848 80 5:48.889 81 5:48.926 82 5:48.963 etc... The pattern of Firefox being able to not create new H264Converters as long as it's not touched and entering another state via some trigger action is obvious. Here why I'm certain that it's exactly the construction of H264Converter::H264Converter that creates the most of the new threads in the "rapid thread creation phase" -- I observed the stack at these moments: The stack at the moment of the thread creation that happens at the construction of H264Converter is: xul!mozilla::MFTDecoder::SendMFTMessage()+0x22 [z:\build\build\src\dom\media\platforms\wmf\mftdecoder.cpp @ 179] xul!mozilla::MFTDecoder::SetMediaTypes()+0x98 [z:\build\build\src\dom\media\platforms\wmf\mftdecoder.cpp @ 105] xul!mozilla::WMFVideoMFTManager::SetDecoderMediaTypes(void)+0x22d [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 836] xul!mozilla::WMFVideoMFTManager::InitInternal(void)+0x342 [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 732] xul!mozilla::WMFVideoMFTManager::Init(void)+0x43 [z:\build\build\src\dom\media\platforms\wmf\wmfvideomftmanager.cpp @ 625] xul!mozilla::WMFDecoderModule::CreateVideoDecoder()+0x8d [z:\build\build\src\dom\media\platforms\wmf\wmfdecodermodule.cpp @ 109] xul!mozilla::H264Converter::CreateDecoder()+0x181 [z:\build\build\src\dom\media\platforms\wrappers\h264converter.cpp @ 273] So as the result of all the calls after the construction of H264Converter, eventually: hr = SendMFTMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, 0); NS_ENSURE_TRUE(SUCCEEDED(hr), hr); is called which results in 8 levels of calls inside of msmpeg2vdec.dll for which I can't get the symbols from Microsoft's server, so that part of the symbol resolution is not accurate and I don't quote it but from there the thread creation in KERNEL32 is initiated. To repeat once again, I see typically only 16 H264Converter objects constructed after the page loads, then the videos can loop for minutes, and only once I scroll down another, more CPU consuming phase starts with the rapid creation of H264Converter objects, and for each one exactly one thread is being created from msmpeg2vdec.
It looks like we create a new decoder every loop of the video when in decoder shutdown mode. Steps to reproduce: * Navigate to https://www.jwz.org/blog/2017/10/anime-floppy-disks/ * Open new tab * Wait 15 seconds * Use gecko profiler Expected results: mozilla::CreateDXVAManagerEvent::Run should not show up on the profile Actual results: mozilla::CreateDXVAManagerEvent::Run is using 9% CPU on GeckoMain
(In reply to Johnny Kl from comment #26) > To repeat once again, I see typically only 16 H264Converter objects > constructed after the page loads, then the videos can loop for minutes, and > only once I scroll down another, more CPU consuming phase starts with the > rapid creation of H264Converter objects, and for each one exactly one thread > is being created from msmpeg2vdec. It looks like we're fine until we go into decoder shutdown. Once we're in decoder shutdown mode, we flip-flop between WMF and blank decoder, I'm guessing once per video loop.
[Tracking Requested - why for this release]: This issue causes excessive CPU usage on short looping videos.
status-firefox57:
--- → affected
status-firefox58:
--- → affected
status-firefox59:
--- → affected
tracking-firefox58:
--- → ?
Having given this some thought, I'm guessing that we're starting the decoder up at the end of each loop in order to show the last frame.
Reporter | ||
Comment 31•7 years ago
|
||
As I've noted in previous comments, there are two issues behind this entry: one is this - "rapid creation phase" and another is - generally worse playing of media compared to other browsers (as far as I observe it). I think both issues should be addressed, and now that at last somebody (if I understood correctly? thanks Anthony!) finally confirmed what I've reported I hope both issues could somehow be handled, even if I reported them only in one bug. I've however tried to see if anybody else created bugs about generally inferior media playing, and I can at least somehow confirm this case here: https://bugzilla.mozilla.org/show_bug.cgi?id=1419047#c4 Maybe people working for Mozilla can now create appropriate bug reports that will lead both to the serious investigation high CPU/GPU/power resource load for media playing, as well as addressing this "rapid creation"? Thanks and good luck.
(In reply to Johnny Kl from comment #31) > - generally worse playing of media compared to other browsers (as far as I > observe it). I can see out of order frames clearly on the 2nd video on the 2nd row.
(In reply to Johnny Kl from comment #31) > I've however tried to see if anybody else created bugs about > generally inferior media playing, and I can at least somehow confirm this > case here: > > https://bugzilla.mozilla.org/show_bug.cgi?id=1419047#c4 Except that issue has nothing to with media.
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #32) > (In reply to Johnny Kl from comment #31) > > - generally worse playing of media compared to other browsers (as far as I > > observe it). > > I can see out of order frames clearly on the 2nd video on the 2nd row. I created bug 1422660 because it is likely unrelated to the decoder shutdown issue.
Assignee | ||
Comment 35•7 years ago
|
||
https://queue.taskcluster.net/v1/task/brg2IUyNRW6RpeoSplmTuw/runs/0/artifacts/public/build/setup.exe Can you try if this build fix the problem for you?
Reporter | ||
Comment 36•7 years ago
|
||
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #32) > (In reply to Johnny Kl from comment #31) > > - generally worse playing of media compared to other browsers (as far as I > > observe it). > > I can see out of order frames clearly on the 2nd video on the 2nd row. I've just meant "worse regarding the CPU/GPU/power usage" as I described in the Comment 19 : Firefox Nightly 53% CPU Opera 12% CPU IE11 6% CPU Regarding "frame skipping" I haven't believed the example is relevant as these MP4 were created from the GIFs! https://www.jwz.org/blog/2017/10/gifs-as-mp4s/
Comment 37•7 years ago
|
||
Likely wontfix for 57 at this point.
Reporter | ||
Comment 38•7 years ago
|
||
The difference in CPU use when opening the https://www.jwz.org/blog/2017/10/anime-floppy-disks/ before the "rapid creation phase" that is, when no new decoders are constantly created. Above: defaults, when the page is opened in the new, just created profile CPU use: cca 25% Below: when the page is opened in the profile with browser.tabs.remote.autostart = false media.wmf.dxva.max-videos = 999 CPU use: cca 9% I observe 16% of CPU use overhead just for these two defaults, even when the bug with the "rapid creation" is not triggered. I hope somebody can investigate this too? (btw the setup.exe from the Comment 35 is not installing any firefox binaries at my computer, I've already contacted jwwang directly about that. Trying to check what it did, I see that setup.exe creates something like system.dll and uac.dll. I didn't know there is uac highjacking built-in in the setup. Anyway that setup doesn't report failure when nothing is installed, instead it creates the shortcuts to non-existing firefox.exe)
Assignee | ||
Comment 39•7 years ago
|
||
Try this build? https://drive.google.com/file/d/1XwZe4MOf7czwmHeXZ46GnbumsVOqRMkR/view
Assignee | ||
Comment 40•7 years ago
|
||
Somehow I can no longer open this link: https://www.jwz.org/blog/2017/10/anime-floppy-disks/. Is the site still up?
Reporter | ||
Comment 41•7 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #40) > Somehow I can no longer open this link: > https://www.jwz.org/blog/2017/10/anime-floppy-disks/. Is the site still up? You can download the jwz-high-cpu.7z attached in Comment 1 containing the whole page content, which I archived days ago, and test locally, even with no internet. The page loads for me.
Assignee | ||
Comment 42•7 years ago
|
||
https://searchfox.org/mozilla-central/source/dom/media/MediaDecoderStateMachine.cpp#820-827 I guess this is the culprit. 1. MDSM turns off blank decoder when seeking starts. 2. MDSM turns on blank decoder when decoding starts for the tab is still in the background. 3. It looks like flipping between WMF and blank decoder could result in hight CPU usage. 4. 3 is even more significant when looping a small file because you will seek (to the beginning) very often. Btw, I can't repro the issue on my Windows 10 machine. I guess step 3 is more expensive on some Windows machines than others.
Reporter | ||
Comment 43•7 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #42) > I can't repro the issue on my Windows 10 machine. I guess step 3 is > more expensive on some Windows machines than others. Thanks. If you can add the trace statement on the point of the creation of the decoders and see them being rapidly created every second you are definitely reproducing the "rapid creation phase." On the significantly faster computer than mine there would be surely less visible CPU problems, but on all of the slower systems the problems they surely are. Once that bug is fixed, I hope somebody can create a relevant bug based on the observations from the Comment 36 and Comment 38 ?
That build fixes the issue for me. The CPU usage drops once I open a new tab.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8935599 -
Flags: review?(gsquelart)
Assignee: nobody → jwwang
Comment 46•7 years ago
|
||
mozreview-review |
Comment on attachment 8935599 [details] Bug 1420608. P1 - don't switch off blank decoder when seeking begins. https://reviewboard.mozilla.org/r/206478/#review212194
Attachment #8935599 -
Flags: review?(gsquelart) → review+
Assignee | ||
Comment 47•7 years ago
|
||
Thanks!
Comment 48•7 years ago
|
||
Pushed by jwwang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5a1a53d1fd02 don't switch off blank decoder when seeking begins. r=gerald
Comment 49•7 years ago
|
||
Backed out for browser chrome mochitest failures on toolkit/content/tests/browser/browser_resume_bkg_video_on_tab_hover.js https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=5a1a53d1fd0282be82ae1355896213a0a093e216&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-searchStr=bc https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=5a1a53d1fd0282be82ae1355896213a0a093e216&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-searchStr=bc https://hg.mozilla.org/integration/autoland/rev/c9c66aa3cfe76422f25746fb802a042fe2c15730
Flags: needinfo?(jwwang)
Assignee | ||
Comment 50•7 years ago
|
||
https://searchfox.org/mozilla-central/rev/f5f1c3f294f89cfd242c3af9eb2c40d19d5e04e7/toolkit/content/tests/browser/browser_resume_bkg_video_on_tab_hover.js#12 The test is racy! Sometimes the 'mozentervideosuspend' event is fired before any event handler is registered and the test times out for waiting an event that will never come.
Flags: needinfo?(jwwang)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8936192 -
Flags: review?(alwu)
Comment 53•7 years ago
|
||
mozreview-review |
Comment on attachment 8936192 [details] Bug 1420608. P2 - fix the test timeout. https://reviewboard.mozilla.org/r/206956/#review212992 ::: commit-message-65c69:3 (Diff revision 1) > +Bug 1420608. P2 - fix the test timeout. > + > +See comment 50 for the cause. We call load() and play() to start playback nit : it's better to mention that the "mozentervideosuspend" would happen before we start register the lisnter in comment msg.
Attachment #8936192 -
Flags: review?(alwu) → review+
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 56•7 years ago
|
||
Thanks!
Comment 57•7 years ago
|
||
Pushed by jwwang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/69a0e0c9d40e P1 - don't switch off blank decoder when seeking begins. r=gerald https://hg.mozilla.org/integration/autoland/rev/772fef647fd7 P2 - fix the test timeout. r=alwu
Comment 58•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/69a0e0c9d40e https://hg.mozilla.org/mozilla-central/rev/772fef647fd7
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Reporter | ||
Comment 59•7 years ago
|
||
If I understand one from two different issues I've reported in the comments here is fixed. Regarding the second, maybe a good manifestation is the one in Comment 38 ? What are your opinions on the issue from Comment 38 and should a new bug be made for it? Thank you.
Comment 60•6 years ago
|
||
Hey Johnny! This is more of an automated comment that is made after we do the merges, in this case from autoland to central. We usually try to do the merges right before the nightly tests are triggered. Yes, the first part now is considered fix and for what you refer to in comment 38 I suggest you open another bug and put this bug in the see also section.
Reporter | ||
Comment 61•6 years ago
|
||
With the latest Nightly arrived I can confirm that the "rapid creation phase" as I saw it before is gone. Thanks everybody for that one! Unsurprisingly I still observe the other performance problems: specifically, in this comment I can show that Firefox has some strange periods where the CPU use is still much, much higher than in IE11 or Opera, which is especially visible if I try to open the jwz anime floppy page in four tabs, one after another. In these cases my worry is less that the battery is consumed more long term (although the used CPU is twice of IE11 even after these periods, which is significant) but that these relatively long high "flames" point to something that I can imagine on the slower computers potentially blocks the responsiveness to the user, and therefore it would be worthy to analyze what's going on there. To reproduce: 1 copy the link https://www.jwz.org/blog/2017/10/anime-floppy-disks/ 2 open the browser and open the Task Manager/Performance from the OS 3 open the link in the first tab, wait to see it's loaded, then 4 open the second tab, open the link, wait to see it's loaded, then 5 open the third tab, open the link, wait to see it's loaded, then 6 open the fourth tab, open the link, wait to see it's loaded, then Observe the "high and wide flames." If I don't wait to the end of the load, I can even observe even wider periods of 100% CPU use (the flames are even higher and even wider). I can also induce that the end speed doesn't end to be the same as in the snapshot. All in all it seems there is some visibly unnecessary work which potentially overuses the CPU resources. The variation: open the same link in the same tab with the short delays, Firefox ends in some other higher CPU constant use, compared to one when it was opened only once! I know that this bug under this number is considered "fixed" but as I am not an "insider" I still don't know how to present my case optimally, to avoid that the report is declared "non existing" in some way. That was why I took the effort to do a lot of various observations here and illustrate them or additionally prove them, but I would not like to start from the point zero (as I do this in my free time) and I also don't have the access to the various OSes, which is maybe of benefit, just the browsers I've used. In this bug entry ( 1420608 ) there are I believe different scenarios which point to more than one issue that still can be solved simply by using as a test this one page I've used, also opening it in various timings more than once or in more than one tab, or with the various settings, and I really believe the best would be if the bugs are filled by those who are more versed and know more about what is going on "under the hub" in the Firefox (and can recognize the using the built-in profiler and their knowledge). Please see the previous comments for inspiration, maybe at least the comment 38 (with the attached 8934568: e10s-overhead.jpg ) and this comment now, including the "variation" which is also somehow telling could be used as as the starting point for opening new bugs. For the starting identification of the potential problems, maybe it's worthy doing the visual comparison of the CPU use graphs doing the same actions on other OSes too. Note: the configuration for the picture attached is default after install, i.e. media.wmf.dxva.max-videos is the default 8, not 999, and tabs.remote.autostart is also the default.
Comment 62•6 years ago
|
||
JW, Ihis bug looks serious and worth uplifting. What do you think?
Flags: needinfo?(jwwang)
Assignee | ||
Comment 63•6 years ago
|
||
Approval Request Comment [Feature/Bug causing the regression]:none [User impact if declined]:high CPU usage when viewing a page with many videos on Windows. [Is this code covered by automated tests?]:yes [Has the fix been verified in Nightly?]:yes [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]:none [Is the change risky?]:low [Why is the change risky/not risky?]:the change is simple. We just move the action of suspending video decoding from place A to B to reduce the chance of recreating decoders. Even if we get it wrong, it will at worst remain high CPU on those pages as if no such fix is applied. [String changes made/needed]:none
Flags: needinfo?(jwwang)
Attachment #8937891 -
Flags: review+
Attachment #8937891 -
Flags: approval-mozilla-beta?
Comment 64•6 years ago
|
||
Comment on attachment 8937891 [details] [diff] [review] 1420608_fix_beta_58.patch Fix a high cpu usage issue. Beta58+.
Attachment #8937891 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 65•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/5a2fd74ee373
Comment 66•6 years ago
|
||
Is this bug related to this issue I have https://bugzilla.mozilla.org/show_bug.cgi?id=1411575 ?
Comment 67•6 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #63) > [Is this code covered by automated tests?]:yes > [Needs manual test from QE? If yes, steps to reproduce]: no Based on this assessment on manual testing needs and it's automated coverage, marking this as qe-verify-.
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•