Open Bug 1400787 Opened 7 years ago Updated 2 years ago

Using software video decoders (include VP9) ramps up computer fans on Macbook.

Categories

(Core :: Audio/Video: Playback, enhancement, P3)

x86_64
macOS
enhancement

Tracking

()

Performance Impact medium

People

(Reporter: kaku, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: perf:resource-use, Whiteboard: [media-performance])

This issue has been discussed for a while and has also been reported here: https://www.reddit.com/r/firefox/comments/6xa21i/firefox_using_way_more_resources_than_chrome_on/

Somehow, we're using excessive resources while playing local WebM/VP9 videos. This leads to high device heat and ramps up the fans. This does not happen in Chrome.
I conducted a series of experiments in which I measured power usage and CPU/GPU frequency while playing a WebM/VP9 video with different decoders (ffvpx, blank decoder, and null decoder), combined with rendering or no rendering.

Here is the result data:
https://drive.google.com/open?id=1RRL7DX79CNw6ms7K-iG2vI14QKgchEZy7aOGa0bOPU4

From experiments #1 and #2, we can see that Firefox uses much more power than Chrome and that the difference is mainly on the GPU.

From experiments #2 and #3, we can see that switching from the FFVPX decoder to the blank decoder does not help.
FFVPX is a software decoder, and the blank decoder does nothing but create solid-color frames at runtime. So the blank decoder saves some CPU resources, which shows in the core power usage (column H) and CPU frequency (column M). This confirms that FFVPX is not the root cause of the high heat.

Experiment #5 uses a null decoder, which creates null VideoData and sends it to MDSM. VideoSink then skips this kind of data, so there is no rendering at all.
Experiment #6 creates an HTMLMediaElement from JavaScript without appending it to the DOM tree. The element plays a WebM/VP9 video file with the FFVPX decoder.
From experiments #2, #5, and #6, we can see that it is the rendering pipeline that consumes GPU resources.

From experiments #2 and #4, we can see that switching from the FFVPX decoder to the H264 HW decoder helps. The H264 HW decoder uses the GPU to decode video data. However, this is also the weirdest part: this experiment uses fewer GPU resources than #2!? I have a gut feeling that something unexpected is happening when transferring data from system memory (the frames decoded by the SW decoder) to GPU memory (for rendering).

Experiments #7 to #12 just repeat the same runs on another MacBook Pro (mid-2015). Experiments #1 to #6 were conducted on a MacBook Pro (mid-2014).
Milan - can we have some graphics team support on figuring out why our texture upload path is slow on Mac?
Flags: needinfo?(milan)
[Tracking Requested - why for this release]: This is unnecessary and excessive power usage. We need to find a fix for this in 57. We can ameliorate the problem by disabling VP9 on YouTube but that won't fix Netflix.
Most likely bug 1265824 and/or bug 1282797.  Not a 57 type of work at this point.
Flags: needinfo?(milan)
Is this something you want to / can work around on the media side, considering comment 4?
Flags: needinfo?(ajones)
Per discussion with Anthony, it would be better to disable VP9 and re-enable it once those bugs in comment 4 are fixed. 
Kaku,
Can you help disable VP9 decoder on Mac?
Flags: needinfo?(ajones) → needinfo?(kaku)
Priority: -- → P2
(In reply to Blake Wu [:bwu][:blakewu] from comment #6)
> Per discussion with Anthony, it would be better to disable VP9 and re-enable
> it once those bugs in comment 4 are fixed. 
> Kaku,
> Can you help disable VP9 decoder on Mac?

Sure, bug 1403412.
Depends on: 1403412
Flags: needinfo?(kaku)
Depends on: 1403618
Move the 57 track to bug 1403412.
See Also: → 1404042
Summary: Watching VP9 video ramps up computer fans on Macbook. → Using software video decoders (include VP9) ramps up computer fans on Macbook.
(In reply to Tzuhao Kuo [:kaku] from comment #1)
> 
> Experiment #5 uses a null decoder which creates null VideoData and sends to
> MDSM. Then, VideoSink skips this kind of data, so no rendering at all.
> Experimental #6 creates a HTMLMediaElement by JavaScript without appending
> it to DOM tree. The element plays a WebM/VP9 video file with FFVPX decoder.
> From experimental #2, #5, and #6: we can see that it's the rendering
> pipeline consumes GPU resources.
> 
If there's no rendering in experiment #5, how could GPU resources be used?
Flags: needinfo?(kaku)
See Also: → 1418510
(In reply to Jean-Yves Avenard [:jya] from comment #10)
> (In reply to Tzuhao Kuo [:kaku] from comment #1)
> > 
> > Experiment #5 uses a null decoder which creates null VideoData and sends to
> > MDSM. Then, VideoSink skips this kind of data, so no rendering at all.
> > Experimental #6 creates a HTMLMediaElement by JavaScript without appending
> > it to DOM tree. The element plays a WebM/VP9 video file with FFVPX decoder.
> > From experimental #2, #5, and #6: we can see that it's the rendering
> > pipeline consumes GPU resources.
> > 
> if there's no rendering with experiment #5, how could there be GPU resources
> being used?
That's unknown to us. The GPU seems to be doing work other than rendering video frames.
See Also: → 1420880
Flags: needinfo?(kaku)
See Also: → 1427432
This might affect Linux as well.
Flags: needinfo?(padenot)
I've taken up-to-date measurements, comparing Firefox Nightly and Chrome stable:

I've been playing two VP9 videos, [0] and [1], at 1080p60 and 1080p30 respectively, on Chrome and Firefox, in full screen and non-full screen, and measuring the average power consumption in watts when using the integrated GPU (Intel [2]) and the discrete GPU (AMD, [3]), on a maxed-out MacBook Pro 2016 (2.9 GHz Intel Core i7, 16GB RAM), while plugged in, running the latest version of OSX (10.13).

On the Intel GPU, Intel Power Gadget reports the frequency at which the GPU is clocked, so I've added this info.


Video [0] (60fps)
  Full screen:
    AMD GPU: Firefox 10W
             Chrome 10.5W
    Intel GPU: Firefox 12W (GPU clocked at 0.5GHz)
               Chrome 10.5W (GPU clocked at 0.16GHz)
  Non full screen:
    AMD GPU: Firefox 10W
             Chrome 10.5W
    Intel GPU: Firefox 25W (GPU clocked at 1.0GHz, the maximum)
               Chrome 7W (GPU clocked at 0.1GHz)

Video [1] (30fps)
  Full screen:
    AMD GPU: Firefox 6W
             Chrome 6.5W
    Intel GPU: Firefox 12W (GPU clocked at 0.2GHz)
               Chrome 6W (GPU clocked at 0.02GHz)
  Non full screen:
    AMD GPU: Firefox 6W
             Chrome 6W
    Intel GPU: Firefox 11W (GPU clocked at 0.3GHz)
               Chrome 7W (GPU clocked at 0.06GHz)

Basically, whenever Firefox uses the Intel GPU, power consumption is much worse than Chrome's (up to 25W versus 7W). The problem is that the Intel GPU is the default.
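
To make the gap concrete, here is a minimal sketch that computes the Firefox-to-Chrome power ratio for each configuration. The numbers are copied from the measurements listed above; the dictionary layout and the function name are illustrative only.

```python
# Average power in watts, copied from the measurements in this comment.
# Keys are (video, mode, gpu); values are (firefox_watts, chrome_watts).
MEASUREMENTS = {
    ("60fps", "fullscreen", "AMD"): (10.0, 10.5),
    ("60fps", "fullscreen", "Intel"): (12.0, 10.5),
    ("60fps", "windowed", "AMD"): (10.0, 10.5),
    ("60fps", "windowed", "Intel"): (25.0, 7.0),
    ("30fps", "fullscreen", "AMD"): (6.0, 6.5),
    ("30fps", "fullscreen", "Intel"): (12.0, 6.0),
    ("30fps", "windowed", "AMD"): (6.0, 6.0),
    ("30fps", "windowed", "Intel"): (11.0, 7.0),
}


def power_ratios(measurements):
    """Return {config: firefox_watts / chrome_watts} for every configuration."""
    return {cfg: ff / cr for cfg, (ff, cr) in measurements.items()}


if __name__ == "__main__":
    # Print configurations from worst to best ratio.
    ranked = sorted(power_ratios(MEASUREMENTS).items(), key=lambda kv: -kv[1])
    for cfg, ratio in ranked:
        print(f"{cfg}: Firefox uses {ratio:.2f}x Chrome's power")
```

On these figures the worst case is windowed 60fps playback on the Intel GPU, at roughly 3.6 times Chrome's consumption, while every AMD configuration is at or below Chrome.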

Jeff, do you have any idea why this would be?


[0]: https://www.youtube.com/watch?v=KaCQ8SQ6ZHQ
[1]: https://www.youtube.com/watch?v=0CFQQVkJLP4
[2]: Intel HD Graphics 530 1536 MB
[3]: AMD Radeon Pro 460
Flags: needinfo?(padenot)
Flags: needinfo?(jyavenard)
Flags: needinfo?(jmuizelaar)
Hrm, Jeff is going away soon. David, do you know who could have a look at this?
Flags: needinfo?(dbolter)
Kats helped me catch up here. Let's see what happens with the bugs in comment 4. Doug Thayer seems to be active there.
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(dbolter)
What resolution is the mac?

Chrome is likely using CoreAnimation for video playback so they are not going to be paying the memory bandwidth cost of the full screen blit. 

i.e. my understanding is that Chrome's pipeline looks like this:

video frame -> scale and color convert -> final screen scan out buffer

whereas Firefox's looks like:

video frame -> scale and color convert -> full screen window buffer -> copy -> final screen scan out buffer.

Fixing this is roughly tracked in bug 1191965.

Additionally, our texture upload path on Mac OS has extra copies. This will hopefully be fixed by bug 1265824.
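
As a rough illustration of why one extra full-window copy per frame matters, the sketch below estimates the memory traffic it generates. The 2880x1800 buffer size, the 4 bytes per pixel, and the read-plus-write accounting are all assumptions for illustration; the actual window size was not reported in this bug and none of these values were measured.

```python
def extra_copy_bandwidth(width, height, fps, bytes_per_pixel=4):
    """Estimate bytes per second moved by one extra full-buffer copy per frame.

    A copy reads every pixel once and writes it once, hence the factor of 2.
    """
    return width * height * bytes_per_pixel * 2 * fps


if __name__ == "__main__":
    # Assumed values: a 2880x1800 window buffer, 4 bytes per pixel, 60 frames/s.
    bytes_per_second = extra_copy_bandwidth(2880, 1800, 60)
    print(f"{bytes_per_second / 1e9:.2f} GB/s of extra memory traffic")
```

Under those assumptions the extra copy alone moves about 2.5 GB/s at 60 fps.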
(In reply to Jeff Muizelaar [:jrmuizel] on parental leave until at least June 10 from comment #17)
> What resolution is the mac?
> 
> Chrome is likely using CoreAnimation for video playback so they are not
> going to be paying the memory bandwidth cost of the full screen blit. 
> 
> i.e. my understanding is that Chrome's pipeline looks like this:
> 
> video frame -> scale and color convert -> final screen scan out buffer
> 
> where as Firefox's looks like:
> 
> video frame -> scale and color convert -> full screen window buffer -> copy
> -> final screen scan out buffer.
> 
> Fixing this is roughly tracked in bug 1191965.

But Paul has shown here that the issue is much worse on non-fullscreen playback.

Is Chrome still using CoreAnimation there?
Flags: needinfo?(jyavenard)
Paul, CoreAnimation in Chrome can be disabled with the switch --disable-remote-core-animation. Could you please run your comparison again with it? That would help identify whether CA improves things a great deal, and whether it's even used in some of the cases benchmarked above.
Flags: needinfo?(padenot)
Chrome, with --disable-remote-core-animation and without, testing only the Intel GPU:

Video [0] (60fps):
                  disabled | enabled
  Non-fullscreen:    9W    |   6.2W
  Fullscreen:        9W    |   6W

Video [1] (30fps)
                  disabled | enabled
  Non-fullscreen:   5.2W   |   4W
  Fullscreen:       5W     |   4W
  

Numbers are lower, probably because the machine was just rebooted and I closed all other programs this time around. I re-did some measurements multiple times, and passing the command line switch increases the energy consumption reliably.
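
For reference, a small sketch of the relative cost of passing the switch, using only the Chrome figures from the table above. The function name and data layout are illustrative.

```python
# Chrome power in watts on the Intel GPU, copied from the table above.
# Values are (core_animation_disabled, core_animation_enabled).
CHROME_WATTS = {
    ("60fps", "windowed"): (9.0, 6.2),
    ("60fps", "fullscreen"): (9.0, 6.0),
    ("30fps", "windowed"): (5.2, 4.0),
    ("30fps", "fullscreen"): (5.0, 4.0),
}


def disable_overhead_percent(disabled, enabled):
    """Extra power drawn with Core Animation disabled, as a % of the enabled figure."""
    return (disabled - enabled) / enabled * 100.0


if __name__ == "__main__":
    for cfg, (disabled, enabled) in CHROME_WATTS.items():
        pct = disable_overhead_percent(disabled, enabled)
        print(f"{cfg}: +{pct:.0f}% with Core Animation disabled")
```

On these figures, disabling Core Animation raises Chrome's consumption by roughly 25% to 50%, depending on the scenario.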
Flags: needinfo?(padenot)
I think I have an even easier way to demonstrate/reproduce this problem (because it doesn't involve any codecs or network):

- Open https://mozilla.github.io/webrtc-landing/gum_test.html
- Click 'Video'
- Grant access to your camera

I'm using the OSX Activity Monitor to gauge energy consumption (I know it's probably not very accurate, but it should hopefully be good enough for a rough idea).

These numbers are from my 2013 MacBook Pro

Chrome:
 - Nvidia card: ~ 50 "Energy Impact"
 - Intel card: ~ 50 "Energy Impact"

Firefox:
 - Nvidia card: ~ 50 "Energy Impact"
 - Intel card: > 100 "Energy Impact" - the number dips as low as 60, but also goes as high as 140
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p3:f64]
Whiteboard: [qf:p3:f64] → [qf:p1:f64]
Bug 1265824 (which just landed) should improve this. Can you retest with a nightly that includes bug 1265824?
Depends on: 1265824
It looks like bug 1265824 may not be working correctly on Intel hardware yet, so it might be worth holding off on measuring this.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #23)
> It looks like bug 1265824 may not be working correctly on Intel hardware yet
> so it might be worth holding of measuring this.

Is there a follow-up bug we can cc to?
Bug 1478704 has landed on inbound. It should bring some improvements on Intel GPUs but it's possible that we can improve things further. I've filed 1479145 for further improvements.
Depends on: 1479145
Recent patches have made things better, here are new measurements, with a Nightly from 2018-08-13:

Video [0] (60fps)
  Full screen:
    AMD GPU: Old Firefox 10W
             Current Firefox 9W
             Chrome 10.5W 
    Intel GPU: Old Firefox 12W (GPU clocked at 0.5GHz)
               Current Firefox 11W (GPU clocked at 0.4GHz)
               Chrome 10.5W (GPU clocked at 0.16GHz)
  Non full screen:
    AMD GPU: Old Firefox 10W
             Current Firefox 7W
             Chrome 10.5W
    Intel GPU: Old Firefox 25W (GPU clocked at 1.0GHz, the maximum)
               Current Firefox 19W (GPU clocked at 0.85GHz)
               Chrome 7W (GPU clocked at 0.1GHz)

It looks like we still have quite a few things to fix.
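
One way to read these numbers is as the fraction of the old Firefox-over-Chrome gap that the recent patches removed. The sketch below does that for the Intel GPU rows only, since on AMD the old Firefox figure was already below Chrome's. The metric and the function name are my own framing, not something defined elsewhere in this bug.

```python
# Intel GPU, video [0] (60fps), watts copied from the measurements above.
# Values are (old_firefox, current_firefox, chrome).
INTEL_60FPS = {
    "fullscreen": (12.0, 11.0, 10.5),
    "windowed": (25.0, 19.0, 7.0),
}


def gap_closed(old_ff, new_ff, chrome):
    """Fraction of the old Firefox-over-Chrome gap removed by the new build."""
    gap = old_ff - chrome
    if gap <= 0:
        raise ValueError("old Firefox figure was not above Chrome")
    return (old_ff - new_ff) / gap


if __name__ == "__main__":
    for mode, (old_ff, new_ff, chrome) in INTEL_60FPS.items():
        print(f"{mode}: {gap_closed(old_ff, new_ff, chrome):.0%} of the gap closed")
```

By this measure about two thirds of the gap is closed in full screen, but only about one third in windowed mode.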

[0]: https://www.youtube.com/watch?v=KaCQ8SQ6ZHQ
Whiteboard: [qf:p1:f64] → [qf:p2:resource]

I tried the same scenario again: same machine, same video, but this time comparing Web Render enabled against Web Render disabled. I'm also now measuring the total consumption of the laptop instead of just the CPU/GPU packages.

Windowed, Intel:
Web Render On: 35W average (however looking at the graph, the stddev is larger)
Web Render Off: 35W average

Full screen, Intel:
Web Render On: 33W average (stddev a bit tighter)
Web Render off: 20W average

Windowed, AMD:
Web Render On: 34W average (however looking at the graph, the stddev is larger)
Web Render Off: 33W average

Full screen, AMD:
Web Render On: 27W average (stddev a bit tighter)
Web Render off: 19W average

Chrome is roughly half to two thirds of this, depending on the scenario.

Flags: needinfo?(nical.bugzilla)

Current WebRender doesn't have the texture upload improvements from bug 1478704 so it's expected to be worse than non-webrender.

Thanks for the update, Paul.

Flags: needinfo?(nical.bugzilla)

Paul, when you get a chance I'd be curious to see how much difference the --disable-mac-overlays flag makes in Chrome's battery usage.

There is currently a bug in Intel Power Gadget on OSX that makes it not work (the kernel module isn't signed). I'll try again when I get a chance.

(In reply to Paul Adenot (:padenot) from comment #32)
> There is currently a bug in Intel Power Gadget on OSX that makes it not work
> (the kernel module isn't signed). I'll try again when I get a chance.

This now works again. NI to myself to get more recent numbers. Bug 1576107 is related.

Flags: needinfo?(padenot)
Flags: needinfo?(padenot)
See Also: → 1576107
Flags: needinfo?(padenot)

Paul, given that Core Animation is now enabled by default for Nightly and Beta builds, would you mind doing another check? It would be interesting to see how the numbers have changed for video streaming in non-fullscreen mode with WebRender turned off.

New numbers, with Firefox Nightly 71.0a1 (2019-09-08) (64-bit) and Google Chrome 76.0.3809.132 on OSX 10.14.5 (18F203), Macbook pro 15" 2018 (not the same machine as the previous measurements!), only with the integrated Intel GPU:

Video [0] (60fps)
  Full screen:
    Intel GPU: Current Firefox Nightly: 6.5W
               Chrome: 5W
  Non full screen:
    Intel GPU: Current Firefox Nightly: 7.3W
               Chrome: 4.7W

For a point of reference, which is not comparable since it's not the same codec, Safari is below 3W in windowed, and it's too hard to measure in full screen because it's in the noise compared to idle state.

In short, superb improvements, but there is still a lot of room for improvement on this metric.

On another note, 4K video can now play without frame drops as well.

[0]: https://www.youtube.com/watch?v=KaCQ8SQ6ZHQ (same video as before, always locked to 1080p60).

Flags: needinfo?(padenot)

I filed bug 1589165 for one of the problems here.

Depends on: 1589165
Priority: P2 → P3
Whiteboard: [qf:p2:resource] → [qf:p2:resource] [media-performance]
Performance Impact: --- → P2
Whiteboard: [qf:p2:resource] [media-performance] → [media-performance]
Severity: normal → S3