[WebRender Shield Study] Higher CPU usage with WebRender enabled on YouTube (Windows)

RESOLVED FIXED

Status

RESOLVED FIXED
Type: defect
Priority: P1
Severity: normal
Opened: 10 months ago
Closed: 7 months ago

People

(Reporter: acupsa, Assigned: nical)

Tracking

(Depends on 1 bug, Blocks 1 bug)

Version: 63 Branch
Platform: x86_64 Windows 10
Points: ---

Firefox Tracking Flags

(firefox63 affected)

Details

Attachments

(1 attachment)

Version: 63.0a1
Build ID: 20180709221247
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0

[Affected Platforms]:
- Windows 10 64bit

[Prerequisites]
- Have Task Manager open at the Processes section.

[Steps to reproduce]:
1. Open Firefox Nightly (63.0a1) with a new profile and navigate to https://www.youtube.com/watch?v=aqz-KE-bpKQ
2. Observe the CPU usage of Firefox Nightly in Task Manager.
3. Go to about:config and create the "gfx.webrender.all.qualified" preference, set to true.
4. Restart the browser.
5. Navigate to https://www.youtube.com/watch?v=aqz-KE-bpKQ
6. Observe the CPU usage of Firefox Nightly in Task Manager.

[Expected result]:
- CPU usage of Firefox Nightly with WebRender enabled should be approximately the same as without it. 

[Actual result]:
- CPU usage is on average ~50% higher with WebRender enabled.

[Notes]:
- This issue is reproducible by performing any action in the browser. 
- Attached a link to CPU usage test results: https://tinyurl.com/yaesrl2d.
- Attached a copy of "about:support" from both systems this issue was tested on.
- Attached a screen recording of the issue: https://tinyurl.com/y9d79u69.
See Also: → 1430451
Forgot to add the about:support info file.
Priority: -- → P1
Summary: [WebRender Shield Study] Higher CPU usage with WebRender enabled → [WebRender Shield Study] Higher CPU usage with WebRender enabled on YouTube
Assignee: nobody → jmuizelaar
What are the actual before and after CPU usage numbers you see?
Flags: needinfo?(andreea.cupsa)
Just to clarify, this issue also reproduces on other websites such as Twitch, Amazon, and Wikipedia.
For example, livestreams on https://twitch.tv have an average ~19% CPU usage without WebRender and ~34% with WebRender enabled; navigating https://amazon.com with WebRender enabled almost doubles the average CPU usage.

Also, to answer your question: for the video in this bug, without WebRender the CPU usage ranges from ~5% up to ~21% (average ~14%), while with WebRender enabled it goes up to ~50% and won't drop below ~20% (average ~30%).
Flags: needinfo?(andreea.cupsa)
I can reproduce this locally. I see about 10% CPU with WebRender and 5% CPU without.

Here's a profile https://perfht.ml/2L40n8W. The extra time is being spent in the Renderer and the RenderBackend, about equally. It's expected that these numbers would be higher, because the scene that we're drawing is more complex with WebRender than it is with layers. That being said, I wouldn't be surprised if there's work we can do to get this number closer to our current levels.

Glenn, can you take a look and see what can be done?
Assignee: jmuizelaar → gwatson
Flags: needinfo?(gwatson)
Here's where time on the Renderer thread is being spent:
- 30% of the time is being spent in ANGLE, and 27% of the time in the nvidia driver (this goes up to 41% if you include all subtrees under the nvidia driver).
- The time spent in xul.dll only accounts for 6% of the Renderer time.
Depends on: 1474664

Comment 6

9 months ago
I haven't tested this on the target hardware yet (I will do so shortly), but on my Linux box, the youtube link runs at 30 fps for most of the video without WR, while it runs at a steady 60 fps the entire time with WR enabled.

Could it be something as simple as that? Would it be possible to measure the framerate with / without WR on the machines tested above, just to make sure we're doing a fair comparison?
Flags: needinfo?(gwatson)
See Also: → 1474532

Comment 7

9 months ago
Regardless of the frame rate question above, youtube.com does seem to be consuming more CPU time than most sites. On my profiling machine, it is ~3.2ms / frame in the backend, whereas nytimes.com is ~1.4ms / frame, and nytimes.com has far more primitives / vertices.

The primitive count on youtube is 653, which is quite low. There are 362 nodes in the CST, which is higher than ideal, but shouldn't be a massive issue. The reference frame count is 97 - this is *much* higher than most sites. I wouldn't expect this to cause a problem, but maybe there is something strange going on there.

The vertex count is ~12k - this seems high for this site, but it is much lower than nytimes.com, which handles ~30k vertices without a problem, so that seems unlikely to be the cause.

Still investigating...
The 60fps instead of 30fps rendering should be fixed by bug 1474532.

Comment 9

9 months ago
I was able to test on a Win10 + nVidia machine, and that does run at 60fps both with and without WR, so that's not the cause in this case (although it is interesting that non-WR runs at 30 fps on both Mac + Linux).

I can reproduce this, I think - but it's somewhat difficult to measure since the CPU usage jumps around so much depending on the video (and the difference doesn't seem to be as large on my test setup).

One interesting bit of data - the CPU backend and compositor times in WR are (almost exactly) constant throughout the video (1.2ms and 1.4ms, respectively). Does this imply that the CPU time variation is related to something happening outside WR? I'm not sure, but it seems possible.

Is the video decode path going to be the same between WR and non-WR? Is there a way to confirm this?
Huh, it occurs to me that in this case WR is actually doing some redundant work - the DL is not changing, just the contents of an external texture.

Thus, we should be able to detect this case and redraw the same built Frame as the previous frame. I think this should be quite easy to detect - I'll prototype this today and see whether (a) there are any gotchas I'm missing and (b) it measurably reduces the CPU time reported in Task Manager.
OK, this is slightly more involved than I thought, due to the way the texture cache update list is collected.

It is certainly feasible to make this case (no new DL, just external texture cache updates) handled significantly more efficiently in WR though.

This would help CPU usage in both the youtube and twitch cases. I'm not sure about the amazon.com case mentioned above - I couldn't repro the CPU usage difference there, and WR is doing almost no work on that page for me (it is idle except for the occasional banner animation that occurs once every few seconds).

Since this is more involved than originally thought, it becomes a question of priority. Do we want to implement this optimization ASAP or is it low priority compared to other correctness bugs?
Flags: needinfo?(jmuizelaar)
(In reply to Glenn Watson [:gw] from comment #9)
> I was able to test on a Win10 + nVidia machine, and that does run at 60fps
> both with and without WR, so that's not the cause in this case (although it
> is interesting that non-WR runs at 30 fps on both Mac + Linux).
> 
> I can reproduce this, I think - but it's somewhat difficult to measure since
> the CPU usage jumps around so much depending on the video (and the
> difference doesn't seem to be as large on my test setup).
> 
> One interesting bit of data - the CPU backend and compositor times in WR are
> (almost exactly) constant throughout the video (1.2ms and 1.4ms,
> respectively). Does this imply that the CPU time variation is related to
> something happening outside WR? I'm not sure, but it seems possible.
> 
> Is the video decode path going to be the same between WR and non-WR?

WR and non-WR use the same decoding path.

> Is there a way to confirm this?

:jya, do you know how to confirm it?
Flags: needinfo?(jyavenard)
(In reply to Sotaro Ikeda [:sotaro] from comment #12)

> > Is the video decode path going to be the same between WR and non-WR?
> 
> WR and non-WR use the same decoding path.
> 
> > Is there a way to confirm this?
> 
> :jya, do you know how to confirm it?

The easiest way to compare would be to install the media devtools: https://addons.mozilla.org/en-US/firefox/addon/devtools-media-panel/

Start playing a video, press Ctrl-Shift-I, go to the Media tab, and click on the URL shown.

There will be a line describing the decoder used.

Unfortunately, the about:support attachment here has been truncated, so I can't know for sure what's going on.

On YouTube, it is possible that in one case H264 (which is hardware accelerated) is used when webrender is off, but VP9 (software decoded on the nvidia 210) is used with webrender.

As the OP mentioned that the problem was also seen on other sites that use exclusively H264 (like Twitch), it could be that HW acceleration is used in one case but not the other.

I'll test that shortly.
Flags: needinfo?(jyavenard)
Flags: needinfo?(jyavenard)
I have an AMD Vega64 on this machine, with gfx.webrender.all set to true.

I don't see much difference in CPU usage between webrender on and off; it appears slightly higher with webrender on.
With webrender off, it oscillates between 1.9% and 3%;
with webrender on, it oscillates between 3% and 4.5%.

So yes, you could say that CPU usage is about 50% higher, but we're talking about a 1-2% difference in actual total CPU usage... do we care?
Flags: needinfo?(jyavenard)
in the test above, the video was hardware accelerated

this is what the media devtools show:
"Video Decoder(video/avc, 1920x1080 @ 60.00)":"wmf hardware video decoder - nv12 (remote)"
"Hardware Video Decoding":"enabled"
(In reply to Jean-Yves Avenard [:jya] from comment #14)
> I have a AMD Vega64 on this machine, setting gfx.webrender.all to true
> 
> I don't see much difference in CPU usage between webrender on and off.. It
> appears slightly higher with webrender on.
> With webrender off, it oscillates between 1.9% and 3%;
> with webrender on, it oscillates between 3% and 4.5%
> 
> so yes, you could say that CPU usage is about 50% more, but we're talking
> 1-2% difference in actual total CPU usage... do we care?

I think the answer to this ought to be 'yes'. That small increase may mean a big difference in battery life for long-lived browsing sessions on laptops, an area where Firefox could already use some improvement.
After watching the screen capture video, the difference is much more than 50%.

We see that when the video is just playing, once the page has been rendered, CPU usage is only 5-6%, while with webrender it's 50+%; if I didn't know about webrender, I would say that one is HW accelerated and the other is not.

Looking at the information displayed by the media devtools would confirm that theory.
(In reply to Glenn Watson [:gw] from comment #11)
> Since this is more involved than originally thought, it becomes a question
> of priority. Do we want to implement this optimization ASAP or is it low
> priority compared to other correctness bugs?

I think this work is pretty high priority. My feeling is that most of the correctness bugs are pretty rare and don't significantly impact user experience. This seems pretty noticeable to people.
Flags: needinfo?(jmuizelaar)
I can confirm that on my Win10 + nVidia test machine, I'm seeing the same hardware accelerated video decoder in both the non-WR and WR code paths, that is:

"Video Decoder(video/avc, 1920x1080 @ 60.00)":"wmf hardware video decoder - nv12 (remote)"
"Hardware Video Decoding":"enabled"

The way the YUV planes are being supplied to WR is via an ExternalImage::RawData callback. This means that WR issues a callback at frame render time, and the Gecko code supplies a pointer to the CPU-side YUV planes. WR then uploads the YUV planes from that pointer as a texture to the GPU each frame. (this is different from the WR native texture interface, which allows supplying a texture handle directly instead of a CPU texture upload).

I know very little about hardware media decoding - is this what would be happening in the non-WR path? Is it possible that in the non-WR path the decode is happening on the GPU and avoiding a readback / upload of the YUV planes? If the non-WR path doesn't do those extra copies, that could explain the CPU time difference?
Flags: needinfo?(jyavenard)
Flags: needinfo?(jmuizelaar)
My understanding is that the non-WR path should not be uploading from the CPU and should be using a texture directly from the decoder.
Flags: needinfo?(jmuizelaar)
Sotaro, do you know what might be going on here?
Flags: needinfo?(sotaro.ikeda.g)
FWIW, implementing the fast path mentioned above in WR is much easier if Gecko was supplying native texture handles, rather than using the CPU-side texture cache upload path it currently is. So if we're able to use native texture handles here, we could get a double CPU win from skipping the redundant WR work, in addition to skipping the readback / upload.
My comment in #19 is wrong - I had made a mistake in some debug logging code, and it does appear that the native texture handle code path is being hit, as we'd expect.

So, I'll try to prototype the fast path to skip the WR frame build in this case and see what difference it makes to the reported CPU usage (I suspect there may still be something else going on, as the WR CPU usage is quite low here anyway, but we can test this first and see).
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(jyavenard)
Note that for the purposes of the Shield Study we don't really care about CPU usage/battery life; it's not one of the things we're measuring and we have plans to address this eventually. Also per the release criteria, battery life is not a constraint for the V1 release since we're targeting desktop users. So this shouldn't need to block the Shield Study, although I'd like to leave it as a P1 blocker for wr-stage-nightly so that we make sure to mitigate this before we enable by default on Nightly.
I've reviewed the top site performance concerns with Andreas Bovens from Product. He has approved the experiment to go out with Nightly 63.

Please let me know if anything else is needed to move forward.
I've put a static version of this test case up at https://trusting-kirch-5ba4e0.netlify.com/big-buck-bunny.html, which should give more reliable CPU usage numbers. It's possible to get even more stable CPU usage numbers by only looking at the GPU process (which you can identify from about:support).

On the reference machine without WebRender I get about 2-3% usage in the GPU process, with WebRender I get 6-8% usage.
The basic idea to improve this is that if the only thing changing is an external texture, we should be able to just call wr.render(), without doing a frame build. This is because the render() step does a lock()/unlock() to get the current external texture each composite.
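The decision described above could be sketched roughly as follows. This is a minimal standalone sketch with invented names (Transaction, Work, classify) - not the real WebRender types - just to illustrate the early-out: only an external-texture update means the built frame is still valid, so a composite alone suffices.

```rust
// Hypothetical sketch: decide whether a transaction needs a full frame
// build or can go straight to render(). Names are invented for
// illustration; the real WebRender API differs.

#[derive(Default)]
struct Transaction {
    new_display_list: bool,       // a new DL was sent
    scene_property_updates: bool, // dynamic scene properties changed
    external_image_updates: bool, // e.g. a new video frame texture
}

#[derive(Debug, PartialEq)]
enum Work {
    BuildFrameAndRender, // rebuild the frame, then composite
    RenderOnly,          // composite only; lock()/unlock() picks up the texture
    Nothing,
}

fn classify(txn: &Transaction) -> Work {
    if txn.new_display_list || txn.scene_property_updates {
        Work::BuildFrameAndRender
    } else if txn.external_image_updates {
        // Only an external texture changed: the previously built frame
        // is still valid, so render() alone is enough.
        Work::RenderOnly
    } else {
        Work::Nothing
    }
}
```

For a playing video with an unchanged DL, every per-frame transaction would classify as RenderOnly, skipping the frame build that this bug measures.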

There's a couple of things causing a frame build to occur in this test case.

(1) There is a transaction being sent which sets/updates the dynamic scene properties. Currently, this message unconditionally triggers a frame build. We can fairly easily and cheaply compare the new scene properties to the previous scene properties inside WR, and only set the 'render' field on DocumentOps if they have changed. Alternatively, if the information is already available in Gecko to know this, it may be simpler / more efficient for Gecko to just not include these messages in the transaction?

(2) The code here https://github.com/servo/webrender/blob/2cb682553816200bb74ce75d3851753bc122f488/webrender/src/render_backend.rs#L1086 sets op.render = true if there is a generate frame message. I *think* this is probably wrong - I would think generate frame should only specify to do a composite. However, there's lots of changes since I looked at this code, and lots of subtleties here, so there might be something I'm missing?

Unfortunately, if I hack in those changes locally, although the frame build doesn't happen, the external video texture no longer updates correctly in Gecko. I hacked the WR yuv example to animate the texture, and it seems to work as expected in this case. I wonder if something in Gecko is deciding not to advance the texture frame in that case - I need to investigate further.

kats, any thoughts on (1) and (2) ?
Flags: needinfo?(bugmail)
I confirmed that disabling op.render from the generate frame message is affecting which textures Gecko provides as video frames. However, in the WR YUV example code, it does appear to work as expected.

I suspect that not doing the build frame may be working fine in WR itself, but somehow affecting whether Gecko provides a new video frame. sotaro, where would I look to see how / when the external image handler would be providing new video frames?
Flags: needinfo?(sotaro.ikeda.g)
I do not know exactly how :gw checked comment 28, but it seems that async scene building and IFrame usage for video are related. I locally modified the source and seemed to reproduce the problem.

In current gecko, video transactions are sent via the render backend thread. AsyncImagePipelineManager::ApplyAsyncImagesOfImageBridge() adds the video transaction tasks, and it is called by WebRenderBridgeParent::CompositeToTarget().
  https://hg.mozilla.org/mozilla-central/file/tip/gfx/layers/wr/AsyncImagePipelineManager.cpp#l271
  https://hg.mozilla.org/mozilla-central/file/tip/gfx/layers/wr/WebRenderBridgeParent.cpp#l1524

The SceneBuilderResult::Transaction for a video transaction is handled on the render backend, but it does not trigger a render operation. That seems to cause the video problem when op.render is disabled for the generate frame message.
Flags: needinfo?(sotaro.ikeda.g)
As in comment 29, the current gecko implementation expects doc.render() to update video, which is not good.

Bug 1476846 is about sending only the image key add/update of video with the generate frame message, since the gecko side expects to update video frames (external images) using the same ImageKey.
forward_transaction_to_scene_builder() did not request a scene build because of the following check, so async scene building for the video transaction did not trigger doc.render():

>        let build_scene: bool = document_ops.build
>            && self.pending.scene.root_pipeline_id.map(
>                |id| { self.pending.scene.pipelines.contains_key(&id) }
>            ).unwrap_or(false);
In gecko, the external image is bound to the video buffer, and we try to minimize ImageKey recreation by using TransactionBuilder::UpdateExternalImage().
(In reply to Sotaro Ikeda [:sotaro] from comment #32)
> In gecko, the external image is bound to the video buffer, and we try to
> minimize ImageKey recreation by using TransactionBuilder::UpdateExternalImage().

Hmm, it seems that WebRender does not implement a way to update an ImageKey to point at a different external image without a scene build.
When using a native external texture, there is no need to update the ImageKey if the contents of the image changes - WR will always invoke the external handler callback on each render() call.

So just calling render() and providing the new image data in the callback should be enough, and this will allow us to completely skip a scene and frame rebuild.

The only time you should need to update the ImageKey for an external native texture handle is if the size or format changes.
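The lock()/unlock() flow described above could be sketched like this. The trait and types here are simplified stand-ins (ExternalImageHandler, VideoSource, composite are invented for illustration, not the real webrender API): the point is that each composite re-queries the handler, so the decoder only has to swap the texture it hands back, and the ImageKey never changes.

```rust
// Hypothetical sketch of the external-image flow: render() locks each
// external image, reads the current texture, and unlocks it, so new
// video frames are picked up without any ImageKey update. Simplified;
// the real webrender trait has more parameters and richer return types.

trait ExternalImageHandler {
    /// Called by the renderer on each composite to get the current texture.
    fn lock(&mut self, key: u64) -> u32; // returns a native texture handle
    fn unlock(&mut self, key: u64);
}

struct VideoSource {
    current_texture: u32, // updated by the decoder as frames arrive
}

impl ExternalImageHandler for VideoSource {
    fn lock(&mut self, _key: u64) -> u32 {
        self.current_texture
    }
    fn unlock(&mut self, _key: u64) {}
}

/// One composite: lock, draw with the returned handle, unlock.
fn composite(handler: &mut dyn ExternalImageHandler, key: u64) -> u32 {
    let tex = handler.lock(key);
    // ... draw using `tex` here ...
    handler.unlock(key);
    tex
}
```

Under this model, a render()-only fast path still shows the newest video frame, because the texture swap happens entirely inside the handler.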
(In reply to Glenn Watson [:gw] from comment #34)
> When using a native external texture, there is no need to update the
> ImageKey if the contents of the image changes - WR will always invoke the
> external handler callback on each render() call.

Yes, that could be done for video. I am going to file a bug for it.

It was not done yet because it only improves the native external texture case, not ExternalImageSource::RawData. And I thought webrender would optimize the "update the ImageKey" case so as not to request doc.render(); if that were done, performance would also be optimized for ExternalImageSource::RawData.

In the current implementation, the external image is always bound to one buffer, so when the video buffer is updated, gecko updates the ImageKey. This avoids several problems like the following, though they could be mitigated even when using the same external image id for different video frames:

- WebRenderBridgeParent::CompositeToTarget() could enqueue several frame generations, and the frames will be generated on the render thread at different timings. There is then a risk of inconsistent frame generation if the same external image id is used for different video frames.
- Each video frame could have a totally different video buffer type/format/size.
Depends on: 1477608
(In reply to Glenn Watson [:gw] from comment #27)
> (1) There is a transaction being sent which sets/updates the dynamic scene
> properties. Currently, this message unconditionally triggers a frame build.
> We can fairly easily and cheaply compare the new scene properties to the
> previous scene properties inside WR, and only set the 'render' field on
> DocumentOps if they have changed. Alternatively, if the information is
> already available in Gecko to know this, it may be simpler / more efficient
> for Gecko to just not include these messages in the transaction?

I think it would be easier to do this on the WR side. And we should do this both for the dynamic properties and the scroll offsets, if possible. On the gecko side we will trigger these "GenerateFrame" transactions anytime we feel like we should recomposite stuff, which will include dynamic property changes and async scroll animations. However, some of these "changes" are going to be no-ops and in those cases we can avoid the render.
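The no-op detection suggested for (1) could look something like the following. This is a sketch with invented names (DynamicProperties, Document, set_properties are not the real WR types), assuming the document keeps its last-applied property values and compares incoming updates against them:

```rust
// Hypothetical no-op detection for dynamic property updates: only
// request a render if the incoming values actually differ from the
// previous ones. Names invented for illustration.

#[derive(Clone, PartialEq)]
struct DynamicProperties {
    transforms: Vec<(u64, [f32; 2])>, // (property binding id, translation)
    opacities: Vec<(u64, f32)>,       // (property binding id, opacity)
}

struct Document {
    current: DynamicProperties,
}

impl Document {
    /// Returns true if a render is needed (i.e. something changed).
    fn set_properties(&mut self, new: DynamicProperties) -> bool {
        if new == self.current {
            false // no-op update: skip the frame build and render
        } else {
            self.current = new;
            true
        }
    }
}
```

The same equality check would apply to scroll offsets: a ScrollNodeWithId that lands on the offset already in effect should not flip the render bit.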

> (2) The code here
> https://github.com/servo/webrender/blob/
> 2cb682553816200bb74ce75d3851753bc122f488/webrender/src/render_backend.
> rs#L1086 sets op.render = true if there is a generate frame message. I
> *think* this is probably wrong - I would think generate frame should only
> specify to do a composite. However, there's lots of changes since I looked
> at this code, and lots of subtleties here, so there might be something I'm
> missing?

Yeah I think this should also be doable, as long as we ensure we do still render with scroll offset changes, as mentioned above. Gecko will send ScrollNodeWithId commands for each scrollable frame with each GenerateFrame transaction. Some of these will have actual changes and others may not. Right now doc.render_on_scroll [1] will not be true with Gecko, so the DocumentOps returned from a ScrollNodeWithId will always have "true" for scroll and "false" for render. If we drop the op.render = true line you're referring to, I think async scroll animations will stop rendering their intermediate frames because of this.

[1] https://searchfox.org/mozilla-central/rev/ad36eff63e208b37bc9441b91b7cea7291d82890/gfx/webrender/src/render_backend.rs#673
Flags: needinfo?(bugmail)
Depends on: 1477970
Assignee: gwatson → kats
Assignee: kats → sotaro.ikeda.g
Sotaro, can you take this over now that kats is on parental leave?
Yes, I could take this.

https://github.com/servo/webrender/pull/2951 might affect this.
See Also: → 1482699
Depends on: 1478566
Depends on: 1483610
Depends on: 1473290
Bug 1473290 is about CPU usage during scroll, so it seems very different from video playback.
No longer depends on: 1473290
No longer depends on: 1483610
Duplicate of this bug: 1430451
Comment 41 (Assignee)

8 months ago
I'm stealing this bug from Sotaro while he is away.
Assignee: sotaro.ikeda.g → nical.bugzilla
Comment 42 (Assignee)

7 months ago
https://github.com/servo/webrender/pull/3043 adds some basic infrastructure for avoiding redundant CPU work, on top of which we can optimize.
Since we've already improved things compared to the shield study, this doesn't need to block nightly.
Blocks: stage-wr-trains
No longer blocks: stage-wr-nightly
No longer depends on: 1478566
Duplicate of this bug: 1478566
Comment 45 (Assignee)

7 months ago
The situation is different on Windows than on the other platforms, so I'll repurpose this bug for Windows specifically, since it's our initial target.

I did some profiling, and right now the CPU side of things is in pretty good shape thanks to Sotaro's work, which lets us skip frame building altogether when the only thing that changes is a video frame that is already on the GPU (which is the case for youtube on windows).
If there's any improvement left to make, it would be on the GPU side (the CPU parts of webrender are pretty much idle the whole time). These improvements could come from Glenn's caching work or from direct composition, but that's not part of the MVP.
Status: NEW → RESOLVED
Last Resolved: 7 months ago
Resolution: --- → FIXED
Summary: [WebRender Shield Study] Higher CPU usage with WebRender enabled on YouTube → [WebRender Shield Study] Higher CPU usage with WebRender enabled on YouTube (Windows)