FB Paper website maxes out CPU

RESOLVED INCOMPLETE

Status

Core
Audio/Video
RESOLVED INCOMPLETE
5 years ago
4 years ago

People

(Reporter: atopal, Unassigned)

Tracking

28 Branch
x86
All
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
This website maxes out my CPU cores: https://www.facebook.com/paper

It does that for a few seconds on Chrome too, but then CPU usage on Chrome drops to a few percent, while Firefox keeps many cores at 100%. Something is off.

Comment 1

5 years ago
Created attachment 8367878 [details]
fb-paper-profile

Yeah, this isn't right. Wonder what's going on though - seems we spend forever in the event loop itself. :-\

Comment 2

5 years ago
FWIW, this reproduces on 27 and nightly (29) as well. IRC suggested needinfo'ing bz, which I'm duly doing... this profile is non-enlightening to me, as noted in comment #1. Looking at it from the browser toolbox's profiler, seems we just spend forever in the event loop.
Flags: needinfo?(bzbarsky)
Summary: Website maxes out CPU → FB Paper website maxes out CPU

Comment 3

In a profile I got from Instruments, it looks like most of the CPU is spent on threads other than the main thread. I saw VP8 decoding, allocating and deallocating texture memory, memcpy'ing video data, and what seemed like continuous launching and destroying of decoder threads, among other things.

Comment 4

So for me I see us using 100% of one CPU core on this site (on Mac).

A profile I took over 13s of wall-clock time shows, if I'm not confused, about 25-30ms used per thread for 536 different threads (!).  At the same time, activity monitor shows the number of threads in our process oscillating between 55 and 57 total.  So it sounds like we're spinning up threads, doing a tiny little bit of work on them, then tearing them down?

All the threads involved are running mozilla::MediaDecoderStateMachine::DecodeThreadRun(); most of the actual time is spent under vp8_decode, with a bit (10%) spent under the memmove under mozilla::VideoData::Create.  Those two functions account for 90% of the time between the two of them.

Someone familiar with our video code should look at this....
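
The figures above can be sanity-checked with a little arithmetic. The following is only a sketch: the function names are made up for illustration, and it assumes the 25-30ms figure is CPU time per thread and that the threads did not overlap much.

```cpp
#include <cassert>
#include <cstdio>

// CPU time, in seconds, consumed by `threads` short-lived threads that each
// run for `ms_per_thread` milliseconds.
double total_cpu_seconds(int threads, double ms_per_thread) {
    assert(threads >= 0 && ms_per_thread >= 0.0);
    return threads * ms_per_thread / 1000.0;
}

// Average number of cores kept busy over a wall-clock window.
double average_cores_busy(double cpu_seconds, double wall_seconds) {
    assert(wall_seconds > 0.0);
    return cpu_seconds / wall_seconds;
}

// Prints the low and high estimates for the figures quoted in the comment.
void print_estimate() {
    const int threads = 536;           // distinct threads seen in the profile
    const double wall_seconds = 13.0;  // length of the profile
    const double low = total_cpu_seconds(threads, 25.0);
    const double high = total_cpu_seconds(threads, 30.0);
    std::printf("CPU time: %.1f to %.1f s, cores busy: %.2f to %.2f\n",
                low, high,
                average_cores_busy(low, wall_seconds),
                average_cores_busy(high, wall_seconds));
}
```

That works out to roughly 13.4 to 16.1 seconds of CPU time over 13 seconds of wall-clock time, which is in line with one core being pegged.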
Component: General → Video/Audio
Flags: needinfo?(bzbarsky)
Product: Firefox → Core

Comment 5

You probably don't see this pain in Chrome because I bet it's using H.264, which may have hardware-accelerated decoding support on Mac.

(In reply to Boris Zbarsky [:bz] from comment #4)
> So it sounds like we're spinning up
> threads, doing a tiny little bit of work on them, then tearing them down?

This can happen. Coincidentally, right now I'm working on a patch to use an nsIThreadPool for the decoding threads instead, so we shouldn't be doing so much thread teardown/creation in future.
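
To illustrate the general idea of reusing threads, here is a minimal standard C++ sketch. It is not nsIThreadPool and not the patch itself; `DecodePool`, `PoolRun` and `RunOnPool` are hypothetical names invented for the example. A fixed set of workers pulls tasks from a shared queue, so dispatching hundreds of small jobs never creates more threads than the pool size.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: workers are created once and reused for
// every task, so no thread is spawned or torn down per unit of work.
class DecodePool {
public:
    explicit DecodePool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i) {
            threads_.emplace_back([this] { Run(); });
        }
    }

    // Lets the workers drain the queue, then joins them.
    ~DecodePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    DecodePool(const DecodePool&) = delete;
    DecodePool& operator=(const DecodePool&) = delete;

    // Queues a task and wakes one idle worker.
    void Dispatch(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // only reached when stopping
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so workers do not serialize
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;  // declared last: starts after the rest
};

struct PoolRun {
    std::size_t completed;
    std::size_t distinct_threads;
};

// Runs `tasks` trivial jobs on `workers` threads and reports how many jobs
// finished and how many distinct threads executed them.
PoolRun RunOnPool(std::size_t workers, std::size_t tasks) {
    std::mutex m;
    std::set<std::thread::id> ids;
    std::size_t completed = 0;
    {
        DecodePool pool(workers);
        for (std::size_t i = 0; i < tasks; ++i) {
            pool.Dispatch([&] {
                std::lock_guard<std::mutex> lock(m);
                ids.insert(std::this_thread::get_id());
                ++completed;
            });
        }
    }  // ~DecodePool drains the queue and joins the workers
    return {completed, ids.size()};
}
```

Calling `RunOnPool(4, 536)` completes all 536 jobs on at most 4 threads, in contrast to the 536 short-lived threads seen in the profile.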

To avoid the memcpy we'd need to patch libvpx so that it accepts an allocator supplying the buffers into which it writes decoded frames. vp8_decode being slow is also an issue with libvpx.
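
As a rough illustration of the allocator idea, the sketch below contrasts the two paths. `ToyDecoder` and its methods are hypothetical stand-ins, not the libvpx API, and the "decoding" is just a memset placeholder.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for a decoder; this is not the libvpx API.
class ToyDecoder {
public:
    explicit ToyDecoder(std::size_t frame_bytes) : internal_(frame_bytes) {}

    // Copying path: decode into a decoder-owned buffer, then copy the result
    // into the caller's frame (the kind of copy seen in the profile).
    void DecodeThenCopy(std::uint8_t fill, std::uint8_t* dest) {
        Fill(internal_.data(), fill);
        std::memcpy(dest, internal_.data(), internal_.size());
        ++copies_;
    }

    // Allocator path: the caller supplies the destination up front, so the
    // decoder writes the frame in place and no second copy is needed.
    using Allocator = std::function<std::uint8_t*(std::size_t)>;
    std::uint8_t* DecodeInto(std::uint8_t fill, const Allocator& alloc) {
        std::uint8_t* dest = alloc(internal_.size());
        if (dest != nullptr) Fill(dest, fill);
        return dest;
    }

    // Number of extra frame copies performed so far.
    std::size_t copies() const { return copies_; }

private:
    // Placeholder for real decoding: writes one byte value across the frame.
    void Fill(std::uint8_t* out, std::uint8_t value) {
        std::memset(out, value, internal_.size());
    }

    std::vector<std::uint8_t> internal_;
    std::size_t copies_ = 0;
};
```

With the allocator path, the caller can hand over the memory backing its own frame object, and `copies()` stays unchanged after `DecodeInto`.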
Depends on: 968016

Comment 6

4 years ago
The website has changed slightly, but Nightly is a lot better here now that bug 968016 is fixed. Is there more to gain here?
(Reporter)

Comment 7

4 years ago
Tried it in Chrome and I'm not seeing any spikes there anymore, so I'll assume that the website has changed in a significant way. It's working fine in Firefox now, but I'm not sure the underlying issue has been fixed, because the website has obviously changed. Since we don't have a reduced testcase, I'm not sure what the resolution of this bug should be. Fixed? WORKSFORME?

Updated

4 years ago
OS: Mac OS X → All

Comment 8

4 years ago
(In reply to Kadir Topal [:atopal] from comment #7)
> I'm not sure what the resolution of this bug should be. Fixed?
> WORKSFORME?

Perhaps incomplete?

I was going to attach a profile of a few seconds, but it's too big.
Flags: needinfo?(a.topal)
(Reporter)

Updated

4 years ago
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Flags: needinfo?(a.topal)
Resolution: --- → INCOMPLETE