Closed Bug 965761 Opened 10 years ago Closed 10 years ago

FB Paper website maxes out CPU

Categories

(Core :: Audio/Video, defect)

28 Branch
x86
All
defect
Not set
normal

RESOLVED INCOMPLETE

People

(Reporter: atopal, Unassigned)

Attachments

(1 file)

This website maxes out my CPU cores: https://www.facebook.com/paper

It does that for a few seconds on Chrome too, but then CPU usage on Chrome goes down to a few % while it still utilizes many cores at 100% with Firefox. Something is off.
Attached file fb-paper-profile
Yeah, this isn't right. Wonder what's going on though - seems we spend forever in the event loop itself. :-\
FWIW, this reproduces on 27 and nightly (29) as well. IRC suggested needinfo'ing bz, which I'm duly doing... this profile is non-enlightening to me, as noted in comment #1. Looking at it from the browser toolbox's profiler, seems we just spend forever in the event loop.
Flags: needinfo?(bzbarsky)
Summary: Website maxes out CPU → FB Paper website maxes out CPU
In a profile I got from Instruments, it looks like most of the CPU is spent doing stuff on other threads than the main thread. I saw VP8 decoding, allocating and deallocating texture memory, memcpy'ing video data and what seemed like continuous launching and destroying of decoder threads, among other things.
So for me I see us using 100% of one CPU core on this site (on Mac).

A profile I took over 13s of wall-clock time shows, if I'm not confused, about 25-30ms of CPU time used per thread across 536 different threads (!). At the same time, Activity Monitor shows the number of threads in our process oscillating between 55 and 57 total. So it sounds like we're spinning up threads, doing a tiny little bit of work on them, then tearing them down?
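As a rough sanity check, the figures above can be multiplied out to see whether they match the "100% of one core" observation. This is only back-of-the-envelope arithmetic on the numbers quoted in this comment; the variable names are made up for illustration.

```python
# Figures quoted in the comment above; the per-thread range is approximate.
WALL_CLOCK_S = 13.0
THREAD_COUNT = 536
PER_THREAD_CPU_S = (0.025, 0.030)  # 25-30 ms of CPU time per thread


def cores_busy(per_thread_cpu_s: float) -> float:
    """Average number of cores kept busy by the short-lived threads."""
    return THREAD_COUNT * per_thread_cpu_s / WALL_CLOCK_S


low, high = (cores_busy(t) for t in PER_THREAD_CPU_S)
threads_per_second = THREAD_COUNT / WALL_CLOCK_S

print(f"{low:.2f}-{high:.2f} cores busy, about {threads_per_second:.0f} threads/s")
# prints: 1.03-1.24 cores busy, about 41 threads/s
```

So the short-lived threads alone would account for roughly one full core, with something like 41 threads started per second while the live thread count stays nearly flat.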

All the threads involved are running mozilla::MediaDecoderStateMachine::DecodeThreadRun(); most of the actual time is spent under vp8_decode, with a bit (10%) spent under the memmove under mozilla::VideoData::Create.  Those two functions account for 90% of the time between the two of them.

Someone familiar with our video code should look at this....
Component: General → Video/Audio
Flags: needinfo?(bzbarsky)
Product: Firefox → Core
I bet you don't see this pain in Chrome because it's using H.264, which may have hardware-accelerated decoding support on Mac.

(In reply to Boris Zbarsky [:bz] from comment #4)
> So it sounds like we're spinning up
> threads, doing a tiny little bit of work on them, then tearing them down?

This can happen. Coincidentally, right now I'm working on a patch to use an nsIThreadPool for decoding threads instead, so we shouldn't be doing the thread teardown/creation so much in the future.
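To illustrate the difference between spinning up a thread per unit of work and reusing pooled threads, here is a minimal sketch in Python. It is not Gecko code and does not use nsIThreadPool; `concurrent.futures.ThreadPoolExecutor` stands in as an analogue, and `decode_chunk` is a hypothetical placeholder for a small piece of decode work.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def decode_chunk(chunk_id: int) -> str:
    """Hypothetical stand-in for a small unit of decode work.

    Returns the name of the thread that ran it so callers can count threads.
    """
    return threading.current_thread().name


def thread_per_chunk(n_chunks: int) -> set:
    """Create and tear down one thread per chunk (the churn pattern)."""
    names = []
    for i in range(n_chunks):
        t = threading.Thread(target=lambda i=i: names.append(decode_chunk(i)))
        t.start()
        t.join()
    return set(names)


def pooled(n_chunks: int, workers: int) -> set:
    """Run the same chunks on a fixed pool of reusable worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return set(pool.map(decode_chunk, range(n_chunks)))


if __name__ == "__main__":
    print(len(thread_per_chunk(50)), "threads created without a pool")
    print(len(pooled(50, 4)), "threads used with a pool of at most 4")
```

With the pool, the number of distinct threads is bounded by the worker count no matter how many chunks are processed, which is the property that should cut down on thread creation and teardown.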

To avoid the memcpy we'd need to patch libvpx to accept an allocator for the buffers into which it writes decoded frames. vp8_decode being slow is also an issue with libvpx.
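The allocator idea can be sketched as follows. This is a toy model in Python, not the libvpx API: `fake_decode`, `CopyingDecoder`, `AllocatorDecoder` and `FRAME_SIZE` are all hypothetical names, and the "decoding" is a meaningless byte transform. It only shows where the extra copy comes from and how a caller-supplied allocator removes it.

```python
from typing import Callable

FRAME_SIZE = 16  # hypothetical, tiny "frame" for illustration


def fake_decode(packet: bytes, out: memoryview) -> None:
    """Hypothetical decoder core: writes decoded bytes straight into `out`."""
    for i in range(len(out)):
        out[i] = packet[i % len(packet)] ^ 0xFF


class CopyingDecoder:
    """Decoder owns its frame buffer, so the caller has to copy the frame out."""

    def __init__(self) -> None:
        self._internal = bytearray(FRAME_SIZE)

    def decode(self, packet: bytes) -> bytes:
        fake_decode(packet, memoryview(self._internal))
        return bytes(self._internal)  # the extra copy per frame


class AllocatorDecoder:
    """Decoder asks the caller for memory and decodes directly into it."""

    def __init__(self, alloc: Callable[[int], bytearray]) -> None:
        self._alloc = alloc

    def decode(self, packet: bytes) -> bytearray:
        frame = self._alloc(FRAME_SIZE)
        fake_decode(packet, memoryview(frame))
        return frame  # same object the caller allocated, no copy
```

In the second variant the frame ends up in memory the caller already controls (for example a buffer it intends to hand to the compositor), so no per-frame copy is needed afterwards.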
The website has changed slightly, but Nightly is a lot better here now that bug 968016 is fixed. Is there more to gain here?
Tried it in Chrome and I'm not seeing any spikes there anymore, so I'll assume that the website has changed in a significant way. It's working fine in Firefox now, but I'm not sure the underlying issue has been fixed, because the website has obviously changed. Since we don't have a reduced testcase, I'm not sure what the resolution of this bug should be. Fixed? WORKSFORME?
OS: Mac OS X → All
(In reply to Kadir Topal [:atopal] from comment #7)
> I'm not sure what the resolution of this bug should be. Fixed?
> WORKSFORME?

Perhaps incomplete?

I was going to attach a profile of a few seconds, but it's too big.
Flags: needinfo?(a.topal)
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(a.topal)
Resolution: --- → INCOMPLETE