Open Bug 1496096 Opened 7 years ago Updated 4 months ago

Horrible performance on runningworld due to media decoding

Categories

(Core :: Audio/Video: Playback, defect, P3)

Tracking

Performance Impact medium
Tracking Status
firefox64 --- affected

People

(Reporter: jesup, Unassigned)

References

(Depends on 2 open bugs)

Details

(Keywords: perf:resource-use)

Attachments

(2 files)

Loading this page (on any platform) pegs the CPU: 97% on a Win10 mobile Xeon P50 (dual GPU, Intel P530 + NVIDIA Quadro M2000M), 58% on a dual-Xeon Linux desktop, and very bad on Mac too (no % grabbed). All the CPU time after load appears to be in media decoders (apparently ~25 of them) and related things; the main thread was 99% idle.

Profile at the end of main load (lots of Ion compiling happening that you can't see): https://perfht.ml/2xZ2era
perf profile at "idle": https://perfht.ml/2xRmc7l

Chrome appears to be using very little CPU on this page on Linux.
Rank: 15
Priority: -- → P2
Flags: needinfo?(jyavenard)
Jean-Yves, do you know who should take a look at this?
Additional perf traces, including loading the page, showing massive contention on the memory lock: https://perfht.ml/2y1jf45
Ouch (though this is 35% of time not-waiting): https://perfht.ml/2xWgFw3 and https://perfht.ml/2xWgPUb

Besides not software-decoding (etc.) everything, there are quite a few other issues shown in the traces.
Gecko profile with symbols. This section shows ~29% in lll_lock_wait, 6% directly in pthread_mutex_lock, and 5% in lll_unlock_wake, and that's just the main thread: https://perfht.ml/2Qw2vce
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Depends on: 1496554
Memory mutex contention in the content process, before (above) and after (below) switching to per-thread arenas for MediaDecoder's SharedThreadPool
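To illustrate why per-thread arenas reduce this kind of contention, here is a toy model only; it is not how mozjemalloc or SharedThreadPool is implemented. The class names, the `run` helper, and the thread and allocation counts are all hypothetical. It counts how often a lock shared between threads is taken when every allocation goes through one arena, versus when each thread gets a private arena and only touches a shared lock once, at arena creation.

```python
import threading


class SharedArena:
    """Toy allocator: every allocation from every thread takes the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.lock_acquisitions = 0

    def allocate(self, size):
        with self._lock:
            self.lock_acquisitions += 1
        return bytearray(size)


class PerThreadArenas:
    """Toy allocator: a thread takes the shared lock once, to set up its arena."""

    def __init__(self):
        self._registry_lock = threading.Lock()
        self._local = threading.local()
        self.lock_acquisitions = 0

    def allocate(self, size):
        if not hasattr(self._local, "allocations"):
            # One-time setup of this thread's private arena (hypothetical step).
            with self._registry_lock:
                self.lock_acquisitions += 1
            self._local.allocations = 0
        # Fast path: only thread-local state is touched, no shared lock.
        self._local.allocations += 1
        return bytearray(size)


def run(allocator, threads=4, allocations_per_thread=1000):
    """Hammer the allocator from several threads; return shared-lock acquisitions."""

    def worker():
        for _ in range(allocations_per_thread):
            allocator.allocate(64)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return allocator.lock_acquisitions


shared_count = run(SharedArena())
per_thread_count = run(PerThreadArenas())
print(shared_count, per_thread_count)
```

In this model the shared arena takes the common lock once per allocation, while the per-thread variant takes it once per thread. A real allocator has further costs this sketch ignores, such as cross-thread frees and extra memory use per arena.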
Depends on: 1497569
Whiteboard: [qf] → [qf:p1:67]
See Also: → 1481967
For reference: this page has 25 videos (in Firefox at least). The smallest of them appears to be 1920x1238; the larger ones are 3400x2266(!) @ 30fps. This kills a machine when decoded in software. Since we limit the number of simultaneous DXVA-decoded videos to 8, most (all?) of them end up software-decoded, even on Win10 with a capable GPU (hence the crazy memory allocator traffic). However, letting them all decode in HW is also horrible, just in different ways - profiles show it sitting in

Note that Chrome does not appear to be actively decoding all these videos; they're all hidden. When you click on the slideshow it unhides one of them at a time; this would be consistent with them all being HD or HD+.
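As a rough back-of-the-envelope check of the figures in the comment above, the following sketch works out the decoder split and the pixel rate of one large video. The 25 videos, the limit of 8, and 3400x2266 @ 30fps come from the comment; the `Video` class, the `split_decoders` helper, and the 1920x1080 @ 30fps baseline are assumptions added for illustration. The split is a best case that assumes every hardware slot is actually used, whereas the comment suggests possibly none are.

```python
from dataclasses import dataclass

HW_DECODER_LIMIT = 8  # from the comment: max simultaneous DXVA-decoded videos
VIDEO_COUNT = 25      # from the comment: videos on the page


@dataclass(frozen=True)
class Video:
    width: int
    height: int
    fps: int

    def pixels_per_second(self) -> int:
        """Raw decoded pixel rate, ignoring codec and color format."""
        return self.width * self.height * self.fps


def split_decoders(video_count: int, hw_limit: int) -> tuple[int, int]:
    """Best case: fill every hardware slot, the rest fall back to software."""
    hardware = min(video_count, hw_limit)
    return hardware, video_count - hardware


large = Video(3400, 2266, 30)    # the larger videos described in the comment
full_hd = Video(1920, 1080, 30)  # assumed baseline for comparison

hardware, software = split_decoders(VIDEO_COUNT, HW_DECODER_LIMIT)
ratio = large.pixels_per_second() / full_hd.pixels_per_second()

print(f"hardware: {hardware}, software: {software}")
print(f"one large video: {large.pixels_per_second():,} pixels/s")
print(f"relative to 1080p30: {ratio:.2f}x")
```

Even in the best case, 17 of the 25 videos are left to software decoding, and each large one carries roughly 3.7 times the pixel rate of a 1080p30 stream.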
Flags: needinfo?(jyavenard)

I'm still seeing this issue on my MacBook Pro, 2017.

https://share.firefox.dev/3noQNlj

I do see a number of active media decoder threads.

The scrolling performance in particular is noticeably better in Chrome and in Safari on the site.

Whiteboard: [qf:p1:67] → [qf:p2:resource]
Assignee: rjesup → nobody
Status: ASSIGNED → NEW
Performance Impact: --- → P2
Whiteboard: [qf:p2:resource]
Severity: normal → S3
Priority: P2 → P3