Open Bug 1367352 Opened 7 years ago Updated 2 years ago

[Input Latency] Firefox is 63% (35 ms) slower than Chrome in case Facebook - click right arrow on photo viewer

Categories

(Core :: Graphics, defect, P3)


Tracking

()

Performance Impact low

People

(Reporter: mlien, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf, Whiteboard: [QRC][QRC_Analyzed][gfx-noted])

User Story

STR:
0. Logged in with Whitehat account, go to Linda's photo album
1. Click right-arrow button on photo to next photo

Gecko Profiles:
1. https://perfht.ml/2qK9YKr

Short Gecko Profile (covers only the Input Latency action):
https://perfht.ml/2rixeQ0

Reports: https://goo.gl/DjIwNb
Notes: https://docs.google.com/spreadsheets/d/1MsTK1FW88wuLd25A18HG2KqpjYJnvrxGM9a6MeZp35w/edit#gid=256706414
      No description provided.
Whiteboard: [qf] → [qf][QRC_NeedAnalysis]
User Story: (updated)
Ready for QRC profile analysis but not yet ready for QF triage.
Whiteboard: [qf][QRC_NeedAnalysis] → [QRC][QRC_NeedAnalysis]
Summary: [Input Latency] Facebook - click right arrow on photo viewer → [Input Latency] Firefox is 63% (35 ms) slower than Chrome in case Facebook - click right arrow on photo viewer
Scott, could you help profile this bug? Thanks.
Assignee: nobody → scwwu
Flags: needinfo?(scwwu)
Hello Bas, in one of the profiles there's a spot where 200 ms was spent on rasterizing. That looks unusual to me and could have slowed the page down. Do you know whether it's a problem? Thanks!

https://perf-html.io/public/510d5f01c33c9b8c9db71723674a60268f6b1e96/calltree/?hiddenThreads=&range=4.2482_4.9018&thread=5&threadOrder=0-2-3-5-1-4
Flags: needinfo?(scwwu) → needinfo?(bas)
Markus, this last profile confuses me: there seem to be no samples for the content thread for about 150 ms during this long rasterize. What's up with that? Have you ever seen this before? Other threads are being sampled... Also note that the last sample before the gap is deep inside the Intel drivers.
Flags: needinfo?(bas) → needinfo?(mstange)
I've only ever seen that on the reference hardware, which only has two cores. Our hypothesis is that having one sampler thread per process causes the processes to starve each other for CPU resources (and that's why we need bug 1379286).

Gaps between the samples mean that the sampler thread didn't run at all during that time. And if that thread didn't get a chance to run, chances are that no thread inside that process had a chance to run during that window of time.
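To make the gap reasoning concrete, here is a minimal sketch of how such gaps could be spotted in a list of sample timestamps. It does not use any Gecko Profiler API: the function name `find_sample_gaps`, the 1 ms sampling interval, the threshold factor, and the example timestamps are all hypothetical and chosen only for illustration.

```python
def find_sample_gaps(timestamps_ms, interval_ms=1.0, factor=10.0):
    """Return (start, end, duration) for every pause between consecutive
    samples that is longer than factor * interval_ms.

    interval_ms and factor are assumed values, not profiler defaults.
    """
    gaps = []
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        delta = cur - prev
        if delta > factor * interval_ms:
            gaps.append((prev, cur, delta))
    return gaps


# Made-up timestamps: steady 1 ms sampling with one long pause in the middle.
samples = [0.0, 1.0, 2.0, 3.0, 153.0, 154.0, 155.0]
for start, end, duration in find_sample_gaps(samples):
    print(f"no samples between {start} ms and {end} ms ({duration} ms)")
```

A pause like this only says that the sampler thread did not run; it does not by itself say what the sampled thread was doing during that window.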
Flags: needinfo?(mstange)
Cervantes, could you help see whether we could do something with the xPerf tool? Thanks.
Flags: needinfo?(cyu)
I doubt that xPerf can provide useful information for this use case.

63% slower sounds bad, but the absolute difference is 35 ms, which is about 2 frames. This difference is hardly perceptible to most users. The real difference can also be hidden in the fluctuation among individual runs, which can easily lead us in the wrong direction. I think we'd better focus on use cases that have a larger absolute difference and lower the priority of this use case.
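As a rough check of the arithmetic above, the sketch below converts the 35 ms difference into frames and derives the latencies implied by the two figures in the summary. It assumes a 60 Hz display and that "63% slower" means (Firefox - Chrome) / Chrome = 0.63; neither assumption is stated in this bug, and the helper names are hypothetical.

```python
def frames_for(delta_ms, refresh_hz=60.0):
    """Number of display frames covered by delta_ms at the given refresh rate."""
    return delta_ms * refresh_hz / 1000.0


def implied_latencies(delta_ms, relative_slowdown):
    """Derive (chrome_ms, firefox_ms), assuming the slowdown is relative to Chrome."""
    chrome_ms = delta_ms / relative_slowdown
    return chrome_ms, chrome_ms + delta_ms


# 35 ms at an assumed 60 Hz is a little over 2 frames.
print(f"{frames_for(35.0):.1f} frames")

# Under the stated assumption, roughly 56 ms for Chrome and 91 ms for Firefox.
chrome_ms, firefox_ms = implied_latencies(35.0, 0.63)
print(f"Chrome ~{chrome_ms:.0f} ms, Firefox ~{firefox_ms:.0f} ms")
```

These derived latencies are only an estimate from the summary; the measured values are in the linked reports.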
Flags: needinfo?(cyu)
Whiteboard: [qf][QRC][QRC_Analyzed] → [qf:p2][QRC][QRC_Analyzed]
Keywords: perf
Comment 7 sounds quite a bit like bug 1408699.
Component: General → Graphics
Unassigning myself as I'm not involved in the analysis process anymore.
Assignee: scwwu → nobody
Whiteboard: [qf:p2][QRC][QRC_Analyzed] → [qf:p3][QRC][QRC_Analyzed]
Priority: -- → P3
Whiteboard: [qf:p3][QRC][QRC_Analyzed] → [qf:p3][QRC][QRC_Analyzed][gfx-noted]
Performance Impact: --- → P3
Whiteboard: [qf:p3][QRC][QRC_Analyzed][gfx-noted] → [QRC][QRC_Analyzed][gfx-noted]
Severity: normal → S3