Closed Bug 1371668 Opened 4 years ago Closed 3 years ago

Gmail: access YouTube link


(Core :: Graphics, defect, P2)

55 Branch
Windows 10



Tracking Status
firefox62 --- fixed


(Reporter: afilip, Assigned: mattwoodrow)


(Depends on 5 open bugs, Blocks 3 open bugs, )


(Keywords: perf, Whiteboard: [qf:p2][QRC][QRC_Analyzed][gfx-noted])


(2 files)

2.Login into
3.Start Gecko profiler
4.Access link provided by Benjamin Smedberg
5.Share results

Gecko profile:
Summary: Access YouTube link from Gmail → Gmail: access YouTube link
Mats, could you help to have first profiling?
Assignee: nobody → mats
Flags: needinfo?(mats)
Depends on: 1377257
Depends on: 1377316
Depends on: 1377329
Depends on: 1376063
Flags: needinfo?(mats)
Depends on: 1377352
I filed bugs on a few minor things that I think can be improved.
The rest is mostly JS and display list stuff, which dominates the time.
Someone else should look at those areas.
Assignee: mats → nobody
Whiteboard: [QRC][QRC_NeedAnalysis] → [qf][QRC][QRC_Analyzed]
We seem to be waiting on a lock periodically as well, TextureClient::TryReadLock.

Do you see anything actionable in this profile, Bas?
Flags: needinfo?(bas)
Whiteboard: [qf][QRC][QRC_Analyzed] → [qf:p1][QRC][QRC_Analyzed]
Hrm, it's odd that it's waiting there while the compositor doesn't seem to be doing anything (all in all the numbers aren't super worrying or surprising, but I'd like to understand what's happening). What do you think is going on there Matt?
Flags: needinfo?(bas) → needinfo?(matt.woodrow)
Attached image compositor-timings.jpeg
I've attached an image of a small subset of the profile which I think shows the problem here.

The image shows 4 paints, and 4 composites, though the first composite isn't overly relevant (since it's compositing the paint from one frame earlier), and the last paint's composite isn't pictured.

We're double buffered, so the first paint takes the lock on buffer 'A', and then the second paint takes the lock on buffer 'B'.

Composite 2 is what actually composites buffer 'A', but doesn't unlock it yet (since the GPU is async, and might still be reading from buffer 'A' for a while longer).

Composite 3 composites buffer 'B', and then unlocks buffer 'A'.

Paint 3 is trying to also take the lock on buffer 'A', but composite 3 hasn't happened yet, so it blocks (highlighted stack). We can see that as soon as composite 3 happens, we immediately unblock the main thread, and get back to what we're supposed to be doing.

So it seems that problem here is that the compositor and main thread are out of sync, and we end up blocking the main thread waiting for the compositor thread to actually schedule it's composite.

We can see composite 1 runs in parallel with paint 1 (as we'd want), but after that there's a big gap, and from there composites run significantly out of step with the main-thread, and we end up blocking.

Unfortunately vsync markers aren't present on this profile, so it's hard to see where we're supposed to be, but I believe we only ever composite from vsync, so it's probably the start of the composite markers.

Painting however is happening from 3 different callstacks. We get calls from NormalPriorityNotify and NotifyVsync, both which seem to be vsync related, but I'm not sure why they're different. The last caller is from FinishWaitingForTransaction, which definitely isn't vsync aligned.

The idea is that if compositing is lagging behind, then we skip painting on vsync to wait for it, and instead schedule a paint immediately when the compositor catches up.

I'm not actually sure why we're triggering this (since composites are all fast), but I guess it's timing related and once we get into this mode we end up blocking the event queue for the catch up paint, and then vsync events come in late too and we just don't really recover.

All a bit handwavy, but I think the idea of scheduling these catch-up paints immediately just isn't working as expected here, and we're instead sending paints at times where the compositor won't do anything with them for a while.

We should probably change the behaviour of FinishWaitingForTransaction to just mark itself as unblocked, and wait for the next vsync paint instead.
Flags: needinfo?(matt.woodrow)
Updating the Gecko Profile based on the latest benchmark measurements.

Name: Firefox
Version: 57.0a1
Windows 10 64 bit
buildID: 20170910220126

Gecko Profile:
This won't be addressed in 57; changing it from qf:p1 to qf:p2 for post 57 work.
Whiteboard: [qf:p1][QRC][QRC_Analyzed] → [qf:p2][QRC][QRC_Analyzed]
Keywords: perf
Should we move this to Core:Graphics?
Flags: needinfo?(bas)
Matt, is this still a problem? If it is that seems like a fairly bad situation to be in, since to our telemetry and all such factors it would appear like things are snappy, when actually we're doing pretty poorly.
Flags: needinfo?(bas) → needinfo?(matt.woodrow)
I'm not aware of the code being changed here, so if my analysis of the issue was correct, then yes it's still a problem.

I'm super busy with retained-dl at the moment, so won't be able to get to this in the near future.

Do you want to work on this? I think the initial patch should be easy, but I'm worried about test failures and other fallout that might be more work.
Flags: needinfo?(matt.woodrow) → needinfo?(bas)
Milan will find an owner here, who can take Matt's patch and hopefully guide it into the tree.
Flags: needinfo?(bas) → needinfo?(milan)
(In reply to Matt Woodrow (:mattwoodrow) from comment #5)
>  We get calls from
> NormalPriorityNotify and NotifyVsync, both which seem to be vsync related,
> but I'm not sure why they're different.
We fallback to use NormalPriorityNotify if processing RefreshDriver tick takes too much time, say reflow is too slow or animation frame callbacks take too much time. Otherwise we'd be processing pretty much only RefreshDriver ticks and nothing else.

I don't quite understand STR here. One logs into gmail and then what? Looks for some email with a link to youtube?
Could use new profiling (I could do if I understood the STR), or perhaps Ionut you could create a new profile.
OMTP could have improved the situation here a bit, I think.

As mats said, there is lots of painting and JS. Animation frame callbacks definitely take lots of time, and that probably affects why we so often get NormalPriorityNotify ticks.
The JS part there is about Polymer's webcomponents polyfill being slow
(and it is even the Custom Elements v0 version, which is Google only thingie).
Component: General → Graphics
Flags: needinfo?(ibudeanu)
Is this
And see comments 12 and 16 there.
hmm, maybe not the same, but possibly related.
Priority: -- → P2
Whiteboard: [qf:p2][QRC][QRC_Analyzed] → [qf:p2][QRC][QRC_Analyzed][gfx-noted]
Comment on attachment 8974852 [details]
Bug 1371668 - Don't attempt a catch up paint from FinishedWaitingForTransaction, wait until the next vsync.
Attachment #8974852 - Flags: review?(bas) → review+
Pushed by
Don't attempt a catch up paint from FinishedWaitingForTransaction, wait until the next vsync. r=bas
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Assignee: nobody → matt.woodrow
Flags: needinfo?(milaninbugzilla)
Flags: needinfo?(ibudeanu)
Depends on: 1462296
Depends on: 1462503
Perf win!

== Change summary for alert #13252 (as of Tue, 15 May 2018 18:35:12 GMT) ==


  5%  remote-nytimes android-4-2-armv7-api16 opt      3,053.99 -> 2,908.82

For up to date results, see:
You need to log in before you can comment on or make changes to this bug.