Open Bug 1462503 Opened Last year Updated 2 months ago

Performance regression in WebGL

Categories

(Core :: Graphics, defect, P2)

x86_64
Linux
defect

Tracking

()

UNCONFIRMED

People

(Reporter: tgnff242, Unassigned)

References

Details

(Keywords: perf, regression)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0
Build ID: 20180517100441

Steps to reproduce:

1. Create a new profile on a Linux system.
2. Open http://webglsamples.org/aquarium/aquarium.html and let it fully load.
3. Observe the number of FPS.


Actual results:

Without changing anything in the page (ie. no settings changes, no number of fish change), on my hardware, I got about 40 FPS.


Expected results:

I was getting a bit more than 50 FPS on previous builds.

Mozregression points to Bug 1371668.

Log:
15:03.86 INFO: Narrowed inbound regression window from [aa001829, e6013642] (3 builds) to [aa001829, fd84333f] (2 builds) (~1 steps left)
15:03.86 INFO: No more inbound revisions, bisection finished.
15:03.86 INFO: Last good revision: aa0018292597e9120a5ece1135487fbaaa9c47dc
15:03.86 INFO: First bad revision: fd84333ffe760d3fdb151e2d79316c69743f3129
15:03.86 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=aa0018292597e9120a5ece1135487fbaaa9c47dc&tochange=fd84333ffe760d3fdb151e2d79316c69743f3129

In case it's needed:
A performance profile on the last good revision: https://perfht.ml/2rQs7FP
A performance profile on the first bad revision: https://perfht.ml/2k83VLQ
Blocks: 1371668
Has Regression Range: --- → yes
Has STR: --- → yes
Component: Untriaged → Graphics
Flags: needinfo?(matt.woodrow)
Keywords: perf, regression
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
Motionmark also seemed to be regressed by Bug 1371668. See Bug 1416082 Comment 5.

174:18.38 INFO: No more inbound revisions, bisection finished.
174:18.38 INFO: Last good revision: 64ef3fab796d76dfbea7a16aca466be1e5ebda94
174:18.38 INFO: First bad revision: fd84333ffe760d3fdb151e2d79316c69743f3129
174:18.38 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=64ef3fab796d76dfbea7a16aca466be1e5ebda94&tochange=fd84333ffe760d3fdb151e2d79316c69743f3129
Hmm, so this profile is interesting!

It looks like on the content side we're failing to hit our 16.7ms window (both drawing the WebGL content, and then synchronously doing readback to get it into shmem). There may be some things we can do to be a bit faster here (looks like IsCurrent is slow?), but that's somewhat unrelated to the main issue.

We're painting off the vsync event posted to the event loop, but even though these are being scheduled every 16.7ms, we only pop them from the event queue when we've finished the previous paint, so we start falling behind our intended scheduling.

On the compositor side we're staying under the 16.7ms window, so we are correctly scheduling composites to begin at the start of each interval, and we get out of sync. Eventually we get to the point where a new content paint comes in just after a vsync tick, and we end up sitting idle for the whole window and then composite on the next one.

Now content catches up, and decides to skip painting on a vsync event (since we're waiting on the compositor, which is waiting on vsync) and we have some idle time, which results in a lower frame rate. Previously we kicked off a new content paint as soon as the compositor was done, rather than waiting for the next interval.

Compare this with the previous behaviour causing similar, but different issues from a different usage pattern: https://bugzilla.mozilla.org/show_bug.cgi?id=1371668#c5

Jeff, do you have any opinions on what you think the right behaviour is here?

Both compositor and content are attempting to do work at the start of each interval, but if one runs over time consistently, then the other being locked in causes weird synchronization issues.

Note that on some platforms compositing will actually block internally for vblanks (OSX and Windows D3D I believe), so compositing whenever we want might cause new issues there. The comment in the other bug also details how we can block mid-paint waiting for composites.

The answer might be to suppress vsync events if we process them sufficiently late. That'd regress this demo even more by dropping us to a fixed 30fps, but at least we'd be holding a consistent framerate. Figuring out the right heuristics to make sure we don't accidentally suppress *all* paints with some timings might be complicated.
Flags: needinfo?(matt.woodrow) → needinfo?(jgilbert)
https://bugzilla.mozilla.org/show_bug.cgi?id=1461336#c20 has another case where the new behaviour is beneficial.
There is (and should be) a different behavior when JS calls requests a RAF callback at the beginning of a frame or the end of a frame:
- End of a frame means long frames decay framerate as if vsync is on: 60/30/20/15.
- Beginning of a frame means long frames probably receive a RAF callback immediately (it is enqueued while they're still running the previous RAF callback), as if vsync is off.


I'm giving this some more thought. I'll try to crystallize and get back to you tomorrow.
(In reply to Jeff Gilbert [:jgilbert] from comment #4)
> There is (and should be) a different behavior when JS calls requests a RAF
> callback at the beginning of a frame or the end of a frame:
> - End of a frame means long frames decay framerate as if vsync is on:
> 60/30/20/15.
> - Beginning of a frame means long frames probably receive a RAF callback
> immediately (it is enqueued while they're still running the previous RAF
> callback), as if vsync is off.

That's interesting, hadn't considered that!

Unless there's code I'm unaware of (quite likely) this isn't how it works right now at least.

If JS requests rAF at the end of the frame, that request will be registered synchronously. It's possible that we've missed the vsync interval at this point and the paint request is already in the event queue. Once we return from JS, and handle the paint request (late), we'll just see that rAF was requested, and have no way of knowing that the request was added after the paint request was scheduled. We could potentially fix that though.

Also of note is that the JS returns within 16ms for this test, but once we do the gecko side (flush and readback into Layers shmem), then we've missed the interval. Comparing relatively ordering of the rAF request and vsync wouldn't result in a different behaviour.
Feels like P1/P2 to figure this out.
Priority: -- → P2
See Also: → motionmark-meta
Flags: needinfo?(jgilbert)
You need to log in before you can comment on or make changes to this bug.