Closed
Bug 745315
Opened 12 years ago
Closed 3 years ago
up to 16ms delay before paint starts after setViewport
Categories
(Firefox for Android Graveyard :: General, defect)
Firefox for Android Graveyard
General
Tracking
(firefox14-, blocking-fennec1.0 -)
RESOLVED
INCOMPLETE
Firefox 14
People
(Reporter: gal, Assigned: jrmuizel)
References
Details
Attachments
(2 files, 1 obsolete file)
3.16 KB,
patch
|
kats
:
review+
|
Details | Diff | Splinter Review |
3.83 KB,
patch
|
Details | Diff | Splinter Review |
Native Fennec sends setViewport events from the Java frontend to Gecko every time the user pans the screen around. In Gecko these are batched and the paint is executed at the next refresh driver tick. This is bad, because the tick can be up to 16ms away. Even on slow devices this delay is often larger than the paint time itself (trace from cnn.com below, panning a small amount): I/RenderDump( 2797): paint time 9 I/RenderDump( 2797): 28442334 syncViewportInfo 1868 13229 720 1035 3.897975 1988.2378 13540.98 3.8979745 I/RenderDump( 2797): 28442337 adjustViewport v=RectF(1988.2378, 13541.187, 2468.2385, 14231.187) p=(3991.9995,14266.998) z=3.8979745 DisplayPortMetrics(1868.2375,13231.998,2588.2388,14266.998,3.8979745) The 9 here is ms, obviously. The right thing to do here is to request a lot viewer viewport changes, and then when we execute one it should draw immediately and pre-schedule the next refresh driver tick.
Reporter | ||
Comment 1•12 years ago
|
||
a lot => a lot less
Reporter | ||
Comment 2•12 years ago
|
||
JP, can you get someone to take a look and double check my math above, and if confirmed, we might want to get someone on this.
Comment 3•12 years ago
|
||
I agree with the problem described in this bug; I was seeing similar latencies caused by the refresh driver in bug 740883. Usually it wouldn't be the full 16ms but some smaller number like 4-5ms. Something I tried that helped was to just queue a DRAW event in nsAppShell right after we queued a Viewport:Change event; the DRAW coalescing behaviour would take care of any extra draws. Although this did provide some improvement I didn't think it was really enough to justify the hackiness of the solution. You can see the WIP I tried at https://bug740883.bugzilla.mozilla.org/attachment.cgi?id=611305
Blocks: checkerboarding
Assignee | ||
Comment 4•12 years ago
|
||
(In reply to Andreas Gal :gal from comment #0) > The right thing to do here is to request a lot viewer viewport changes, and > then when we execute one it should draw immediately and pre-schedule the > next refresh driver tick. Can you expand on this? I don't quite follow your suggested solution.
Reporter | ||
Comment 5•12 years ago
|
||
The refresh driver is driven by a timer right now. We should allow that timer to fire early when we need an immediate paint (trigger the refresh driver manually, set a flag so that the next timer tick doesn't actually paint). We likely also have to reduce the frequency of setViewport events as well. Right now we hammer the Gecko thread with setViewport events. When you move a page around with your finger, the Gecko thread is constantly painting (up to 60 times a second, if the page paints quickly enough), even if you just move a tad. This adds additional latency. When you need a paint to happen, Gecko is usually busy servicing the previous paint event. That has to finish first, then we wait for a refresh driver tick, and then we start the paint we desperately need right away to avoid checker-boarding.
Comment 6•12 years ago
|
||
(In reply to Andreas Gal :gal from comment #5) > The refresh driver is driven by a timer right now. We should allow that > timer to fire early when we need an immediate paint Agreed > (trigger the refresh > driver manually, set a flag so that the next timer tick doesn't actually > paint). > That's one option. Another would be to modify the refresh driver scheduling to the same as BenWa is doing in the compositor, which seems better to me. This algorithm is: if we need to paint and have not painted in the last 16ms, paint immediately; otherwise schedule a paint such that it will happen 16ms after the last paint. This algorithm keeps the 16ms throttling but also removes the latency on first draw. > We likely also have to reduce the frequency of setViewport events as well. > Right now we hammer the Gecko thread with setViewport events. When you move > a page around with your finger, the Gecko thread is constantly painting (up > to 60 times a second, if the page paints quickly enough), even if you just > move a tad. This adds additional latency. When you need a paint to happen, > Gecko is usually busy servicing the previous paint event. That has to finish > first, then we wait for a refresh driver tick, and then we start the paint > we desperately need right away to avoid checker-boarding. I don't necessarily agree this will help. We were doing this with the fixed-margin strategy earlier, and the problem I was seeing is that if you send fewer viewport changes, the area that gets drawn on each change is larger. Rather than drawing 1 row of pixels 50 times (for example) we will draw a block of 50 pixels once. This larger draw takes much more time (I think maybe even super-linear) and by the time it's done the user has already scrolled somewhere else. With the 50 small draws, however, the gecko thread is more responsive and the drawn content actually gets displayed to the user. However, I concede that this behaviour is dependent on a lot of factors including the content being drawn. You can twiddle with the danger zone prefs in the fixed margin strategy to play with how often we send viewport changes. I'm adding similar prefs to the velocity bias in bug 744390.
Assignee | ||
Comment 7•12 years ago
|
||
Here's a version of kats' patch that works better. First it doesn't add the DRAW to the event queue, it just does it right away. This helps us avoid leaving extra DRAWs around when we coalesce VIEWPORTs. Second it uses the bounds of the window instead of a 0x0 rect. I originally tried it with a 0x0 rect and found that it did not work. Using the full bounds fixed it. I've yet to measure the impact of this patch.
Assignee | ||
Comment 8•12 years ago
|
||
After thinking about this more, I'm pretty sure we want to do it. I've yet to try to measure any real difference but I think it's at least justifiable theoretically.
Attachment #615965 -
Attachment is obsolete: true
Attachment #616189 -
Flags: review?(bugmail.mozilla)
Comment 9•12 years ago
|
||
Comment on attachment 616189 [details] [diff] [review] Draw directly from viewport changes Review of attachment 616189 [details] [diff] [review]: ----------------------------------------------------------------- Patch looks fine to me, but I really would like to see a measured benefit from this before we land it. It's entirely possible that because of this we'll be doing extra draw calls that just burn CPU cycles. In particular the DRAW event that the refresh driver will queue in response to a viewport change will be turned into a no-op.
Assignee | ||
Comment 10•12 years ago
|
||
(In reply to Kartikaya Gupta (:kats) from comment #9) > Comment on attachment 616189 [details] [diff] [review] > Draw directly from viewport changes > > Review of attachment 616189 [details] [diff] [review]: > ----------------------------------------------------------------- > > Patch looks fine to me, but I really would like to see a measured benefit > from this before we land it. It's entirely possible that because of this > we'll be doing extra draw calls that just burn CPU cycles. In particular the > DRAW event that the refresh driver will queue in response to a viewport > change will be turned into a no-op. The refresh driver will only queue a DRAW if there's an invalid region to draw.
Updated•12 years ago
|
Attachment #616189 -
Flags: review?(bugmail.mozilla) → review+
Assignee | ||
Comment 11•12 years ago
|
||
I ran this on the red page with robocop and got the following results: Before: __start_report0.97285944__end_report __startTimestamp1334864757196__endTimestamp After: __start_report0.97513074__end_report __startTimestamp1334869192814__endTimestamp
Assignee | ||
Updated•12 years ago
|
Attachment #616189 -
Flags: approval-mozilla-central?
Updated•12 years ago
|
tracking-firefox14:
--- → +
Updated•12 years ago
|
Attachment #616189 -
Flags: approval-mozilla-central? → approval-mozilla-central+
Assignee | ||
Comment 12•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c048eb7cc728
Assignee: nobody → jmuizelaar
Target Milestone: --- → Firefox 14
Comment 13•12 years ago
|
||
This seems to have noticeably regressed both tcheckerboard and tcheckerboard2, unfortunately. :( http://graphs.mozilla.org/graph.html#tests=[[175,63,20]]&sel=none&displayrange=7&datatype=running http://graphs.mozilla.org/graph.html#tests=[[201,63,20]]&sel=none&displayrange=7&datatype=running
Assignee | ||
Comment 14•12 years ago
|
||
I've backed it out: https://hg.mozilla.org/integration/mozilla-inbound/rev/9c88cc584f04
Comment 15•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/c048eb7cc728
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 16•12 years ago
|
||
Backed out: https://hg.mozilla.org/mozilla-central/rev/9c88cc584f04 (Please include the bug number in your backout commit message in the future as well. Thanks!)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 17•12 years ago
|
||
Do we know why this regressed?
Comment 18•12 years ago
|
||
(In reply to Andreas Gal :gal from comment #17) > Do we know why this regressed? I'm sort of looking into it because applying this patch on top of your heuristic stuff is also giving odd results. What I see is that applying this patch seems to result in two draws. Specifically, the NS_PAINT event handler in nsViewManager kicks off an invalidate which results in another draw. Here is the stack from where this happens: #0 nsWindow::Invalidate (this=0x5e1c2880, aRect=...) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:545 #1 0x61687316 in nsViewManager::InvalidateWidgetArea (this=<optimized out>, aWidgetView=0x5ff86f60, aDamagedRegion=<optimized out>) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:557 #2 0x6168744c in nsViewManager::FlushDirtyRegionToWidget (this=<optimized out>, aView=<optimized out>) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:437 #3 0x616874da in nsViewManager::ProcessPendingUpdatesForView (this=0x601b99d0, aView=0x5ff86f60, aFlushDirtyRegion=true) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:419 #4 0x61687540 in nsViewManager::ProcessPendingUpdates (this=0x601b99d0) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:1357 #5 0x616865b2 in nsViewManager::DispatchEvent (this=0x601b99d0, aEvent=0x5b7600f8, aView=0x5ff86f60, aStatus=0x5b7601bc) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:786 #6 0x6168670e in nsViewManager::DispatchEvent (this=0x601b99d0, aEvent=0x5b760218, aView=0x5ff86f60, aStatus=0x5b7601bc) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:808 #7 0x61684068 in HandleEvent (aEvent=0x5b760218) at /Users/kats/zspace/mozilla-git/view/src/nsView.cpp:158 #8 0x61b7ee08 in nsWindow::DispatchEvent (this=0x5e1c2880, aEvent=0x5b760218) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:673 #9 0x61b800ca in nsWindow::DrawTo (this=0x5e1c2880, targetSurface=0x5ff56c40, invalidRect=<optimized out>) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:1064 #10 0x61b8032a in nsWindow::DrawTo (this=0x5e1c2280, targetSurface=0x5ff56c40, invalidRect=<optimized out>) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:1112 #11 0x61b80570 in nsWindow::OnDraw (this=0x5e1c2280, ae=0x60432870) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:1179
Comment 19•12 years ago
|
||
Even if I hack around the will-paint event shown in the last comment, I still get the refresh driver queueing invalidates, which shouldn't be happening AIUI. (this is on the red page, so there's no animations or anything). #0 nsWindow::Invalidate (this=0x5e1c2880, aRect=...) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:545 #1 0x603da316 in nsViewManager::InvalidateWidgetArea (this=<optimized out>, aWidgetView=0x5f886f60, aDamagedRegion=<optimized out>) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:557 #2 0x603da44c in nsViewManager::FlushDirtyRegionToWidget (this=<optimized out>, aView=<optimized out>) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:437 #3 0x603da4da in nsViewManager::ProcessPendingUpdatesForView (this=0x5fab99d0, aView=0x5f886f60, aFlushDirtyRegion=true) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:419 #4 0x603da540 in nsViewManager::ProcessPendingUpdates (this=0x5fab99d0) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:1357 #5 0x6008ac64 in nsRefreshDriver::Notify (this=0x5f7d16a0, aTimer=<optimized out>) at /Users/kats/zspace/mozilla-git/layout/base/nsRefreshDriver.cpp:447 #6 0x60a7abdc in nsTimerImpl::Fire (this=0x5d2ec4c0) at /Users/kats/zspace/mozilla-git/xpcom/threads/nsTimerImpl.cpp:511
Reporter | ||
Comment 20•12 years ago
|
||
That definitely explains the regression. Very encouraging. If we fix the cause we should be able to land this after all.
Comment 21•12 years ago
|
||
Yeah, but I'll need help from somebody who understands this code.
Comment 22•12 years ago
|
||
Boris might have some idea; failing that, perhaps Matt does, as he's looked at invalidates a lot.
Comment 23•12 years ago
|
||
Why are there pending invalidates in the refresh driver in this case?
Assignee | ||
Comment 24•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #23) > Why are there pending invalidates in the refresh driver in this case? I was going to look into that.
Comment 25•12 years ago
|
||
And a perhaps related question: if we're invalidating the whole viewport anyway (are we?) should we just clear other pending invalidates?
Assignee | ||
Comment 26•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #25) > And a perhaps related question: if we're invalidating the whole viewport > anyway (are we?) should we just clear other pending invalidates? Doesn't this already happen during paint?
Comment 27•12 years ago
|
||
I don't know. It might not if the invalidates are queued up in our view system and haven't been flushed to the widget layer yet....
Assignee | ||
Comment 28•12 years ago
|
||
I took a look at this. WILLPAINT clears our pending invalidates, but this causes us to queue a draw(). We need to clear before calling paint.
Assignee | ||
Comment 29•12 years ago
|
||
We could fix this by disabling nsWindow::Invalidate() during our Draw() but that's pretty ugly. Better suggestions welcome.
Assignee | ||
Comment 30•12 years ago
|
||
We'll see if this fixes it: https://tbpl.mozilla.org/?tree=Try&rev=dfef37e38b75
Assignee | ||
Comment 31•12 years ago
|
||
This change did not fix the regression.
Assignee | ||
Comment 32•12 years ago
|
||
While this doesn't fix the regression, it does fix the double draw problem. The regression may come from scheduling a draw to early, which better displayport prediction should avoid. We can retry this when we have the better prediction.
Reporter | ||
Comment 33•12 years ago
|
||
kats, want to try this patch with your patch?
Updated•12 years ago
|
blocking-fennec1.0: --- → ?
Updated•12 years ago
|
blocking-fennec1.0: ? → -
Comment 34•12 years ago
|
||
minusing because its not clear that this is a win or even a problem.
Reporter | ||
Comment 35•12 years ago
|
||
You don't think its a problem that we delay urgently needed paints to avoid checkerboarding by up to 16ms? You don't think it would be a win not to do that?
Comment on attachment 617580 [details] [diff] [review] Version with invalidation disabling Review of attachment 617580 [details] [diff] [review]: ----------------------------------------------------------------- The way this is supposed to work is that you send an NS_WILLPAINT event (and have your NS_PAINT event set didSendWillPaint). The NS_WILLPAINT handler may trigger invalidation. That invalidation should be covered by your NS_PAINT event. In the Android case, since you always paint everything, you should just ignore invalidations that happen during the NS_WILLPAINT. It is not correct to ignore invalidations that happen during NS_PAINT, although there won't normally be any.
Gonk needs a similar fix.
Though it's a bit more complicated because Gonk uses mDirtyRegion. So nsWindow::DoDraw needs to send NS_WILLPAINT before copying mDirtyRegion into event.region and clearing mDirtyRegion.
As for this 16ms delay, I think the right thing to do is for nsRefreshDriver::ScheduleViewManagerFlush to schedule a refresh driver run with *zero* delay (but still async) if there isn't a timer running and the delay since the last refresh driver tick is more than one refresh driver period. It might make sense to do that for some other callers of EnsureTimerStarted too, the non-animation ones --- AddStyleFlushObserver, AddLayoutFlushObserver. Does that sound right, Boris?
Another option is to add API to request "urgent" early refresh driver ticks in response to "critical" user input where we must minimize latency. The question is of course, what would be treated as "critical". If everything's critical then we won't be throttling paints anymore.
Reporter | ||
Comment 41•12 years ago
|
||
roc, can we simply skip the next tick when we schedule an early draw?
Sure, if a tick happens early then we won't need to schedule another one (unless an animation observer requests it). That still means we'd end up painting furiously if too many "critical" requests happen.
Reporter | ||
Comment 43•12 years ago
|
||
Shouldn't be the case. Even pretty quick paints take a few ms. We shouldn't fire a new one at least until the last one is done. It would be nice if we make paints interruptible.
On desktop, we definitely had problems in the past with painting at too high a rate and wasting cycles since you can't see more than 60fps anyway. Hence bug 598482. Really we should be thinking about this in terms of OMTC.
Assignee | ||
Comment 45•12 years ago
|
||
(In reply to Andreas Gal :gal from comment #35) > You don't think its a problem that we delay urgently needed paints to avoid > checkerboarding by up to 16ms? You don't think it would be a win not to do > that? I've been continuing to look at this, and I have yet been able to see clear wins. Tryserver has been flaky which has made measuring this a pain, but I hope to get better results soon.
Reporter | ||
Comment 46•12 years ago
|
||
On the red div page this delay takes longer than the actual paint time. Happy to help measure it.
Assignee | ||
Comment 47•12 years ago
|
||
My current patch https://hg.mozilla.org/try/rev/27f790c8d5e5 still causes tcheckerboard to regress from .77 to .73-.71 and I don't know why.
Comment 48•12 years ago
|
||
> Does that sound right, Boris?
We don't want to do an "immediate" tick for style/layout observers, I don't think. Though maybe it would be ok.
Comment 50•3 years ago
|
||
We have completed our launch of our new Firefox on Android. The development of the new versions use GitHub for issue tracking. If the bug report still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix) an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 3 years ago
Resolution: --- → INCOMPLETE
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•