Closed Bug 1848824 Opened 1 years ago Closed 2 months ago

reduced FPS panning on Leaflet.js based maps

Categories

(Core :: Panning and Zooming, defect, P2)

Firefox 118
Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
135 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox-esr128 --- wontfix
firefox133 --- wontfix
firefox134 --- wontfix
firefox135 --- fixed

People

(Reporter: wbob, Assigned: hiro)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files, 3 obsolete files)

description

when using leaflet layers especially with map markers I see older fenix (or fennec even) releases rendering significantly faster.

With leaflet being widespread investigating can be worthwhile. Admittedly it takes an 2017 era phone as the former standard test device Moto G5 to have the perf regression on markers-shown intolerable.

I did manual bisect testing on when the performance regressed in nightly builds and came between two linked nightly 106b1 revisions.

( the prepared map is from osm.fr umap service with the geojson layer ne_50m_geography_regions_points.geojson from geojson.xyz )

steps to reproduce

  1. open http://umap.openstreetmap.fr/de/map/leaflet-geojson-1_950230 (geojson layer containing ~50 map markers is shown by default but can be hidden)
  2. make clockwise 360° touch pan 2x
    optional 3. either disable the map marker layer or open http://umap.openstreetmap.fr/de/map/leaflet-geojson-2_950231 for plain map panning

then compare the general frames per second impression for one version prior to regressed version and current nightly

actual results

  1. notice visual stalls in panning movement

expected results

  1. same fps performance to prior release

regression version search

tracing profiles

For the regression-step nightly versions and the current nightly a tracing profile is linked

https://share.firefox.dev/47ztLi0 (fenix 106 baseline) rev 2f074155062c08f8568b0ccfbdc58c57aefc99e2
https://share.firefox.dev/3KHwfkR (fenix 106 less fps) rev c7037dbd2de37f147a445b6dae361671ef71896a
https://share.firefox.dev/3DVJsm9 (fenix 118 less fps)

regression code commit search

The changelogs for the regression I saw between these to revisions..

hg log -r 2f074155062c08f8568b0ccfbdc58c57aefc99e2::c7037dbd2de37f147a445b6dae361671ef71896a --template changelog

..the most likely commits by description to be at odds are related to Android Overscroll effects, this is me making a guess

https://bugzilla.mozilla.org/show_bug.cgi?id=1786452

I wonder if [:m_kato] or [:botond] could have a look at this

PS

how would I create an automated and also visual test on pan and zoom performance on maps - looking at docs, raptor?

Attached video f106b1-marker-good.webm (obsolete) —
Attached video f106b2-marker-bad.webm (obsolete) —
Attached video f118-marker-bad.webm (obsolete) —
Attachment #9349036 - Attachment is obsolete: true
Attachment #9349035 - Attachment is obsolete: true
Attachment #9349034 - Attachment is obsolete: true

LTR: f106 good (2f074155) - f106 bad (c7037dbd) - f118 bad

Thanks for the report!

Here's a link to a web view of the identified regression window for convenience: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2f074155062c08f8568b0ccfbdc58c57aefc99e2&tochange=c7037dbd2de37f147a445b6dae361671ef71896a

Could I ask you to check if the issue reproduces in the GeckoView Example app?

An easy way to install the GeckoView Example app is to install mozregression, and run mozregression --app gve --launch 2023-08-15 (or another date of your choosing, to get a nightly from that date) on the command line.

If the issue does reproduce in GeckoView Example, you could then run mozregression --app gve --good 2f074155062c08f8568b0ccfbdc58c57aefc99e2 --bad c7037dbd2de37f147a445b6dae361671ef71896a to narrow down the regression window to a specific push, which would be super helpful to guide further investigation.

Flags: needinfo?(wbob)

(In reply to wbob from comment #0)

how would I create an automated and also visual test on pan and zoom performance on maps - looking at docs, raptor?

Could you clarify what you mean by a visual test here? Do you mean that you'd like the test to perform a comparison of visual information like screenshots?

Thanks for picking this up. I did the 360° pan on gve from 2023-08-15 and it behaves in framerate as the fenix 118 nightly.

As f106 is already a year in the past, mozregression --good --bad came back with:

INFO: There are no build artifacts for these changesets (they are probably too old).

In absence of prebuilts for automated regression search, I'd revert the 5 changes related to bug 1786452 and make a gve build from that to validate the assumption.

(In reply to wbob from comment #7)

As f106 is already a year in the past, mozregression --good --bad came back with:

INFO: There are no build artifacts for these changesets (they are probably too old).

That's a bit unfortunate; the regression window in comment 5 dates to 2022-09-12 which is less than a year old. My understanding is that we should have builds going back at least a full year; I filed bug 1849082 about this.

(In reply to Botond Ballo [:botond] from comment #6)

(In reply to wbob from comment #0)

how would I create an automated and also visual test on pan and zoom performance on maps - looking at docs, raptor?

Could you clarify what you mean by a visual test here? Do you mean that you'd like the test to perform a comparison of visual information like screenshots?

sorry I was unclear: no computational detection comparison would need to take place, just as artifact of a scripted test, a video is made available. I see Alexandru Ionescu wrote on video recordings https://blog.mozilla.org/performance/author/aionescumozilla-com/ - I think what I want is inside (raptor) browsertime as it wraps Seleniums Webdriver, both for the recording and to script a touch pan - I'll turn to the perf channels on mozilla chat for more questions.

Flags: needinfo?(wbob)

In the profile (fenix 106 less fps) in comment 0, Android UI thread is very yellow. I suspected that if bug 1786452 introduced this bug, it introduced redundant calls which are processed in the Android UI thread. And I think this changeset which implemented ClearOverscroll in WidgetOverscrollEffect introduced the redundant calls. We have a couple of places where we call ClearOverscroll unconditionally.

Just adding a mIsOverscrolled check whether to call UpdateOverscrollVelocity in WidgetOverscrollEffect::RelieveOverscroll feels the map smoother for me FWIW.

The severity field is not set for this bug.
:botond, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(botond)

(In reply to wbob from comment #7)

In absence of prebuilts for automated regression search, I'd revert the 5 changes related to bug 1786452 and make a gve build from that to validate the assumption.

I tested a build with those changes reverted and, while I think it's faster, I have low confidence in my ability to discern the difference visually.

I'll try taking profiles and checking the elapsed time between frame markers in the profile, to have actual numbers to compare.

I'll also push the build with the changes reverted to Try, so you can try it as well.

(In reply to Botond Ballo [:botond] from comment #12)

I'll also push the build with the changes reverted to Try, so you can try it as well.

Ok, the Try push is ready. Could you please:

  1. Go to https://treeherder.mozilla.org/jobs?repo=try&revision=060d1af090a1ac4775d34ef624c9691cea4d49b9
  2. Select the "B" task in the row corresponding to your phone's architecture (ARMv7 or AArch64)
  3. In the bottom panel that comes up, in the "Artifacts and Debugging Tools" tab, download geckoview_example.apk and install it

and let me know if the issue seems to be resolved in that build?

Flags: needinfo?(botond) → needinfo?(wbob)

(In reply to Botond Ballo [:botond] from comment #12)

I'll try taking profiles and checking the elapsed time between frame markers in the profile, to have actual numbers to compare.

Here are the profiles I took:

I counted 30 (content) frames in a 4-second period in the first one, and 44 frames in a 4-second period in the second one, which does seem like a notable improvement.

(In reply to Botond Ballo [:botond] from comment #13)

(In reply to Botond Ballo [:botond] from comment #12)

and let me know if the issue seems to be resolved in that build?

thanks for supplying the Try-build. Good idea to count frames - it's an noticeable visual improvement too.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #10)

Just adding a mIsOverscrolled check whether to call UpdateOverscrollVelocity in WidgetOverscrollEffect::RelieveOverscroll feels the map smoother for me FWIW.

thank you @hiro for that input - adding this as a third comparison would be interesting.

Flags: needinfo?(wbob)

Thanks for checking -- I'll mark this as a regression from bug 1786452.

Severity: -- → S3
Keywords: regression
Priority: -- → P2
Regressed by: 1786452

(In reply to Hiroyuki Ikezoe (:hiro) from comment #10)

I think this changeset which implemented ClearOverscroll in WidgetOverscrollEffect introduced the redundant calls. We have a couple of places where we call ClearOverscroll unconditionally.

I do wonder which codepath is calling ClearOverscroll in the middle of a touch drag -- I think that shouldn't happen.

(In reply to Botond Ballo [:botond] from comment #18)

I do wonder which codepath is calling ClearOverscroll in the middle of a touch drag -- I think that shouldn't happen.

I do see ClearOverscroll being called on every frame of a touch drag. I haven't yet determined what calls it, but it at least means that bug 1786452 causing a regression by making ClearOverscroll slower is plausible.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #10)

Just adding a mIsOverscrolled check whether to call UpdateOverscrollVelocity in WidgetOverscrollEffect::RelieveOverscroll feels the map smoother for me FWIW.

Here's a build which implements this fix: https://treeherder.mozilla.org/jobs?repo=try&revision=9f8eb46a74550737bd30241134dbb91331cee746

wbob, would you mind trying it please?

Flags: needinfo?(wbob)

(In reply to Botond Ballo [:botond] from comment #20)

Here's a build which implements this fix: https://treeherder.mozilla.org/jobs?repo=try&revision=9f8eb46a74550737bd30241134dbb91331cee746

(In my own testing, the result is fairly indeterminate -- I took this profile and counted 37 frames in 4-second period.)

(In reply to Botond Ballo [:botond] from comment #21)

(In reply to Botond Ballo [:botond] from comment #20)

Here's a build which implements this fix: https://treeherder.mozilla.org/jobs?repo=try&revision=9f8eb46a74550737bd30241134dbb91331cee746

(In my own testing, the result is fairly indeterminate -- I took this profile and counted 37 frames in 4-second period.)

it's still an improvement, visually too in comparison videos - just not to the extent as a complete revert.

Flags: needinfo?(wbob)

:m_kato, since you are the author of the regressor, bug 1786452, could you take a look?

For more information, please visit BugBot documentation.

Flags: needinfo?(m_kato)

bug 1786452 was fixed by botond.

Flags: needinfo?(m_kato)
Assignee: nobody → botond

(In reply to Botond Ballo [:botond] from comment #18)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #10)

I think this changeset which implemented ClearOverscroll in WidgetOverscrollEffect introduced the redundant calls. We have a couple of places where we call ClearOverscroll unconditionally.

I do wonder which codepath is calling ClearOverscroll in the middle of a touch drag -- I think that shouldn't happen.

The call chain is:

#01: mozilla::layers::WidgetOverscrollEffect::RelieveOverscroll (gfx/layers/apz/src/Overscroll.h:232)
#02: mozilla::layers::WidgetOverscrollEffect::ClearOverscroll (gfx/layers/apz/src/Overscroll.h:243)
#03: mozilla::layers::OverscrollHandoffChain::ForEachApzc (gfx/layers/apz/src/OverscrollHandoffState.cpp:57)
#04: mozilla::layers::AsyncPanZoomController::ResetTouchInputState (gfx/layers/apz/src/AsyncPanZoomController.cpp:0)
#05: mozilla::layers::InputQueue::ProcessQueue (gfx/layers/apz/src/InputQueue.cpp:0)
#06: mozilla::layers::InputQueue::ReceiveTouchInput (gfx/layers/apz/src/InputQueue.cpp:243)
#07: mozilla::layers::InputQueue::ReceiveInputEvent (gfx/layers/apz/src/InputQueue.cpp:0)
#08: mozilla::layers::APZCTreeManager::ProcessTouchInput (gfx/layers/apz/src/APZCTreeManager.cpp:2060)
#09: mozilla::layers::APZCTreeManager::ReceiveInputEvent (gfx/layers/apz/src/APZCTreeManager.cpp:0)

So what's happening here is that the page is calling preventDefault() on the touch events, and so the touch input block gets marked as ShouldDropEvents() (meaning, the touch events should be dropped without being processed by APZ to perform regular scrolling), and in that state we call ResetTouchInputState() on this line for every event.

This is unnecessarily wasteful -- it should be sufficient to call ResetTouchInputState() on the first event after the input block is prevented.

The gtests added in the previous change are renamed to "NoResetXXX"
since with this change UpdateOverscrollVelocity gets never called.

Instead of the previous gtests this change adds new gtests to test
there's only one call of UpdateOverscrollVelocity triggered by either
ResetTouchInputState or ResetPanGestureInputState while overscrolling.

Pushed by hikezoe.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9cf87baee57d Prevent calling ResetTouchInputState()/ResetPanGestureInputState() multiple times for a single input block. r=botond https://hg.mozilla.org/integration/autoland/rev/8219d42f2c18 Bail out from WidgetOverscrollEffect::RelieveOverscroll if it's not overscrolling. r=botond https://hg.mozilla.org/integration/autoland/rev/1a0999b729cf apply code formatting via Lando
Status: UNCONFIRMED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 135 Branch

I sure see the improvement in frames per second in the fixed nightly

thanks for getting back to this and the good explanations

(In reply to wbob from comment #31)

I sure see the improvement in frames per second in the fixed nightly

Thanks for checking, and apologies for this having fallen off my radar for so long after the initial diagnosis.

Assignee: botond → hikezoe.birchill
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: