Closed Bug 1239615 Opened 5 years ago Closed 5 years ago

Scrolling janks on horizontally-scrollable CodeMirror on OSX with APZ enabled

Categories

(Core :: Panning and Zooming, defect)

Unspecified
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
firefox45 --- unaffected
firefox46 --- fixed
firefox47 --- fixed

People

(Reporter: jdescottes, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: [gfx-noted])

Attachments

(4 files)

STRs : 

- pre : OSX only
- pre : FF Nightly / 46 only
- pre : async pan zoom is enabled
- open the html file in attachment with Firefox Nightly
- scroll quickly in editor displayed

Actual :
The scrolling will regularly freeze and the editor will become blank.

Expected :
The scrolling should not freeze. In Firefox 44 and 45, the scrolling is relatively smooth for the same test-case. In Firefox 46 without async pan zoom, the scrolling is smooth.

CodeMirror configuration:
This issue only occurs when displaying the line number gutter and when a horizontal scrollbar is present.
Summary: Scrolling freezes on CodeMirror on OSX → Scrolling freezes on CodeMirror on OSX with APZ enabled
I can't reproduce any "freezing" but I do see a lot of checkerboarding (editor going blank) on the test case. The reason for the checkerboarding that CodeMirror does its own virtual list implementation, where it only puts things into the DOM based on the scroll position. It relies on the scroll events to do this, and so with APZ that will be delayed relative to the visual scroll position. Implementing bug 1232491 will help with this a lot because it will provide an API to the content that will tell it which area it should fill in.
Depends on: displayport-api
This GIF highlights the "freeze" issue I am referring to. The scroll thumb clearly hangs for ~ half a second before jumping to another location.

Can only reproduce this when scrolling with a macbook touchpad. It does not occur when using a mousewheel, or when dragging the scroll thumb.

I am mostly scrolling UP and DOWN, but with a touchpad it's hard not to scroll a little LEFT/RIGHT. Freezes seem to only occur when my scrolling also modifies the horizontal scroll (and I noted before that the issue only occurred when we had a horizontal scrollbar).
Hm, I'm still unable to reproduce this. Can you grab a profile while this is happening, using the Gecko profiler [1]?

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
I'm seeing similar behaviour both with and without APZ. Filed bug 1243554 for it.
I just recorded this profile : https://cleopatra.io/#report=37acd02d088486055f09f3c9d421536f2566f77a

Let me know if I should turn on/off some options while recording.
Thanks, that's helpful - it looks like the layer transactions are taking a really long time and causing the compositor thread to block, which would explain the jankiness you're seeing. BenWa, any thoughts on how to improve this?
Flags: needinfo?(bgirard)
I can reproduce the problem. I got a profile with layers.dump enabled and was able to look up the layer tree during a jank layers update.

A quick glance looks like we might be creating ~80ish layers sized at about 1080p. It's a surprise we even run that good under this setup :).

Looks like a FrameLayerBuilder bug to me.
Flags: needinfo?(bgirard)
Looks to me like most of the layers are 28x14, probably from the gutter elements. Now that you mention it though I seem to recall Markus saying something at some point about this as well, although I can't remember when or where. We might need to make some FrameLayerBuilder modifications to not split those gutter elements into so many layers even though they are separate elements. Matt/Markus, any thoughts?
Interestingly, I don't see the complex layer tree if I only vertically scroll. It's only once I scroll horizontally that it blows up.

Will look into why we do this.
(In reply to Matt Woodrow (:mattwoodrow) from comment #9)
> Interestingly, I don't see the complex layer tree if I only vertically
> scroll. It's only once I scroll horizontally that it blows up.
> 
> Will look into why we do this.

I noticed that as well.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> Looks to me like most of the layers are 28x14

Ahh you're right. I must of looked at the wrong field. If we're ok with the layer tree we might want to look at the number and sizes of the upload calls. see if it's the average or some in particular that are bad. We do seem to be hitting an efficient texture upload leaf so that part looks fine to me.
Ok, so each of the gutter elements is separately wrapped in an element with class 'CodeMirror-gutter-wrapper'. 

When scrolling horizontally, the page is updating the 'left' property on this class, so that all of the gutter elements get moved.

Layout code is deciding that each of these gutter wrapper elements is now establishing an AGR, and we get a separate layer for each text element.

This really isn't easy to fix in our current architecture. 

We'd need to know that these changes are all synchronized and coming from the same place, and I don't think layout has any easy way to determine that.

Our code also always expects AGRs to be an ancestor frame, but there's no common frame for all the moved frames that wouldn't also include the non-moved content.

The page could be rewritten to have two columns, with all the gutter elements in the first and the text content in the second, rather than having it interleaved. That's not a great solution though.
Is there a good way for web authors to write this? They clearly want the numbers to act as position:fixed within the scroll container, but only for the y-axis.

The current solution is pretty painful for us, and it doesn't work with APZ.
Depends on: 1243589
position:sticky can do this - it's what we use for the fixed columns in the treeview on cleopatra. If your positioned ancestor is outside the scrollable element, then setting position:sticky;top:auto;left:0px; will make the x position fixed and the y position scrolled, and it'll work great with APZ. However, doing this is hard on CodeMirror because CodeMirror uses position:relative all over the place on elements inside the scrollable frame.

And even with position:sticky we'll create a separate layer for each line number.

The big freeze here comes from two sources: (1) Texture upload of a big number of tile buffers blocks the compositor. On OS X retina we're uploading <number of line numbers within the displayport> * 1024 * 1024 * 2 (because of component alpha) pixels. I've filed bug 1243589 to stop using tiles for small layers.
And (2), sometimes scrolling janks because CodeMirror uses a DOMMouseScroll listener to adjust their internal display port "in time" - this only does harm and I think CodeMirror can just remove that listener. This only happens if a new scroll is started while the main thread is busy painting, we don't interrupt existing scrolls.
Interestingly, what's taking so long on the main thread is the clearing of the tiles: Each layer gets a full component alpha tile, so per layer we fill one tile with black and one with white, even in the parts of the tile that we don't use.
Why are we uploading the entire tile, even if we only validated a small section of that?

That wasn't the case when I initially enabled tiling for OSX, as we need to restrict uploads to the valid region to perform well enough.

We should definitely skip the surface clearing too.
This commit caps the tile size to 512.

Scrolling was fine on nightly builds of early November, so I spent some time with mozregression yesterday. Seems the performance issues with the codemirror testcase started after Bug 1221073.

The tile size of 1024 seems too big for using codemirror on macbooks (recent pro retina 15" here). If the tile size is capped to 512, scrolling is smooth again (both with APZ and without APZ).
Attachment #8717789 - Flags: feedback?(jnicol)
Comment on attachment 8717789 [details] [diff] [review]
bug1239615.wip.patch

I'm just getting up to speed on this bug, so forgive me if my understanding isn't great. If we were to reduce the tile size this would be the correct patch. But it seems to me like we'd be better fixing what's causing us to use so many multi tiled layers for this page instead. Even using 512x512 tiles here isn't ideal, and would certainly still be an issue on mobile even if a macbook can just about handle it.
Attachment #8717789 - Flags: feedback?(jnicol) → feedback-
(In reply to Jamie Nicol [:jnicol] from comment #16)
> Comment on attachment 8717789 [details] [diff] [review]
> bug1239615.wip.patch
> 
> I'm just getting up to speed on this bug, so forgive me if my understanding
> isn't great. If we were to reduce the tile size this would be the correct
> patch. But it seems to me like we'd be better fixing what's causing us to
> use so many multi tiled layers for this page instead. Even using 512x512
> tiles here isn't ideal, and would certainly still be an issue on mobile even
> if a macbook can just about handle it.

Thanks for having a look! I think several issues lead to the current situation.

One of them is that using 1024x1024 tiles causes a performance regression on its own on most macbooks with the testcase presented here (with or without APZ). This is visible by testing builds for https://hg.mozilla.org/mozilla-central/rev/9d2001daa6a2 vs. its parent changeset. That's what motivated my patch here.

Then another change (around the 23rd of Dec) made the situation much worse with APZ enabled. Still need to investigate here.

I'm not saying we should fix the whole bug by capping the tile size to 512, but since 1024 causes issues on its own on some computers, maybe it should be reviewed?
Flags: needinfo?(jnicol)
Cancelling my ni?.
I suppose we'd prefer also fixing what causes the performance issue solely linked to the 1024 tile size.
Flags: needinfo?(jnicol)
Thanks for the investigation Julian! Did you try the patch from bug 1243589 to see if that helps? That should also limit the size of the layers being used here, by not using tiles for the smaller layers. In fact it should help more than just reducing the tile to 512x512.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #19)
> Thanks for the investigation Julian! Did you try the patch from bug 1243589
> to see if that helps? That should also limit the size of the layers being
> used here, by not using tiles for the smaller layers. In fact it should help
> more than just reducing the tile to 512x512.

Just tried right now, it fixes the issue! Scrolling is super smooth with this patch.
Since Markus is on PTO for a few more days and the patch was incomplete, I stole that bug and updated the patch - if you get a chance please try the updated patch and let me know if you get the same results. I'm not really able to repro the problem on my non-retina laptop so it's hard for me to test.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #21)
> Since Markus is on PTO for a few more days and the patch was incomplete, I
> stole that bug and updated the patch - if you get a chance please try the
> updated patch and let me know if you get the same results. I'm not really
> able to repro the problem on my non-retina laptop so it's hard for me to
> test.

Just tested your new patch, and same results : smooth scrolling both with and without APZ on a mbp retina. Thanks a lot for working on this!
Great, thanks for testing!
Summary: Scrolling freezes on CodeMirror on OSX with APZ enabled → Scrolling janks on horizontally-scrollable CodeMirror on OSX with APZ enabled
Marking fixed by bug 1243589.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.