Intermittent text-overflow/xulscroll.html == text-overflow/xulscroll-ref.html | image comparison, max difference: 255, number of differing pixels: 5051
Categories
(Core :: Layout: Text and Fonts, defect, P5)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | unaffected |
firefox80 | --- | unaffected |
firefox81 | --- | unaffected |
firefox82 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: kats)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Attachments
(2 files)
Filed by: rmaries [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=315533640&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/DezBIXQjSUKodyfOai7dqQ/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/DezBIXQjSUKodyfOai7dqQ/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1
Comment hidden (Intermittent Failures Robot) |
Comment 2•4 years ago
|
||
Appeared on this Try push before hitting autoland.
Comment 3•4 years ago
|
||
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #2)
Appeared on this Try push before hitting autoland.
That try push does not have bug 1655130. The only patches it has that are landed came from kats' stuff pushed a day or two ago.
Comment 4•4 years ago
|
||
And it first failed on this push
Which is bug 1662013.
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 5•4 years ago
|
||
Thanks. On PTO today but I'll look tomorrow.
Assignee | ||
Comment 6•4 years ago
|
||
I started looking into this. Can reproduce it easily enough but am having a hard time making sense of what's going on. The scrollframe in question that isn't scrolled properly is an RTL one, so I would expect the x-coord of the scrollable rect to be negative. This is true when I load the page standalone but for some reason doesn't seem to happen in the reftest. But aside from that there seems to be some issue with the visual scroll offset clobbering the scroll update, or something. So it does seem like a regression from my patches, but it's not immediately clear to me what the problem might be.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 8•4 years ago
|
||
I think the RTL vs non-RTL numbers are some sort of bug, but orthogonal to the problem here. Here what we have happening is that initially the page loads, and APZ gets the initial "scrollframe constructed" ScrollPositionUpdate, with the scroll position at (0,0). And along with that is a visual scroll offset update which scrolls the scrollframe over to the rightmost extent, which is where it's supposed to be. And then the page triggers a scrollframe reconstruction, and APZ gets another ScrollPositionUpdate with the scroll position at (0,0) which scrolls the scrollframe over to the leftmost extent.
The reason it's intermittent is that sometimes the scrollframe reconstruction happens before the first paint. In that scenario we just get the one paint with two scroll position updates to (0,0) followed by the visual scroll offset update, so things up in the right spot. I can reliably reproduce it by delaying the scrollframe reconstruction to run in a rAF callback.
I also tried implementing the changes suggested in Botond's review comment at https://phabricator.services.mozilla.com/D88744#inline-510374 but that didn't help here. I'm going to keep digging to figure out where the visual scroll offset update is coming from and why it doesn't re-appear on the reconstructed scrollframe.
Assignee | ||
Comment 9•4 years ago
|
||
The visual scroll offset update I referred to in my previous comment is not actually an update; the updatetype is eNone. It's just that the APZ is in the isDefault
state and so it applies the visual scroll offset in the metrics.
The fundamental problem here though is that when a scrollframe reconstruction happens, it can get reconstructed to a nonzero scroll position without going going through RestoreState
or running any meaningful code in ScrollToRestoredPosition
. Instead it seems like the scrolled frame's position is already at the nonzero scroll position. In this situation right now the only ScrollPositionUpdate we send to APZ is the one from the ScrollFrameHelper's constructor, which bakes in a zero scroll position. If we instead populate that with the correct scroll position that should solve this problem. Unfortunately I don't know if that's doable right in the constructor as the mScrolledFrame
is only populated later. I'll see if I can figure out a good way to accomplish this.
Assignee | ||
Comment 10•4 years ago
|
||
Least bad way I could think to do this was to replace the origin-None
ScrollPositionUpdate from the constructor with a better one later that has the actual scroll position. Try push to see if this breaks anything: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=39129467b6125ca5a1bd29766aca409f09a8db10
Comment 11•4 years ago
|
||
I'd like to understand this scroll that comes from nowhere, if not for this bug, then for my understanding for the future. When does the scroll position get to a non-zero value and how? Immediately when all of the frames involved have been constructed but not reflowed yet? Or (as the try push suggests) after the first reflow? Can you override nsIFrame::Init in nsHTML/XULScrollFrame and do it there?
Assignee | ||
Comment 12•4 years ago
|
||
I can look in more detail tomorrow but I verified that the scroll position is still zero after the call to ReloadChildFrames
which populates mScrolledFrame
. At some point between that and the end of the reflow, the scroll position changes, I assume because something in the reflow code sets a position on the scrolled frame. I can use rr tomorrow to find out exactly where.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 14•4 years ago
|
||
So maybe this is some confusion between physical and logical coordinates, I'm not really sure. Here's the backtrace at which point the scrollframe's scroll position "changes", because the mScrolledFrame's rect is modified.
At the point of this call the childRect
has a 0,0 top-left because both mHelper.mScrollPort.TopLeft()
and aScrollPosition
are 0. However inside ClampAndSetBounds
we take the non-LTR codepath which sets the x-coord of the rect to a negative number, and that produces a positive scroll position.
This might also explain why it's only affecting a XUL test, since the relevant code seems specific to XULScrollFrame. I haven't checked if there's an equivalent codepath in HTMLScrollFrame.
Assignee | ||
Comment 15•4 years ago
|
||
I poked around some more but I don't understand the LTR/RTL semantics of things inside nsGfxScrollFrame (as in, which quantities are supposed to be in logical vs physical coordinates) well enough to tell if there's an underlying bug here. But given that this is happening, and my fix corrects the problem, I'll put it up for review. If we find an underlying RTL bug then we can always not land it or back it out.
Assignee | ||
Comment 16•4 years ago
|
||
In some cases (in this scenario, with a RTL XUL scrollframe), the scrollframe
can have a nonzero scroll position upon construction or reconstruction, without
having executed the scrolling code in ScrollFrameHelper::RestoreState or
ScrollFrameHelper::ScrollToRestoredPosition. In such cases APZ is not properly
notified of the scroll position, and it can get into a state where it is out of
sync with the main thread scroll position (until a subsequent scroll).
To correct this, we replace the ScrollPositionUpdate that is created during
scrollframe construction with a new ScrollPositionUpdate that holds the correct
scroll position, so that APZ is properly notified of the scroll position.
Comment 17•4 years ago
|
||
Thanks. It sort of seems like our assumption that scroll frames start at 0,0 is wrong. Would it make sense to use the frame bit NS_FRAME_FIRST_REFLOW to detect this?
Assignee | ||
Comment 18•4 years ago
|
||
I can certainly add an assertion that the bit is set. I think we still want to check mScrollUpdates
has the expected size and contains a ScrollOrigin::None
, to ensure that we're only doing the replacement in the exact condition we care about. This is mostly for more robust handling in release builds where the assertion failure won't crash - we still want to have the most sane behaviour that we can.
Assignee | ||
Comment 19•4 years ago
|
||
Hm, actually that bit doesn't even seem to be set in this case.
Assignee | ||
Comment 20•4 years ago
|
||
For posterity, we discussed this further in #apz. The bit isn't set because ReflowFinished() runs after the bit is cleared. I added a new flag in ScrollFrameHelper to catch the first ReflowFinished() and made the code conditional on that.
New try push: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=b46f6d8360933a40e9422484055fd64f15ee23d8
Comment 21•4 years ago
|
||
Comment 22•4 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Description
•