Closed Bug 1400238 Opened 2 years ago Closed 2 years ago

[Ubuntu] Autoscroll indicator is placed below mouse pointer in new windows

Categories

(Core :: Panning and Zooming, defect, P3)

57 Branch
All
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla58
Tracking Status
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- verified

People

(Reporter: petruta.rasa, Assigned: botond)

Details

(Whiteboard: [gfx-noted])

Attachments

(3 files)

[Note]:
- This is unrelated to apz.autoscrolling.enabled pref being set to true
- It may happen intermittently 

[Affected versions]:
- Firefox 55.0.3
- Firefox 56.0b12
- latest Nightly 57.0a1

[Affected platforms]:
- Ubuntu 16.04 64-bit

[Steps to reproduce]:
1. Open Firefox and check Autoscrolling option in about:preferences
2. While having two tabs with scroll-able content opened, drag one of them to a new window and without moving mouse position press the mouse middle button 
3. Open a new tab, right click on it and select Move to New Window
4. Autoscroll the page

[Expected result]:
2.4. Autoscroll indicator is placed on mouse pointer

[Actual result]:
2.4. Autoscroll indicator is placed below mouse pointer and scrolling starts immediately if apz.autoscrolling.enabled is True. If the pref is not set, the icon is still placed incorrectly, but it won't scroll automatically.

[Regression range]:
- This is not a recent regression, it happens also on Firefox 51.0.1. It may regressed between 45 (last known good build) and 51, but many of the builds in between won't open with mozregression anymore, so it's difficult to find out for sure. If necessary, I will try to find the exact regression range.

[Additional notes]:
- I could reproduce this on another Ubuntu machine with 14.04 64-bit
I'm a bit confused about this step:

> 2. While having two tabs with scroll-able content opened, drag one of them
> to a new window and without moving mouse position press the mouse middle
> button 

When I tried this, after the drag the mouse was over the tab bar of the new window. If I don't move the mouse and middle-click in that location, middle-clicking will open a new tab, which is the expected behaviour.

Perhaps you could clarify this step, or post a screencast?
Thanks for the screencast. Based on it, I can follow your STR (in particular, the issue described in comment 1 is solved by positioning the mouse lower in the screen when dropping the tab). 

Unfortunately, I cannot reproduce the buggy behaviour - for me, both times the autoscroll indicator appears at the same position as the mouse pointer.

A regression range might be useful and shed some light on the issue.
I've reproduced the issue on latest Nightly with a clean profile.

Unfortunately, I cannot provide the regression range as some of the 2015-2016 Nightly builds won't open in Ubuntu, neither from mozregression tool nor downloaded locally.
(In reply to Petruta Rasa [QA] [:petruta] from comment #4)
> Unfortunately, I cannot provide the regression range as some of the
> 2015-2016 Nightly builds won't open in Ubuntu, neither from mozregression
> tool nor downloaded locally.

Does using the "skip" option on the problematic builds in mozregression help at all?
Whiteboard: [gfx-noted]
Sorry for the late reply.

I managed to reproduce the issue on a different Ubuntu machine, 14.04 x32 - on 32-bit systems the old builds are opened.

It appears this is a 2015 regression, I couldn't narrow it more since the inbound build are no longer available. Hope it helps.

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b68eab795f9de072bee12821b0f09422e5aa0da9&tochange=0b69d304f861d0038fb78f1d52b0f5d13ef7c6fe

If the link above doesn't have an immediate suspect, I'm pasting the longer pushlog:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d01dd42e654b8735d86f9e7c723cc869a3b56798&tochange=0b69d304f861d0038fb78f1d52b0f5d13ef7c6fe
(In reply to Petruta Rasa [QA] [:petruta] from comment #6)
> It appears this is a 2015 regression, I couldn't narrow it more since the
> inbound build are no longer available. Hope it helps.
> 
> https://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=b68eab795f9de072bee12821b0f09422e5aa0da9&tochange=0b69
> d304f861d0038fb78f1d52b0f5d13ef7c6fe

Bug 1209774 in that pushlog looks like it might be relevant. It seems to be specific to HiDPI Linux devices.

Petruta, are you testing with a HiDPI screen? (I am not, so that would explain why I couldn't reproduce it.)

One way to check is:

  - Open Firefox to any page
  - Make sure the browser zoom is at 100%
  - Open the Developer Tools Console with Ctrl+Shift+K
  - Type "window.devicePixelRatio", and see what it outputs

On a HiDPI device, it should output something higher than 1; otherwise, it should output 1.
Flags: needinfo?(petruta.rasa)
Hi Botond!

My screen is not HiDPI. When I can't reproduce in the first try with a clean profile, I try again for several times, usually I can see it pretty quick.

In attachment, I captured the autoscroll indicator under Ubuntu 14.04 32-bit. It copies the text from below the icon when mouse wheel is pressed.
Flags: needinfo?(petruta.rasa)
I tried again on a machine running Ubuntu, and I could reproduce the problem. I was previously testing on Debian, so it might be an issue specific to Ubuntu (or possibly specific to the Unity desktop environment).
Assignee: nobody → botond
Once we have a fix, we can decide if it should be uplifted.
I've been investigating this, and traced it down to the WidgetToScreenOffset()s of the child and parent process widgets not agreeing (beyond the amount by which they are expected to differ, i.e. the height of the browser chrome). I need to investigate some more to figure out exactly how this discrepancy arises.
I do now understand why APZ vs. main-thread autoscrolling makes a difference:

  - With APZ autoscrolling, the anchor location is computed by the child 
    process, while the mouse location is computed by the parent process.

  - With main-thread autoscrolling, both the anchor location and the
    mouse location are computed by the child process.

The discrepancy only becomes relevant when the two locations are computed by two different processes (since it is the delta between the two locations that causes scrolling).
(The visual location of the anchor, meanwhile, is always computed by the parent process. This is why the anchor is visually in the wrong place even with main-thread scrolling, but we don't actually get the scrolling.)
(In reply to Botond Ballo [:botond] from comment #11)
> I've been investigating this, and traced it down to the
> WidgetToScreenOffset()s of the child and parent process widgets not agreeing
> (beyond the amount by which they are expected to differ, i.e. the height of
> the browser chrome). I need to investigate some more to figure out exactly
> how this discrepancy arises.

The child process widget computes its widget-to-screen offset by adding together the following quantities:

  - The origin of the "tab dimensions", which is essentially the 
    offset of the browser window from the screen.

  - The "client offset", which is the offset of the browser
    window's client area from the window itself.

  - The "chrome dimensions", which is the size of the browser
    chrome, which offsets the child process widget from the
    browser window's client area.

The parent process widget's computation of the widget-to-screen offset is platform-specific; on GTK it happens by calling a GDK API (gdk_window_get_origin) directly.

As mentioned, the two process' widget-to-screen offsets are expected to differ by the "chrome dimensions", since the parent process widget includes the chrome area while the child process widget does not.

The observed discrepancy arises because the child process's "client offset" is (0,0), while the parent process's widget-to-screen offset reflects a client offset of (0,28).

The child process widget gets the client offset from TabChild, which gets its from TabParent via the PBRowser::UpdateDimensions IPC message. TabParent gets its from the parent process widget. On GTK, the parent process widget updates its client offset when it receives a GDK "_NET_FRAME_EXTENTS" property event.

It appears that the parent process widget does have the correct (0,28) client offset, but in between it receiving it and the child process querying the client offset, the parent does not sent an UpdateDimensions message to tell the child about the client offset.

I'm not sure yet how this is expected to work in general. Perhaps the parent process widget updating its client offset should trigger a PBrowser::UpdateDimensions message?
(In reply to Botond Ballo [:botond] from comment #14)
> I'm not sure yet how this is expected to work in general. Perhaps the parent
> process widget updating its client offset should trigger a
> PBrowser::UpdateDimensions message?

It looks like on Debian, shortly after receiving the "_NET_FRAME_EXTENTS" property event that sets the client offset to (0,28), the parent process widget receives a configure event (via configure_event_cb), which triggers a MozUpdateWindowPos event (via nsBaseWidget::NotifyWindowMoved() which calls nsWebShellWindow::WindowMoved()). TabParent listens to the MozUpdateWindowPos event, and when it receives one it sends the PBrowser::UpdateDimensions message that tells the child process about the client offset.

On Ubuntu, the configure event arrives _before_ the "_NET_FRAME_EXTENTS" property event - when the client offset is still at its old value of (0,0) - and thus does not trigger propagation of the correct offset to the child process. This is what appears to be different between Debian and Ubuntu, which causes the bug to reproduce on Ubuntu and not on Debian.

Karl, do you know whether the ordering between these events is something that we intend to be relying on? If not, should we add some other mechanism to ensure that once a "_NET_FRAME_EXTENTS" property event informs the parent process widget of the client offset, that information is promptly propagated to the child process?
Flags: needinfo?(karlt)
Thank you for all the analysis, Botond.  It certainly is helpful to have this
written down.

(In reply to Botond Ballo [:botond] from comment #15)
> It looks like on Debian, shortly after receiving the "_NET_FRAME_EXTENTS"
> property event that sets the client offset to (0,28), the parent process
> widget receives a configure event (via configure_event_cb), which triggers a
> MozUpdateWindowPos event (via nsBaseWidget::NotifyWindowMoved() which calls
> nsWebShellWindow::WindowMoved()). TabParent listens to the
> MozUpdateWindowPos event, and when it receives one it sends the
> PBrowser::UpdateDimensions message that tells the child process about the
> client offset.
> 
> On Ubuntu, the configure event arrives _before_ the "_NET_FRAME_EXTENTS"
> property event - when the client offset is still at its old value of (0,0) -
> and thus does not trigger propagation of the correct offset to the child
> process. This is what appears to be different between Debian and Ubuntu,
> which causes the bug to reproduce on Ubuntu and not on Debian.
> 
> Karl, do you know whether the ordering between these events is something
> that we intend to be relying on?

I think you are right to suspect that Gecko is relying on this ordering, but
doing so is making assumptions that are not necessarily correct.

It's probably reasonable for the client to assume that _NET_FRAME_EXTENTS is
set before the window is mapped, but the window would also be configured
before mapped, and I suspect the ordering of _NET_FRAME_EXTENTS and configure
is not defined.  WindowMoved() is called on configure.

Even if that order were defined, the client offset is permitted to change over
time, as implied by "The client can track changes to the frame's dimensions by
listening for _NET_FRAME_EXTENTS PropertyNotify events." [1]  I don't know
whether or not there exist window managers that actually change
_NET_FRAME_EXTENTS.

Gecko seems to assume that a change to _NET_FRAME_EXTENTS would also trigger a
configure event, but that's not necessarily the case, and the changes may not
be in the desired order.

> If not, should we add some other mechanism
> to ensure that once a "_NET_FRAME_EXTENTS" property event informs the parent
> process widget of the client offset, that information is promptly propagated
> to the child process?

Yes, OnPropertyNotifyEvent() should send a notification after calling
UpdateClientOffset().

I don't see an existing specific notification on nsIWidgetListener, but, as
you imply, WindowMoved() would work, provided there is no optimization for
repeat notification of the same coords.

Something else to watch out for is that WidgetEvent::mRefPoint is relative to
the client window origin which is at the client offset relative to the "outer
window" origin described for nsIWidgetListener::WindowMoved() parameters.
I don't know whether or not this is consistent with other ports, but I guess
events in the content process have reference points relative to the child
process "widget", and so the correction would be performed somewhere in the
parent/browser process.

[1] https://specifications.freedesktop.org/wm-spec/wm-spec-latest.html
Flags: needinfo?(karlt)
Thanks, Karl. I wrote a fix based on what you suggested; I'll test it on Friday when I'm in the office and near a Ubuntu machine.
The fix seems to be working well. Posting for review.
Attachment #8919840 - Flags: review?(karlt)
Comment on attachment 8919840 [details]
Bug 1400238 - Notify TabParent when the GTK client offset changes.

https://reviewboard.mozilla.org/r/190778/#review197432
Attachment #8919840 - Flags: review?(karlt) → review+
Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/76300bd1f7c1
Notify TabParent when the GTK client offset changes. r=karlt
While this fix doesn't seem particularly risky, the full set of implications of a new notification message like this is hard to understand exhaustively. Given that the STR for triggering this issue is not super-common (it (1) happens on some but not all Linux variants; (2) requires opening a new window; and (3) only affects the interval between opening the new window, and something else triggering an UpdateDimensions message), my preference is to let this fix ride the 58 train rather than uplifting to 57.

I'm happy to reconsider if someone feels strongly that we should uplift.
https://hg.mozilla.org/mozilla-central/rev/76300bd1f7c1
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
I could no longer reproduce the initial issue on Nightly 58.0a1 2017-11-06 under Ubuntu 14.04 32-bit LTS and Ubuntu 16.04 64-bit LTS.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.