Closed Bug 1565525 Opened 3 months ago Closed 2 months ago

browser.windows.create() using tabId and 'panel' type produces shifted active areas

Categories

(Core :: Panning and Zooming, defect, P1)

67 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla70
Fission Milestone M5
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- wontfix
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 --- verified

People

(Reporter: vvendigo, Assigned: botond)

References

(Blocks 2 open bugs, Regression)

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0

Steps to reproduce:

In my webextension I want to move tab into the new panel window.
It worked just fine for last two years.

Minimal example:

browser.tabs.query({currentWindow: true}).then((tabs) => {
    browser.windows.create({
        tabId: tabs[tabs.length-1].id,
        type: "detached_panel",
    })
});

Actual results:

Resulting window document has all active areas (of inputs, hrefs, buttons etc.) shifted downwards (by height of source window's address/tab bars I guess).
I observed some textarea rendering error, other elements are looking OK.
The same problem occurs when I am using 'popup' or 'panel' window type.

Expected results:

Panel window containing document with properly positioned active areas. :)

Could you please add an example extension that encapsulates the minimal example and also reproduces the problem?

From comment 0, I assume this is a regression, could you please state what is the last version it worked properly on? Your UA states Fx67, but I'm uncertain if the problem is in Fx67 or not. The OSes affected (to your knowledge) would probably help as well.
Alternatively, you could help us more in actually finding the regression range (if this is a regression as your comment implies: "It worked just fine for last two years.") -> https://mozilla.github.io/mozregression/

Component: Untriaged → Frontend
Flags: needinfo?(vvendigo)
Product: Firefox → WebExtensions
Attached file Simple tab to panel moving extension (obsolete) —

Minimal bug reproducing webextension example.

Flags: needinfo?(vvendigo)

I attached example extension.
It affects Windows and Linux version of FF. (Didn't tested other.)
I am not sure in which version problem started. Seems like it was 67 or 68. First bugreport from our users arrived someday last week (2019-07-8 - 2019-07-12). I am not in direct touch with them.

And one random observation: the bug doesn't affect tab containing 'about:debugging'.

Whiteboard: webext?
Flags: needinfo?(rob)
Attached file input-shifted.zip

The previous test case couldn't be loaded, so I created a new test case and used it to bisect the issue. The same test case can also be used to verify the bug.

STR:

  1. Load the extension, e.g. via about:debugging.
  2. The extension will open a new tab. Click on the button to move it to a separate window.
  3. Try to focus the input field (or the button, or select text).

Expected:

  • The input field can be focused by clicking on it.

Actual:

  • The input field can only be focused by clicking below it.
Flags: needinfo?(rob)

Comment on attachment 9078360 [details]
Simple tab to panel moving extension

This test case has an invalid manifest file. The homepage_url field cannot be empty. I've created a reduced test case and attached it in comment 4.

Attachment #9078360 - Attachment is obsolete: true
Has Regression Range: --- → yes
Has STR: --- → yes
Keywords: regression
Regressed by: 1530661
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Frontend → Panning and Zooming
Product: WebExtensions → Core
Whiteboard: webext?

Setting status flags based on the regression range.

In a debug build, an assertion fails at https://searchfox.org/mozilla-central/rev/0ffa9e372df56c95547fed9c3433ddec4fbf6f11/gfx/layers/ipc/CompositorBridgeParent.cpp#1704 with the stack:

#0  0x00007fc8da517476 in mozilla::layers::CompositorBridgeParent::RecvAdoptChild(mozilla::layers::LayersId const&) (this=0x7fc8b7405800,
 child=...) at /opt/Projects/gecko/gfx/layers/ipc/CompositorBridgeParent.cpp:1704
#1  0x00007fc8d9321419 in mozilla::layers::PCompositorBridgeParent::OnMessageReceived(IPC::Message const&) (this=0x7fc8b7405800, msg__=..
.) at PCompositorBridgeParent.cpp:1092
#2  0x00007fc8d932c565 in mozilla::layers::PCompositorManagerParent::OnMessageReceived(IPC::Message const&) (this=0x7fc8c5507000, msg__=.
..) at PCompositorManagerParent.cpp:200
#3  0x00007fc8d91ad20f in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) (th
is=0x7fc8c5507120, aProxy=0x7fc8c7a2f580, aMsg=...) at /opt/Projects/gecko/ipc/glue/MessageChannel.cpp:2184
#4  0x00007fc8d91ab9dc in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) (this=0x7fc8c5507120, aMsg=...) at /opt/Projects/
gecko/ipc/glue/MessageChannel.cpp:2108
#5  0x00007fc8d91ac27c in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) (this=0x7fc8c5507120, aTas
k=...) at /opt/Projects/gecko/ipc/glue/MessageChannel.cpp:1955
#6  0x00007fc8d91ac815 in mozilla::ipc::MessageChannel::MessageTask::Run() (this=0x7fc8afa4dc50) at /opt/Projects/gecko/ipc/glue/MessageC
hannel.cpp:1986
#7  0x00007fc8d90c08a9 in MessageLoop::RunTask(already_AddRefed<nsIRunnable>) (this=0x7fc8c84fdca8, aTask=...) at /opt/Projects/gecko/ipc
/chromium/src/base/message_loop.cc:442
#8  0x00007fc8d90c1066 in MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask&&) (this=0x7fc8c84fdca8, pending_task=...) at /opt/
Projects/gecko/ipc/chromium/src/base/message_loop.cc:450
#9  0x00007fc8d90c12a0 in MessageLoop::DoWork() (this=0x7fc8c84fdca8) at /opt/Projects/gecko/ipc/chromium/src/base/message_loop.cc:523
#10 0x00007fc8d90c2138 in base::MessagePumpDefault::Run(base::MessagePump::Delegate*) (this=0x7fc8ce135190, delegate=0x7fc8c84fdca8) at /
opt/Projects/gecko/ipc/chromium/src/base/message_pump_default.cc:35
#11 0x00007fc8d90c06ef in MessageLoop::RunInternal() (this=0x7fc8c84fdca8) at /opt/Projects/gecko/ipc/chromium/src/base/message_loop.cc:3
15
#12 0x00007fc8d90c0665 in MessageLoop::RunHandler() (this=0x7fc8c84fdca8) at /opt/Projects/gecko/ipc/chromium/src/base/message_loop.cc:30
8
#13 0x00007fc8d90c061a in MessageLoop::Run() (this=0x7fc8c84fdca8) at /opt/Projects/gecko/ipc/chromium/src/base/message_loop.cc:290
#14 0x00007fc8d90e23a8 in base::Thread::ThreadMain() (this=0x7fc8ce1f7470) at /opt/Projects/gecko/ipc/chromium/src/base/thread.cc:192
#15 0x00007fc8d90c772e in ThreadFunc(void*) (closure=0x7fc8ce1f7470) at /opt/Projects/gecko/ipc/chromium/src/base/platform_thread_posix.c
c:40
#16 0x00007fc8f02266db in start_thread (arg=0x7fc8c84fe700) at pthread_create.c:463
#17 0x00007fc8ef40c88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

It looks like this needs a look from the graphics team. The non-debug build manifestation is likely just fallout from executing past this assertion.

Flags: needinfo?(hsivonen) → needinfo?(botond)

(In reply to Henri Sivonen (:hsivonen) from comment #8)

The non-debug build manifestation is likely just fallout from executing past this assertion.

I don't think so, as this assertion occurs even in builds before the landing of the regressing bug.

This assertion firing is probably still worth investigating (in a separate bug), but it doesn't look like it's the cause of this bug.

Flags: needinfo?(botond) → needinfo?(hsivonen)
See Also: → 1573795
Blocks: 1573795
See Also: 1573795
Priority: -- → P1

Henri, do you think we should consider backing out bug 1530661 until this is resolved? It's a Fission-motivated change that is breaking non-Fission scenarios.

Duplicate of this bug: 1573795

(In reply to Botond Ballo [:botond] from comment #10)

Henri, do you think we should consider backing out bug 1530661 until this is resolved? It's a Fission-motivated change that is breaking non-Fission scenarios.

Backing it out would break Fission pretty badly, so I think backing out would be bad.

(In reply to Botond Ballo [:botond] from comment #9)

(In reply to Henri Sivonen (:hsivonen) from comment #8)

The non-debug build manifestation is likely just fallout from executing past this assertion.

I don't think so, as this assertion occurs even in builds before the landing of the regressing bug.

This assertion firing is probably still worth investigating (in a separate bug), but it doesn't look like it's the cause of this bug.

I commented out the assertion, and then another more serious-looking assertion fires:
https://searchfox.org/mozilla-central/rev/3366c3d24f1c3818df37ec0818833bf085e41a53/gfx/layers/ipc/CompositorBridgeParent.cpp#1749

I think this needs investigation by someone on the graphics team who, unlike me, understands how APZ and the compositor relate to popups and who can assess the implications of these pre-existing assertion failures on this code path.

Flags: needinfo?(hsivonen) → needinfo?(botond)

(In reply to Henri Sivonen (:hsivonen) from comment #12)

I commented out the assertion, and then another more serious-looking assertion fires:
https://searchfox.org/mozilla-central/rev/3366c3d24f1c3818df37ec0818833bf085e41a53/gfx/layers/ipc/CompositorBridgeParent.cpp#1749

Notably, the comment above the assertion says:

    // We don't support moving a child from an APZ-enabled compositor to a
    // APZ-disabled compositor. The mOptions assertion above should already
    // ensure this, since APZ-ness is one of the things in mOptions. Note
    // however it is possible for mApzUpdater to be non-null here with
    // oldApzUpdater null, because the child may not have been previously
    // composited.

Which hints at a deeper non-Fission bug with popup creation being the root cause and the Fission change merely made it visible.

(In reply to Botond Ballo [:botond] from comment #10)

Henri, do you think we should consider backing out bug 1530661 until this is resolved?

The code from that patch was originally behind a pref on Windows:
https://hg.mozilla.org/mozilla-central/rev/91d0c9066fd1#l5.55

I guess we could re-insert a pref check for fission.autostart at that point of the code as band-aid.

See Also: → 1574089

(In reply to Henri Sivonen (:hsivonen) from comment #14)

I guess we could re-insert a pref check for fission.autostart at that point of the code as band-aid.

Filed as bug 1574089.

(In reply to Henri Sivonen (:hsivonen) from comment #15)

(In reply to Henri Sivonen (:hsivonen) from comment #14)

I guess we could re-insert a pref check for fission.autostart at that point of the code as band-aid.

Filed as bug 1574089.

That sounds like a good plan, thanks.

As dicussed with Jessie, we'll look at this for Fission M5.

Fission Milestone: --- → M5
Flags: needinfo?(botond)

What's likely happening here is that the tab is getting moved from an APZ-enabled compositor to an APZ-disabled compositor (as evidenced by the assertions getting tripped) which is something we currently don't properly support in the code. In non-debug builds we proceed anyway, and the tab get stuck with stale matrices, because there would be nothing in the new APZ-disabled compositor updating those matrices. So a workaround might be to modify the MatrixMessage type to carry a Maybe<Matrix> instead of a Matrix, and then if a tab is moved from APZ-enabled to APZ-disabled, we use that to clear the matrix on the content side. This would be done by something like this:

if (oldApzUpdater && !mApzUpdater) {
  // workaround for bug 1565525
  nsTArray<MatrixMessage> clear;
  clear.AppendElement(MatrixMessage(Nothing(), child));
  if (GeckoContentController* cont = GetGeckoContentControllerForRoot(child)) {
    cont->NotifyLayerTransforms(clear);
  }
}

at this point. On the content we'd set the matrix to Nothing() here instead of setting it to Some(matrix).

Also given that the STR is a legitimate use case, we need to modify the graphics stack to properly support moving a tab from APZ-enabled to APZ-disabled compositors (and vice-versa) so this workaround isn't so much a workaround as a step in that direction. Another option would be to enable APZ on compositors for panel windows which may or may not be covered by bug 1493208 (I've lost track of the specific scenarios that bug covers).

(In reply to Kartikaya Gupta (email:kats@mozilla.com) (away until Feb-2020) from comment #17)

Another option would be to enable APZ on compositors for panel windows which may or may not be covered by bug 1493208 (I've lost track of the specific scenarios that bug covers).

I believe it would be covered by bug 1493208, since "panel" and "popup" are equivalent.

See Also: → 1493208

(In reply to Kartikaya Gupta (email:kats@mozilla.com) (away until Feb-2020) from comment #17)

So a workaround might be to modify the MatrixMessage type to carry a Maybe<Matrix> instead of a Matrix, and then if a tab is moved from APZ-enabled to APZ-disabled, we use that to clear the matrix on the content side.

I tried to implement this, but ran into the issue that if we don't have APZ, we don't have a GeckoContentController either.

(In reply to Botond Ballo [:botond] from comment #19)

I tried to implement this, but ran into the issue that if we don't have APZ, we don't have a GeckoContentController either.

I guess this can be worked around by using the GeckoContentController from the old compositor.

Ok, I've implemented the suggested fix locally and it does fix the STR from comment 4.

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a87fc4698fdc88821ab1c5024a316639dd803abc

Nice :) and yeah, the controller from the old APZ is the right one to use. I didn't realize it would change but that makes sense.

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a7651a98ea51
Add helper functions to make working with Maybe<Matrix> easier. r=hsivonen
https://hg.mozilla.org/integration/autoland/rev/6b1a0a29644e
Modify MatrixMessage to carry a Maybe<Matrix>, so that a transform can be cleared by sending Nothing. r=hsivonen
https://hg.mozilla.org/integration/autoland/rev/71b6b2fc33cb
Clear layer transforms when a tab is moved from an APZ-enabled compositor to an APZ-disabled compositor. r=hsivonen
Assignee: nobody → botond
Regressions: 1575498
Blocks: 1579896

Thanks for the fix suggestion Kats! I had to extend it a bit in bug 1576524 for WebRender but otherwise it seems to be working well.

To the bug reporter, sorry we couldn't get the fix into 69, it was too close to the release date to take a somewhat risky fix.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) (away until Feb-2020) from comment #17)

Also given that the STR is a legitimate use case, we need to modify the graphics stack to properly support moving a tab from APZ-enabled to APZ-disabled compositors (and vice-versa) so this workaround isn't so much a workaround as a step in that direction.

Filed bug 1579896 for the full fix here.

Flags: qe-verify+
Blocks: 1576524

I’ve reproduced this issue with Fx 70.0a1 (2019-07-24) on Windows 10 x64 using the information from comment 4.
The issue is verified fixed with Fx 71.0a1 (2019-09-23) and Fx 70.0b9 on Windows 10 x64, macOS 10.13 and Ubuntu 18.04 x64.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.