Open Bug 1555497 Opened 4 months ago Updated 3 hours ago

Remove menupopup binding

Categories

(Toolkit :: XUL Widgets, task)

task
Not set

Tracking

()

ASSIGNED

People

(Reporter: surkov, Assigned: surkov, NeedInfo)

References

(Blocks 2 open bugs)

Details

(Keywords: leave-open)

Attachments

(5 files)

bug 1531870 added a base class, bug 1539651 is about to kill derived bindings, so when bug 1539651 is done, then this one can be fixed:

Depends on: 1539651, 1531870
Summary: convert menupoup to custom element → convert menupoup to a custom element
Summary: convert menupoup to a custom element → Remove menupopup binding
Type: defect → task
Assignee: nobody → surkov.alexander

There are keyboard navigation in popups issue with the patch applied. The problem is because XULPopupManager can't find menuitems [1], menupopup->PrincipalChildList().FirstChild() returns an arrowscrollbox both when arrowscrollbox in shadow DOM content or if it's explicit. It appears, it works in case of XBL bindings.

Emilio, do you have ideas how to workaround that and any estimation on how many similar dependencies in code we have?

[1] https://searchfox.org/mozilla-central/source/layout/xul/nsXULPopupManager.cpp#2267

Flags: needinfo?(emilio)

So it'd make sense for me that it finds the arrowscrollbox, but there's some suspect code in XULPopupManager for sure.

All the calls to GetUncomposedDoc() are not going to work in Shadow DOM. All the calls to GetParent() on an element or nsIContent (note: not on an nsIFrame) may or may not need to use GetFlattenedTreeParent.

All the sibling stuff may or may not need to look at shadow trees / FlattenedChildIterator.

Can you tell me which patch to apply and how to repro the brokenness? then I can try to give more details.

Flags: needinfo?(emilio)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #3)

Can you tell me which patch to apply and how to repro the brokenness? then I can try to give more details.

sure

  • apply patch https://phabricator.services.mozilla.com/D33821 from bug 1539651 (that bug is a twin of this one for places popup only),
  • then if no bookmarks menu on your toolbar, then drag-n-drop it from Customize...
  • then click on the bookmarks menu, and arrow down it -> selection is not changed

also, while you are here, could you kindly take a look why bookmarks menu is shown at wrong position? 1) is aligned differently than a popup from original build and 2) popup arrow is also positioned incorrectly. I suspect there are also some broken hierarchical expectations in popup manager.

Flags: needinfo?(emilio)

The patch I sent fixes the arrow navigation. That's the only caller of that nsBindingManager function, so I think that should be good.

I haven't dug in the popup positioning, but debugging the result of nsMenuPopupFrame::ComputeAnchorRect shouldn't be terribly hard. I don't obviously see anything dependent on shadow DOM there, so may be some styling change in your patch. Let me know if you want me to dig deeper into that and sorry for the lag here.

Flags: needinfo?(emilio)
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/edfe64f42693
Make the code in nsXULPopupManager that finds the default insertion point for children handle Shadow DOM as well. r=NeilDeakin
Pushed by emilio@crisal.io:
https://hg.mozilla.org/integration/autoland/rev/69ac56f66e26
followup: Add missing include to unbust builds in a CLOSED TREE
Keywords: leave-open

a11y failure [1]: accessible/tests/mochitest/actions/test_general.xul | Can't get accessible for 'submenu'

It fails because menupopup doesn't find slotted children, filed bug 1563275 on the issue.

Depends on: 1563275
Blocks: 1554627

It seems the patch makes Firefox to crash [1] on Linux asan build [2], which I don't see on central.

#0 0x7f0e9d3ebe02 in ResumeLoad /builds/worker/workspace/build/src/dom/ipc/BrowserParent.cpp:871:7
[task 2019-08-23T13:56:51.593Z] 13:56:51 INFO - GECKO(1090) | #1 0x7f0e9d3ebe02 in mozilla::dom::BrowserBridgeParent::RecvResumeLoad(unsigned long) /builds/worker/workspace/build/src/dom/ipc/BrowserBridgeParent.cpp:152
[task 2019-08-23T13:56:51.633Z] 13:56:51 INFO - GECKO(1090) | #2 0x7f0e9706bdfd in mozilla::dom::PBrowserBridgeParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/build/src/obj-firefox/ipc/ipdl/PBrowserBridgeParent.cpp:341:63
[task 2019-08-23T13:56:51.689Z] 13:56:51 INFO - GECKO(1090) | #3 0x7f0e969b9a08 in mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/build/src/obj-firefox/ipc/ipdl/PContentParent.cpp:5794:32
[task 2019-08-23T13:56:51.705Z] 13:56:51 INFO - GECKO(1090) | #4 0x7f0e9677c620 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) /builds/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:2184:25
[task 2019-08-23T13:56:51.705Z] 13:56:51 INFO - GECKO(1090) | #5 0x7f0e967789ad in mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&) /builds/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:2108:9
[task 2019-08-23T13:56:51.705Z] 13:56:51 INFO - GECKO(1090) | #6 0x7f0e9677a5fb in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) /builds/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:1955:3
[task 2019-08-23T13:56:51.705Z] 13:56:51 INFO - GECKO(1090) | #7 0x7f0e9677abd7 in mozilla::ipc::MessageChannel::MessageTask::Run() /builds/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:1986:13
[task 2019-08-23T13:56:51.725Z] 13:56:51 INFO - GECKO(1090) | #8 0x7f0e956871c0 in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1225:14
[task 2019-08-23T13:56:51.725Z] 13:56:51 INFO - GECKO(1090) | #9 0x7f0e9568d208 in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:486:10
[task 2019-08-23T13:56:51.725Z] 13:56:51 INFO - GECKO(1090) | #10 0x7f0e9678395a in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /builds/worker/workspace/build/src/ipc/glue/MessagePump.cpp:88:21
[task 2019-08-23T13:56:51.741Z] 13:56:51 INFO - GECKO(1090) | #11 0x7f0e966a24a2 in RunInternal /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:315:10
[task 2019-08-23T13:56:51.741Z] 13:56:51 INFO - GECKO(1090) | #12 0x7f0e966a24a2 in RunHandler /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:308
[task 2019-08-23T13:56:51.741Z] 13:56:51 INFO - GECKO(1090) | #13 0x7f0e966a24a2 in MessageLoop::Run() /builds/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:290
[task 2019-08-23T13:56:51.745Z] 13:56:51 INFO - GECKO(1090) | #14 0x7f0e9dde3c29 in nsBaseAppShell::Run() /builds/worker/workspace/build/src/widget/nsBaseAppShell.cpp:137:27
[task 2019-08-23T13:56:51.748Z] 13:56:51 INFO - GECKO(1090) | #15 0x7f0ea18369e0 in nsAppStartup::Run() /builds/worker/workspace/build/src/toolkit/components/startup/nsAppStartup.cpp:276:30
[task 2019-08-23T13:56:51.755Z] 13:56:51 INFO - GECKO(1090) | #16 0x7f0ea1ab28c3 in XREMain::XRE_mainRun() /builds/worker/workspace/build/src/toolkit/xre/nsAppRunner.cpp:4569:22
[task 2019-08-23T13:56:51.756Z] 13:56:51 INFO - GECKO(1090) | #17 0x7f0ea1ab49ee in XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/workspace/build/src/toolkit/xre/nsAppRunner.cpp:4707:8
[task 2019-08-23T13:56:51.757Z] 13:56:51 INFO - GECKO(1090) | #18 0x7f0ea1ab69fe in XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/workspace/build/src/toolkit/xre/nsAppRunner.cpp:4788:21
[task 2019-08-23T13:56:51.777Z] 13:56:51 INFO - GECKO(1090) | #19 0x557e68ece459 in do_main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:213:22
[task 2019-08-23T13:56:51.777Z] 13:56:51 INFO - GECKO(1090) | #20 0x557e68ece459 in main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:295
[task 2019-08-23T13:56:51.866Z] 13:56:51 INFO - GECKO(1090) | #21 0x7f0eb682682f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291
[task 2019-08-23T13:56:51.866Z] 13:56:51 INFO - GECKO(1090) | #22 0x557e68defdb8 in _start (/builds/worker/workspace/build/application/firefox/firefox+0x40db8)
[task 2019-08-23T13:56:51.866Z] 13:56:51 INFO - GECKO(1090) | AddressSanitizer can not provide additional info.

Emilio, could you take a look at the issue?

[1] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=263161311&repo=try&lineNumber=1833
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=d85a04c10f80532298098e2a71ffb8f3c2e5007d&selectedJob=263161311

Flags: needinfo?(emilio)

That seems fission kind of stuff, nothing I'd expect to be related with your patch.

I'd retry it a couple times and ni? some of the fission folks if reproducible.

Flags: needinfo?(emilio)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #12)

That seems fission kind of stuff, nothing I'd expect to be related with your patch.

yeah, it shouldn't be, thought though it may reveal some existing bugs

I'd retry it a couple times

I did already. Two more failing builds:
https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=262930282&revision=648b313ee883e06a5d78db281e2130911e0b0f3d
https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=262996039&revision=d157b482cf47d695c133241f5095c8e8b23f29db

and ni? some of the fission folks if reproducible.

could you recommend anyone?

For BrowserParent.cpp, with nika still on PTO, I'd suggest farre.

Flags: needinfo?(afarre)

I attempted to reproduce the linux64-asan test failure on my Ubuntu VM to try to get it recorded in rr, but I wasn't able to reproduce. :/

(In reply to Mike Conley (:mconley) (:⚙️) from comment #15)

I attempted to reproduce the linux64-asan test failure on my Ubuntu VM to try to get it recorded in rr, but I wasn't able to reproduce. :/

curious, if there is an option to record rr on a try server?

(In reply to alexander :surkov (:asurkov) from comment #16)

curious, if there is an option to record rr on a try server?

Not that I'm aware of... there is roc's pernosco that maybe could help here? Hey emilio, do you know if pernosco could be at all useful here?

Also, I just realized that my Linux ASAN build is a non-opt debug build, and the failure occurred on an opt build, so I'm rebuilding my ASAN with the opt build flags and will try again.

Flags: needinfo?(emilio)

No dice reproducing with ASAN opt either. :(

You could try chaos mode in one or both of rr and firefox.

(In reply to Mike Conley (:mconley) (:⚙️) from comment #17)

(In reply to alexander :surkov (:asurkov) from comment #16)

curious, if there is an option to record rr on a try server?

Not that I'm aware of... there is roc's pernosco that maybe could help here? Hey emilio, do you know if pernosco could be at all useful here?

Also, I just realized that my Linux ASAN build is a non-opt debug build, and the failure occurred on an opt build, so I'm rebuilding my ASAN with the opt build flags and will try again.

I suspect they could probably jump in and help if 100% necessary, but I think for pernosco to repro the failure it should be reproducible in rr.

What I'd do is trying to reproduce it locally (with rr record --chaos and/or MOZ_CHAOSMODE set as appropriate, as Tymothy suggested), and then uploading the local trace to Pernosco using pernosco-submit.

If that doesn't work Roc may be able to dig into it more I guess, or try to repro it on a pernosco instance or something. IIRC they use ~the same configuration as automation so if something in that makes a difference...

Flags: needinfo?(emilio)

No luck reproducing with MOZ_CHAOSMODE over here.

Flags: needinfo?(afarre)
Depends on: 1573158

(In reply to alexander :surkov (:asurkov) from comment #22)

btw, I don't see asan failures on a fresh try server anymore https://treeherder.mozilla.org/#/jobs?repo=try&revision=14281b8fa1f5b41af44d3baf19199c6eca5dd968

(╯°□°)╯︵ ┻━┻

Welp, I'm glad it's working now - still, kinda disturbing...

Comment on attachment 9090181 [details] [diff] [review]
menupopup-shadowdom-slotting-close-issue.patch

Review of attachment 9090181 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/content/widgets/menupopup.js
@@ +27,5 @@
> +        // inline it just never appears to open the first time.
> +        setTimeout(() => {
> +          if (!this.shadowRoot) {
> +            this.attachShadow({ mode: "open" });
> +            this.shadowRoot.appendChild(this.fragment);

Emilio, in order to fix performance to allow this to land I tried an approach where I delay attaching the shadow root until popupshowing happens (without this we see a bunch of startup regressions which is I guess a result of all the DOM manipulation on the 50+ menupopups in browesr.xhtml).

I noticed a problem though: the menupopup closes immediately if it's opened the first time when the shadow root gets attached.

Here I've put it in a timeout so you can see that it opens and then gets closed. With this patch applied, it can be reproduced by right clicking anywhere. The context menu will open (unslotted) and then 1 second later it will go away.

Does anything jump out at you? Wondering if this is a dead end for optimization because it's going to be too big of a pain to make it work correctly.

needinfo for Comment 26

Flags: needinfo?(emilio)

Alternatively, I'm curious if there's anything that could be improved perf-wise with attachShadow / slotting that would allow us to not need to lazify appending (for example the patch on this bug which eagerly attaches the shadow dom: https://bugzilla.mozilla.org/attachment.cgi?id=9068977).

That's probably a side-effect of the popup frame getting destroyed for the new Shadow DOM content. This is probably the culprit:

https://searchfox.org/mozilla-central/rev/9bb55ae4d808fc48afcf93f99da6a685265b86c6/layout/xul/nsMenuPopupFrame.cpp#2250

FWIW we have similar issues with the e10s select popup which we should fix (bug 1440506), though XUL popups are way more complex, I suspect.

So probably non-trivial to fix... Not 100% impossible, I guess if we move a bunch of state out of nsMenuPopupFrame, but that may be a pretty big change.

It may be the case that we're unnecessarily reframing the popup, but I'm not 100% hopeful for that... I can take a closer look tomorrow if you want to that, however, rather than working-around perf issues, it'd be great to know if we should do better at this than what we are...

(In reply to Brian Grinstead [:bgrins] from comment #28)

Alternatively, I'm curious if there's anything that could be improved perf-wise with attachShadow / slotting that would allow us to not need to lazify appending (for example the patch on this bug which eagerly attaches the shadow dom: https://bugzilla.mozilla.org/attachment.cgi?id=9068977).

Is there any chance you could create a test-case that reproduces the issue in a way that's easy to profile without all the other noise? I'm thinking just a simple HTML page with the same structure as the menupopup or such. We should compare with other browsers and see if we have subpar performance there, or see if something stands out.

The other thing that comes to mind that may be the case for the slowdown, specially if we're not too slow with regular HTML content is this line:

https://searchfox.org/mozilla-central/rev/9bb55ae4d808fc48afcf93f99da6a685265b86c6/layout/base/nsCSSFrameConstructor.cpp#6507

That's a hackaround for people relying on XBL bindings being eagerly bound that I can't wait to remove. It may or may not be making all those DOM operations do layout tree construction. Worth checking. Is there any profile of the perf regression to see where time is spent?

Flags: needinfo?(emilio)
See Also: → 1440506

keeping attachShadow in constructor and appending shadow DOM itself on popupshowing doesn't shows equally good results as keeping attachShadow in popupshowing, but it also delivers a bunch of improvements [1]. It makes try orange though [2]

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=3e234b40e82008e5acbe1aa24d6d9bc48fdc1b2b&newProject=try&newRevision=89bd9a1ce8d4a6891fa0f1e655d8f9723fc5ac76&framework=1&showOnlyConfident=1
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=d3603fcc59e8d3859d405511dcb905ecac36db80&selectedJob=264844838

(In reply to alexander :surkov (:asurkov) from comment #30)

keeping attachShadow in constructor and appending shadow DOM itself on popupshowing doesn't shows equally good results as keeping attachShadow in popupshowing, but it also delivers a bunch of improvements [1]. It makes try orange though [2]

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=3e234b40e82008e5acbe1aa24d6d9bc48fdc1b2b&newProject=try&newRevision=89bd9a1ce8d4a6891fa0f1e655d8f9723fc5ac76&framework=1&showOnlyConfident=1
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=d3603fcc59e8d3859d405511dcb905ecac36db80&selectedJob=264844838

Those numbers look really promising. Hopefully the test failures are due to tests assuming the anonymous elements are there before the popup is opened, or something like that. If so, we could either fix the tests to open the popup before accessing properties on it, or we could override the shadowRoot getter in JS and have it populate the root eagerly at that time.

here's a fresh talos https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=62e1e41ef662ae8a12dbde02539ade0fe9a4e190&newProject=try&newRevision=5267c411c26bc1c25f003b6c1a156f04412dc534&framework=1&showOnlyConfident=1, there are few high confidence improvements, and med confidence tscrollx opt e10s stylo down numbers. It doesn't seem though that CE could affect negatively on scrolling.

Is there any chance you could create a test-case that reproduces the issue in a way that's easy to profile without all the other noise? I'm thinking just a simple HTML page with the same structure as the menupopup or such. We should compare with other browsers and see if we have subpar performance there, or see if something stands out.

I made a test case here with similar markup and creating 100 elements with / without slotting. I'm not seeing much of a difference when loading this in a tab (maybe 1 ms or 2, nothing like we were seeing on talos), but I'm uploading this here for completeness.

The other thing that comes to mind that may be the case for the slowdown, specially if we're not too slow with regular HTML content is this line:

https://searchfox.org/mozilla-central/rev/9bb55ae4d808fc48afcf93f99da6a685265b86c6/layout/base/nsCSSFrameConstructor.cpp#6507

That's a hackaround for people relying on XBL bindings being eagerly bound that I can't wait to remove. It may or may not be making all those DOM operations do layout tree construction. Worth checking. Is there any profile of the perf regression to see where time is spent?

I did do a push to get a profile at https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=264842898&revision=e8a0c852669e23b6c4de692045d245e6b21f92bc (for example, here is a profile on ts_paint filtering on "menupopup": https://perfht.ml/2LZOm6a, though I don't see anything obvious there but ni? just in case). I think this is at least a similar patch that was causing the regressions at https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e64386f86541ea0c61b27b6b41d345ea80493e85&newProject=try&newRevision=b61ff592c199ecebaceaa772af45f0d7697d8508&framework=1&showOnlyConfident=1

Flags: needinfo?(emilio)

This also caused the "failure" (unexpected pass) in bug 1581718 comment 12, for which bug 1581718 was incorrectly backed out.

(In reply to David Baron :dbaron: 🇯🇵 ⌚UTC+9 from comment #36)

This also caused the "failure" (unexpected pass) in bug 1581718 comment 12, for which bug 1581718 was incorrectly backed out.

Thanks for the heads up. Based on https://searchfox.org/mozilla-central/rev/1d0e84d2ec2843924bc6e5ed6658f93439e12351/layout/generic/crashtests/crashtests.list#40 it's not surprising that this test would have been fixed by this change. It's not clear to me what the asserts(1-18) annotation should be changed to - according to https://bugzilla.mozilla.org/show_bug.cgi?id=1581718#c13 it sounds like there are still some. Alex, worth digging into the unexpected passes to see what we should set it at.

Dug into this with Alex, and turns out we are failing to attach the shadow root due to the XRE_IsParentProcess() check at https://searchfox.org/mozilla-central/rev/1d0e84d2ec2843924bc6e5ed6658f93439e12351/dom/base/Element.cpp#1112. WPT runs in the content process, so even though we would have passed the permission check we failed that one.

I think that condition can be safely removed - it's an optimization but will only be relevant if the page tries attachShadow on a non-html element anyway.

Status: NEW → ASSIGNED
Pushed by asurkov@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6a5144ccd09d
allow shadow DOM in content process for XUL elements to make possible running chrome UI in content process needed for wpt tests r=emilio
Pushed by asurkov@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0257933d9e31
Remove menupopup binding, r=bgrins
You need to log in before you can comment on or make changes to this bug.