screen-sharing a window crashes browser in Wayland
Categories
(Core :: WebRTC, defect, P2)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | unaffected |
firefox104 | --- | unaffected |
firefox105 | --- | unaffected |
firefox106 | + | wontfix |
firefox107 | --- | wontfix |
firefox108 | --- | wontfix |
firefox109 | --- | wontfix |
firefox110 | --- | wontfix |
firefox111 | --- | verified |
People
(Reporter: hlieberman, Assigned: stransky)
References
(Blocks 2 open bugs, Regression)
Details
(Keywords: nightly-community, regression)
Crash Data
Attachments
(18 files, 3 obsolete files)
Each attachment: 48 bytes, text/x-phabricator-request
Hello!
I have been seeing crashes on Nightly associated with screen sharing, especially when sharing a single window (as opposed to a screen). I have seen similar crashes when sharing entire screens, but they are less common, whereas sharing a single window crashes every time.
For testing, I have used https://www.webrtc-experiment.com/Pluginfree-Screen-Sharing/ to prompt for screensharing, selecting a window (Slack, if it matters). Bad versions crash within a second or two.
This is associated with crashes such as 1c943f6c-d3b1-4c99-be07-f78230220912; the signature is OOM | large | webrtc::BaseCapturerPipeWire::OnStreamProcess.
mozregression has traced this to https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4c76664026b55d57999e109b5bc5429d986df9ab&tochange=ce42afff6cfca3e1f9089c805e302ef7596141e9 -- skimming the commit list, I highly suspect this is something related to the libwebrtc vendoring/update.
Reporter | ||
Comment 1•2 years ago
|
||
This also appears to be crash efa6552a-80d5-4cf3-bd5a-b7df40220912, with a slightly different crash signature.
Comment 2•2 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 3•2 years ago
|
||
:mjf, since you are the author of the regressor, bug 1766646, could you take a look? Also, could you set the severity field?
For more information, please visit auto_nag documentation.
Comment 4•2 years ago
|
||
Set release status flags based on info from the regressing bug 1766646
Comment 5•2 years ago
|
||
The bug is marked as tracked for firefox106 (nightly). We have limited time to fix this, the soft freeze is in 2 days. However, the bug still isn't assigned.
:jimm, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Comment 6•2 years ago
•
|
||
I don't have a wayland machine available at the moment. Andreas, is this something you have time to look at?
Comment 7•2 years ago
|
||
Me neither. Karl, do you have Wayland readily available for a pernosco repro of this?
Reporter | ||
Comment 8•2 years ago
|
||
I have an rr dump of the crash, if that helps. I'd need some way to dump the ~352MB file to you, though.
Comment 9•2 years ago
|
||
Is that file the output of rr pack? Feel free to send me a link to say gdrive, dropbox or any other service by email and I'll let you know when you can take it down again.
Reporter | ||
Comment 10•2 years ago
|
||
I've been working with Andreas on this offline, as well as with the Pernosco team. Unfortunately, because of pipewire's use of shared memory, we're having trouble getting a valid rr capture.
I've opened a ticket with rr (https://github.com/rr-debugger/rr/issues/3376) to see if they have any suggestions for how we could get this, but I'm not sure we're going to end up with much luck.
In the meantime, the fact that there's a difference between sharing a single window and sharing a whole screen seems to be a curiosity. Is that handled entirely within the libwebrtc code, or is there a different path in FF that it takes?
Comment 11•2 years ago
|
||
Set release status flags based on info from the regressing bug 1766646
Comment 14•2 years ago
|
||
This is a reminder regarding comment #5!
The bug is marked as tracked for firefox106 (beta). We have limited time to fix this, the soft freeze is in 10 days. However, the bug still isn't assigned.
Comment 15•2 years ago
•
|
||
I tried mutter --nested and WAYLAND_DISPLAY=wayland-0 GDK_BACKEND=wayland DISPLAY= mozregression --launch 2022-10-01 --process-output stdout, but I'm getting NotFoundError, indicating that there are no windows or screens to share. pipewire.socket and pipewire.service exist in /usr/lib/systemd/user/. Is there more config required to use pipewire for sharing?
Comment 16•2 years ago
|
||
This is a reminder regarding comment #5!
The bug is marked as tracked for firefox106 (beta). We have limited time to fix this, the soft freeze is in 9 days. However, the bug still isn't assigned.
Comment 17•2 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #15)
I tried mutter --nested and WAYLAND_DISPLAY=wayland-0 GDK_BACKEND=wayland DISPLAY= mozregression --launch 2022-10-01 --process-output stdout, but I'm getting NotFoundError, indicating that there are no windows or screens to share. pipewire.socket and pipewire.service exist in /usr/lib/systemd/user/. Is there more config required to use pipewire for sharing?
I received the following suggestion from a GNOME dev:
to run in nested, you need a nested dbus session, with mutter, xdg-desktop-portal, and xdg-desktop-portal-gnome running in that session
Comment 18•2 years ago
|
||
This is a reminder regarding comment #5!
The bug is marked as tracked for firefox106 (beta). We have limited time to fix this, the soft freeze is in 8 days. However, the bug still isn't assigned.
Assignee | ||
Comment 20•2 years ago
|
||
It's reproducible even on release (105.0). I see massive memory consumption while screen sharing. It's marked as orphan-nodes:
1,125.48 MB (100.0%) -- explicit
├────764.92 MB (67.96%) -- window-objects
│ ├──754.65 MB (67.05%) -- top(about:memory, id=57)/active/window(about:memory)
│ │ ├──731.70 MB (65.01%) -- dom
│ │ │ ├──731.66 MB (65.01%) ── orphan-nodes
│ │ │ └────0.03 MB (00.00%) ++ (3 tiny)
│ │ ├───15.62 MB (01.39%) -- js-realm([System Principal], about:memory)
│ │ │ ├──15.08 MB (01.34%) -- classes
│ │ │ │ ├──11.30 MB (01.00%) ++ class(Object)/objects
│ │ │ │ └───3.78 MB (00.34%) ++ (4 tiny)
│ │ │ └───0.54 MB (00.05%) ++ (5 tiny)
│ │ └────7.33 MB (00.65%) ++ (2 tiny)
│ └───10.26 MB (00.91%) ++ (4 tiny)
Assignee | ||
Comment 21•2 years ago
|
||
btw. I don't see that on Fedora which has backported pipewire patches from WebRTC project.
Assignee | ||
Comment 22•2 years ago
|
||
When testing https://www.webrtc-experiment.com/Pluginfree-Screen-Sharing/ the memory consumption looks related to JS code bundled on the page. When running with blockers the memory is stable; when running with a plain profile the allocated memory grows rapidly.
Tested on all versions (105-107). Doesn't seem to be related to WebRTC directly.
Assignee | ||
Comment 23•2 years ago
|
||
Testing on https://mozilla.github.io/webrtc-landing/gum_test.html and I see only JS/GC memory operations, nothing related to pipewire/webrtc.
Assignee | ||
Comment 24•2 years ago
|
||
Hm, maybe we get a malformed PW buffer in BaseCapturerPipeWire::OnStreamProcess() / BaseCapturerPipeWire::HandleBuffer(), so we try to allocate too large a buffer. We could add a sanity check there to restrict the max buffer size, or at least assert/warn.
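A minimal sketch of the kind of sanity check suggested here, written in Python for brevity rather than in the C++ of the capturer. The function name, the 4 bytes-per-pixel figure, and the 16384-pixel limit are hypothetical and not taken from the Firefox or libwebrtc sources; the point is only that implausible dimensions are rejected before any buffer is allocated.

```python
# Hypothetical limits for illustration; not values from libwebrtc or Firefox.
MAX_DIMENSION = 16384   # assumed upper bound for width and height, in pixels
BYTES_PER_PIXEL = 4     # assumed 32-bit pixel format


def checked_frame_size(width: int, height: int) -> int:
    """Return the buffer size in bytes for a frame, or raise ValueError
    when the dimensions look malformed."""
    if width <= 0 or height <= 0:
        raise ValueError(f"non-positive frame size {width}x{height}")
    if width > MAX_DIMENSION or height > MAX_DIMENSION:
        raise ValueError(f"implausible frame size {width}x{height}")
    return width * height * BYTES_PER_PIXEL
```

A caller would skip the frame (and log a warning) on ValueError instead of attempting the allocation.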
Comment 25•2 years ago
|
||
AFAICS this current_frame_ allocation is the only one we do in HandleBuffer. Looking at its width and height inputs there are some guards against malformed metadata in the spaBuffer, so then the malformed dimensions must come from desktop_size_?
But even if those were malformed and set, we'd still be guarded a bit because frame metadata can constrain the dimensions. So are the metadata not there, or also malformed?
Should this perhaps be a fallible allocation? Also, things have changed a bit upstream but the allocation is still infallible.
Or is the OOM happening elsewhere? Crash-stats is showing a weird top frame for these reports.
Comment 26•2 years ago
|
||
Hi, I will look into what's going on. I wrote the WebRTC code so hopefully I will be able to identify the issue. Maybe I missed some important change that should have been backported. Also, if there are similar issues in the future related to WebRTC and screen sharing, please let me know so I can help you as soon as possible.
Comment 27•2 years ago
|
||
I'm not able to reproduce this issue on Fedora 37 (KDE) using latest git snapshot of Firefox. I will try to test with GNOME.
Comment 28•2 years ago
|
||
I've found out that while you rebased to latest WebRTC, you don't use the latest screen sharing code, instead you still keep the old code under a different name "moz_base_capturer_pipewire.cc" so I'm now wondering what was the point of rebasing here and me asking you to backport WebRTC related patches if they are not going to be used. The old code you use is way behind, slow and as you can see, it is crashing.
Comment 29•2 years ago
|
||
(In reply to grulja from comment #28)
I've found out that while you rebased to latest WebRTC, you don't use the latest screen sharing code, instead you still keep the old code under a different name "moz_base_capturer_pipewire.cc" so I'm now wondering what was the point of rebasing here and me asking you to backport WebRTC related patches if they are not going to be used. The old code you use is way behind, slow and as you can see, it is crashing.
See Bug 1777345 for more details. The short version is we need to update our linux sysroot, but are waiting until we can drop Ubuntu 18.04 support.
Comment 30•2 years ago
|
||
(In reply to Michael Froman [:mjf] from comment #29)
(In reply to grulja from comment #28)
I've found out that while you rebased to latest WebRTC, you don't use the latest screen sharing code, instead you still keep the old code under a different name "moz_base_capturer_pipewire.cc" so I'm now wondering what was the point of rebasing here and me asking you to backport WebRTC related patches if they are not going to be used. The old code you use is way behind, slow and as you can see, it is crashing.
See Bug 1777345 for more details. The short version is we need to update our linux sysroot, but are waiting until we can drop Ubuntu 18.04 support.
That shouldn't be a problem, you can include libdrm the same way you had to include PipeWire and dlopen it on runtime, that's what Chromium does and it works on Ubuntu 18.04. I included libdrm in https://phabricator.services.mozilla.com/D153354 when I wanted to bring it on par with Chromium.
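The idea of loading a library at runtime instead of linking against it can be sketched as follows. This is a Python illustration using ctypes, not the C++ dlopen wrapper that Firefox or Chromium use; the helper name try_load is hypothetical, and "libdrm.so.2" in the usage note is assumed to be the soname on the target system.

```python
import ctypes
from typing import Optional


def try_load(soname: str) -> Optional[ctypes.CDLL]:
    """Try to open a shared library at runtime.

    Returns the library handle, or None when the library is not
    installed, so the caller can fall back to a path that does not
    need it.
    """
    try:
        return ctypes.CDLL(soname)
    except OSError:
        return None
```

For example, `drm = try_load("libdrm.so.2")` yields None on a system without libdrm, and the caller can then disable the code path that depends on it rather than failing at startup.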
Comment 31•2 years ago
|
||
Possibly related bug report on Archlinux: https://bugs.archlinux.org/task/76231
Comment 32•2 years ago
|
||
The crash happens because KDE and GNOME use different approaches to share a window. While KDE (KWin) makes the stream size equal to the shared window, GNOME (Mutter) makes the stream size equal to the screen size and uses video_crop metadata. This was already fixed a long time ago in WebRTC, so in order to fix this crash you really need to use the latest code. There is no point fixing this old code.
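The difference between the two compositors can be illustrated with a small sketch. This is not the upstream WebRTC implementation; the Rect type, the function name, and the policy of falling back to the full stream when the crop is invalid are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def visible_region(stream_w: int, stream_h: int,
                   crop: Optional[Rect]) -> Rect:
    """Pick the part of the stream buffer that holds the shared window.

    KWin-style: no crop metadata, the stream is already window-sized.
    Mutter-style: the stream is screen-sized and the crop rectangle
    marks the window inside it.
    """
    full = Rect(0, 0, stream_w, stream_h)
    if crop is None:
        return full
    fits = (crop.width > 0 and crop.height > 0
            and crop.x >= 0 and crop.y >= 0
            and crop.x + crop.width <= stream_w
            and crop.y + crop.height <= stream_h)
    # Assumed policy: ignore a crop rectangle that does not fit the stream.
    return crop if fits else full
```

A capturer that sizes its frame from the window alone while copying a screen-sized stream, or the reverse, ends up with mismatched dimensions, which is the kind of mismatch this comment describes.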
Comment 33•2 years ago
|
||
If you can point me to a source code patch for Firefox 106.0 that fixes/updates the WebRTC code I'd be happy to compile Firefox by myself, test this on Archlinux x86_64 and give you feedback.
Comment 34•2 years ago
|
||
I unfortunately don't have any particular commit I can point you to, the code changed a lot and it was most likely fixed with something else. I'm going to work now on a patch for Fedora (again) to enable the new WebRTC code and I can at least provide you that as ArchLinux can also use it.
Comment 35•2 years ago
|
||
Thanks a lot for your work :)
Comment 36•2 years ago
|
||
This is what I proposed for Fedora: https://src.fedoraproject.org/fork/jgrulich/rpms/firefox/blob/ff106-screencast/f/libwebrtc-screen-cast-sync.patch. It should hopefully work for you as well.
Comment 37•2 years ago
|
||
(In reply to grulja from comment #36)
This is what I proposed for Fedora: https://src.fedoraproject.org/fork/jgrulich/rpms/firefox/blob/ff106-screencast/f/libwebrtc-screen-cast-sync.patch. It should hopefully work for you as well.
Thanks a lot, this solves the issue for me in both Firefox 106.0 and 106.0.1 :)
Comment 40•2 years ago
|
||
(In reply to grulja from comment #36)
This is what I proposed for Fedora: https://src.fedoraproject.org/fork/jgrulich/rpms/firefox/blob/ff106-screencast/f/libwebrtc-screen-cast-sync.patch. It should hopefully work for you as well.
As of bug 1790097 this patch no longer applies to mozilla-central.
Comment 43•1 year ago
|
||
Copying crash signatures from duplicate bugs.
Comment 44•1 year ago
|
||
Hi Martin, (all,)
I was wondering what the status of this is (since it also affects the Firefox snap) and if there are any further updates here?
Comment 45•1 year ago
|
||
Comment 46•1 year ago
|
||
Depends on D163975
Comment 47•1 year ago
|
||
Depends on D163976
Comment 48•1 year ago
|
||
Depends on D163977
Comment 49•1 year ago
|
||
Depends on D163978
Comment 50•1 year ago
|
||
Depends on D163976
Comment 51•1 year ago
|
||
Depends on D163978
Comment 52•1 year ago
|
||
Depends on D164034
Comment 53•1 year ago
|
||
Depends on D163979
Comment 54•1 year ago
|
||
Depends on D164414
Comment 55•1 year ago
|
||
Depends on D164415
Comment 56•1 year ago
|
||
Depends on D164416
Comment 57•1 year ago
|
||
Depends on D164417
Comment 58•1 year ago
|
||
Depends on D164418
Comment 59•1 year ago
|
||
I have decomposed the patch that grulja posted, using updatebot to control the headers we are vendoring. It looks like there is now a dependency on epoxy that I will also need to add to updatebot.
Comment 61•1 year ago
|
||
Got a variety of other pipewire-memfd crashes that might also be this bug: https://crash-stats.mozilla.org/search/?signature=memfd%3Apipewire-memfd%20%28deleted%29&product=Firefox
Comment 62•1 year ago
|
||
Bug 1800919 is making it very difficult to keep the set of signatures fresh for this.
Comment 63•1 year ago
|
||
[@ webrtc::BaseCapturerPipeWire::BaseCapturerPipeWire ] looks different, but these reports have webrtc::BaseCapturerPipeWire::OnStreamProcess in the stack. Not sure if the stacks can be believed.
Comment 64•1 year ago
|
||
Hi, I'm having the same problem (immediate crash when sharing a single window, no problem sharing a full monitor). This happened on Ubuntu 22.10 and now on the development version of 23.04.
Here is one of the crash reports: https://crash-stats.mozilla.org/report/index/ded6f61a-b18f-4623-9587-3a66a0230111
Comment 66•1 year ago
|
||
It's pretty disappointing to see that this is still an issue given that Jan Grulich pointed out exactly where the problem was and explained the solution along with a patch. This is also in the fedora RPM, the code for which, as far as I know, isn't secret.
This should be low hanging fruit for a fix in the next firefox release. Can someone please fix this?
Comment 67•1 year ago
|
||
Depends on D164419
Comment 68•1 year ago
|
||
Depends on D166820
Comment 69•1 year ago
|
||
Depends on D166821
Comment 70•1 year ago
|
||
Depends on D166822
Comment 71•1 year ago
|
||
Depends on D166823
Comment 72•1 year ago
|
||
Depends on D166824
Comment 73•1 year ago
|
||
(In reply to Max Ehrlich from comment #66)
It's pretty disappointing to see that this is still an issue given that Jan Grulich pointed out exactly where the problem was and explained the solution along with a patch. This is also in the fedora RPM, the code for which, as far as I know, isn't secret.
This should be low hanging fruit for a fix in the next firefox release. Can someone please fix this?
It was not as low hanging as it first seemed, but we will be landing this shortly.
Comment 74•1 year ago
|
||
That's excellent news, thanks a lot for the fast turnaround
Comment 75•1 year ago
|
||
Depends on D166821
Comment 76•1 year ago
|
||
This is now queued for landing in autoland. There are several outstanding issues which need to be addressed which I have filed as dependent bugs.
Comment 77•1 year ago
|
||
Pushed by na-g@nostrum.com:
https://hg.mozilla.org/integration/autoland/rev/5e47822767fe P0 - Add libdrm to update bot;r=mjf
https://hg.mozilla.org/integration/autoland/rev/ff38234db176 P1 - Add mozdrm;r=mjf
https://hg.mozilla.org/integration/autoland/rev/d33ee32d8118 P2 - add drm moz.build;r=mjf
https://hg.mozilla.org/integration/autoland/rev/f6231fc391b9 P3 - add drm to third_party moz.build;r=mjf
https://hg.mozilla.org/integration/autoland/rev/2f20209c3db6 P4 - add libdrm to webrtc moz.build;r=mjf
https://hg.mozilla.org/integration/autoland/rev/b34bbd275dcf P5 - vendor drm headers;r=mjf
https://hg.mozilla.org/integration/autoland/rev/365e1156423f P6 - add new signatures to mozpipewire.cpp;r=mjf
https://hg.mozilla.org/integration/autoland/rev/c2068308592b P7 - add gbm to updatebot;r=mjf
https://hg.mozilla.org/integration/autoland/rev/cd896ba4fef0 P8 - vendor gbm header;r=mjf
https://hg.mozilla.org/integration/autoland/rev/f0ca8b6c1c00 P9 - add gbm to third_party moz.build;r=mjf
https://hg.mozilla.org/integration/autoland/rev/0d5e7b053f2a P10 - add mozgbm;r=mjf
https://hg.mozilla.org/integration/autoland/rev/40f23d2436c7 P11 - add libgbm to webrtc moz.build;r=mjf
https://hg.mozilla.org/integration/autoland/rev/d76734b61172 P12 - add screen capture code changes;r=mjf
https://hg.mozilla.org/integration/autoland/rev/6dab1ad1a199 P13 - add libepoxy to updatebot;r=mjf
https://hg.mozilla.org/integration/autoland/rev/90e0f07c7326 P14 - vendor libepoxy;r=mjf
https://hg.mozilla.org/integration/autoland/rev/2fc9e198bb27 P15 - add libepoxy to webrtc moz.build;r=mjf
https://hg.mozilla.org/integration/autoland/rev/314eb6687fd0 P16 - condensed BUILD.gn changes;r=mjf
https://hg.mozilla.org/integration/autoland/rev/a9ca4976b1fd P17 - regenerate moz.build;r=mjf
Comment 78•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5e47822767fe
https://hg.mozilla.org/mozilla-central/rev/ff38234db176
https://hg.mozilla.org/mozilla-central/rev/d33ee32d8118
https://hg.mozilla.org/mozilla-central/rev/f6231fc391b9
https://hg.mozilla.org/mozilla-central/rev/2f20209c3db6
https://hg.mozilla.org/mozilla-central/rev/b34bbd275dcf
https://hg.mozilla.org/mozilla-central/rev/365e1156423f
https://hg.mozilla.org/mozilla-central/rev/c2068308592b
https://hg.mozilla.org/mozilla-central/rev/cd896ba4fef0
https://hg.mozilla.org/mozilla-central/rev/f0ca8b6c1c00
https://hg.mozilla.org/mozilla-central/rev/0d5e7b053f2a
https://hg.mozilla.org/mozilla-central/rev/40f23d2436c7
https://hg.mozilla.org/mozilla-central/rev/d76734b61172
https://hg.mozilla.org/mozilla-central/rev/6dab1ad1a199
https://hg.mozilla.org/mozilla-central/rev/90e0f07c7326
https://hg.mozilla.org/mozilla-central/rev/2fc9e198bb27
https://hg.mozilla.org/mozilla-central/rev/314eb6687fd0
https://hg.mozilla.org/mozilla-central/rev/a9ca4976b1fd
Comment 79•1 year ago
|
||
IMO this is too large a change, with too much risk of regressions, to uplift to 110 beta; let's have it ship on the 111 train.
Comment 80•1 year ago
|
||
I have reproduced this issue using Firefox 106.0a1 (2022.09.12) on Ubuntu 22. I started Firefox with MOZ_ENABLE_WAYLAND=1 from a terminal, opened the WebRTC link, clicked on screen sharing, and selected a single window to share. Firefox crashed within a second.
I can confirm this issue is fixed. I verified using Firefox 111.0 on Ubuntu 22, and Firefox no longer crashes.