Closed Bug 1807660 Opened 2 years ago Closed 2 years ago

KDE Wayland: Crash on receiving a Gmail notifier notification since 108

Categories

(Core :: Widget: Gtk, defect, P3)

Firefox 108
x86_64
Linux
defect

RESOLVED WORKSFORME

People

(Reporter: hasezoey, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0

Steps to reproduce:

  1. start firefox
  2. verify gmail notifier is active
  3. open a youtube video
  4. play the video and fullscreen it
  5. receive an email (with notification)
  6. observe the video frame stopping but audio still playing
  7. observe any keypress crashing firefox with "Lost connection to Wayland compositor.", or wait for the youtube buffer to run out and observe that the video continues to play

Actual results:

Video pausing until buffer regenerates OR video pausing and crashing on any keypress

Expected results:

no crash, no paused video

i have also noticed that the notifications sometimes now just appear in the middle of the screen and have firefox popup window decoration, instead of appearing in a corner with no window decoration

will try to get a profiler profile while it occurs


KDE About System:
Operating System: Manjaro Linux
KDE Plasma Version: 5.26.4
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7
Kernel Version: 6.1.1-1-MANJARO (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 5800X 8-Core Processor
Memory: 15.5 GiB of RAM
Graphics Processor: AMD Radeon RX Vega
Manufacturer: ASUS


relevant journald section from a crash (triggered by a keypress instead of waiting for the buffer to regenerate):

Dec 21 16:35:07 Meicoo-Manjaro plasmashell[3109]: [2022-12-21T15:35:07Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:07 Meicoo-Manjaro plasmashell[3109]: [2022-12-21T15:35:07Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:07 Meicoo-Manjaro plasmashell[3109]: [2022-12-21T15:35:07Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:07 Meicoo-Manjaro plasmashell[3109]: [2022-12-21T15:35:07Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:08 Meicoo-Manjaro rtkit-daemon[1571]: Supervising 11 threads of 7 processes of 1 users.
Dec 21 16:35:08 Meicoo-Manjaro rtkit-daemon[1571]: Supervising 11 threads of 7 processes of 1 users.
Dec 21 16:35:23 Meicoo-Manjaro plasmashell[86572]: [2022-12-21T15:35:23Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:23 Meicoo-Manjaro plasmashell[86572]: [2022-12-21T15:35:23Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:23 Meicoo-Manjaro plasmashell[86572]: [2022-12-21T15:35:23Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:35:23 Meicoo-Manjaro rtkit-daemon[1571]: Supervising 11 threads of 7 processes of 1 users.
Dec 21 16:35:23 Meicoo-Manjaro rtkit-daemon[1571]: Supervising 11 threads of 7 processes of 1 users.
Dec 21 16:35:24 Meicoo-Manjaro plasmashell[86572]: [2022-12-21T15:35:24Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: Could not find the Plasmoid for Plasma::FrameSvgItem(0x55f9526256>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: Could not find the Plasmoid for Plasma::FrameSvgItem(0x55f9526256>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/c>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/c>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/c>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: file:///usr/lib/qt/qml/org/kde/plasma/components.3/ScrollView.qml>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: file:///usr/lib/qt/qml/org/kde/plasma/components.3/ScrollView.qml>
Dec 21 16:39:08 Meicoo-Manjaro plasmashell[1525]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/c>
Dec 21 16:39:16 Meicoo-Manjaro kwin_wayland[1367]: This plugin does not support raise()
Dec 21 16:39:27 Meicoo-Manjaro kwin_wayland_wrapper[1367]: error in client communication (pid 2892)
Dec 21 16:39:27 Meicoo-Manjaro firefox[2892]: Lost connection to Wayland compositor.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[87676]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[87847]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[86572]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[86575]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3397]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[25139]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3227]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3135]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3141]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3132]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3196]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3046]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[24131]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3109]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3148]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[3157]: Exiting due to channel error.
Dec 21 16:39:27 Meicoo-Manjaro plasmashell[24637]: Exiting due to channel error.
Slight update to the reproduction:

  • does not require fullscreen window
  • video just needs to be loaded, not even playing

the second one caused my firefox to crash 3 times in a row shortly after starting, because the notifier tried to display the same email again (it couldn't mark it as displayed, because it never finished displaying)

the journalctl log for the 3 crashes in a row has been added as an attachment

sorry for the duplicated message, didn't know it would post the message along with the attachment


recorded a profile with the "Firefox" profiler preset, without having it crash (no keypress):
https://share.firefox.dev/3GkIsdc

The Bugbug bot thinks this bug should belong to the 'Core::Audio/Video: Playback' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core

The 'ERROR mp4parse] Found 2 nul bytes in "\0\0"' errors are passive.

This output seems to point to the core issue:
Dec 21 16:39:16 Meicoo-Manjaro kwin_wayland[1367]: This plugin does not support raise()
Dec 21 16:39:27 Meicoo-Manjaro kwin_wayland_wrapper[1367]: error in client communication (pid 2892)
Dec 21 16:39:27 Meicoo-Manjaro firefox[2892]: Lost connection to Wayland compositor.
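For anyone triaging a similar journal dump, the three lines quoted above can be isolated with a small filter. This is only a sketch: the function names are hypothetical and the patterns are copied from the messages in this report, so other compositors or KWin versions may word them differently.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, patterns come from the
# journal lines quoted in this bug.

# Print only the lines that mention KWin or the Firefox-side disconnect.
# Reads the given files, or stdin when no file is given.
wayland_disconnect_lines() {
    grep -E 'kwin_wayland|Lost connection to Wayland compositor' "$@"
}

# Print the client pid(s) KWin complained about, one per line, so they
# can be compared with the pid in the firefox[...] journal tag.
kwin_client_pids() {
    sed -n 's/.*error in client communication (pid \([0-9][0-9]*\)).*/\1/p' "$@"
}
```

Possible usage: `journalctl -b | wayland_disconnect_lines`, then `journalctl -b | kwin_client_pids` to confirm that the pid KWin reports (2892 in the log above) is the Firefox process that lost its connection.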

Component: Audio/Video: Playback → Widget: Gtk

issue persists in firefox 109. now that i am on 109, do you need new profiler output or anything else? this issue prevents full use of the browser unless the extension or the notifications are disabled (the latter is only possible for the current browser session), which i would rather not do

also from my experience since opening this issue, the issue happens even when no video is currently playing or loaded (but maybe had been loaded in the past of the current session)

in case it matters: when having multiple windows of firefox open and the issue appears, it locks up all browser windows. any keypress in that time while a browser window is focused may crash all browser windows.

Thanks for the report! Please try to find a regression range, you will get a pushlog url at the end:
$ pip3 install mozregression
$ ~/.local/bin/mozregression --good 107 --bad 108 -a https://mail.google.com -a https://www.youtube.com/watch?v=LXb3EKWsInQ

If 108 from above command has been fine, the problem might have been backported from Nightly 109 to Beta 108, then try this:
$ ~/.local/bin/mozregression --good 107 --bad 109 -a https://mail.google.com -a https://www.youtube.com/watch?v=LXb3EKWsInQ

Keywords: crash, regression
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Summary: Crash on receiving a Gmail notifier notification since 108 → KDE Wayland: Crash on receiving a Gmail notifier notification since 108

> Please try to find a regression range, you will get a pushlog url at the end:

sadly, the last time i tried to use mozregression it wouldn't work.

this time for the GUI variant:

platform: Linux-6.1.1-1-MANJARO-x86_64-with-glibc2.36
python: 3.9.16 FROZEN (64bit)
mozregression: 5.3.0rc1.dev2
message: AttributeError: 'BisectionWizard' object has no attribute 'Accepted'
traceback:   File "mozregui/mainwindow.py", line 106, in start_bisection_wizard
  File "mozregui/mainwindow.py", line 95, in _start_runner

as for the cli variant:

$ ~/.local/bin/mozregression --good 107 --bad 108 -a https://mail.google.com -a https://www.youtube.com/watch\?v\=LXb3EKWsInQ
 0:01.68 INFO: Using date 2022-11-14 for release 108
 0:02.73 INFO: Using date 2022-10-17 for release 107
 0:04.88 INFO: Testing good and bad builds to ensure that they are really good and bad...
 0:04.88 INFO: Downloading build from: https://archive.mozilla.org/pub/firefox/nightly/2022/10/2022-10-17-21-36-58-mozilla-central/firefox-108.0a1.en-US.linux-x86_64.tar.bz2
===== Downloaded 100% =====
 0:13.46 INFO: Running mozilla-central build for 2022-10-17
 0:19.69 INFO: Launching /tmp/tmp9bsp8wh_/firefox/firefox
 0:19.69 INFO: Application command: /tmp/tmp9bsp8wh_/firefox/firefox https://mail.google.com https://www.youtube.com/watch?v=LXb3EKWsInQ -profile /tmp/tmpno47jia1.mozrunner
 0:19.69 INFO: application_buildid: 20221017213658
 0:19.69 INFO: application_changeset: ac1330b68d3e7b231a177cfa1ac52e1b2199bb84
 0:19.69 INFO: application_name: Firefox
 0:19.69 INFO: application_repository: https://hg.mozilla.org/mozilla-central
 0:19.69 INFO: application_version: 108.0a1
Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry' or 'exit' and press Enter): 
 3:17.18 WARNING: Process exited with code 1
bad
 3:19.98 ERROR: Build was expected to be good! The initial good/bad range seems incorrect.

Is 109 good as well?

no, did not even try the second command yet because the regressions from my earlier report happened between 107 and 108

also as you can see, the script errored:

0:19.69 INFO: application_version: 108.0a1
Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry' or 'exit' and press Enter):
3:17.18 WARNING: Process exited with code 1
bad
3:19.98 ERROR: Build was expected to be good! The initial good/bad range seems incorrect.

update: the second command also tries to use version 108.0a1 as the first guess, which is a bad version, and so will fail

also, is there some way to speed this regression checking up? because every time a new version is started, i have to log in to gmail (with all 2fa) and also install the extension (which this issue is about) instead of just re-using the previous build's profile (or is this just a new profile at the start of each bisection? i have not been able to get further than the first version)

i have also now tried some other versions which are all bad: 107.0a1 and 106.0a1 and 100.0a1. which leads me to believe that something in kde changed (or maybe the extension?) rather than firefox itself - any pointers on how to debug this?
Note: i am not sure what exactly had changed or at what point, i just know it was at some point between firefox 107 and firefox 108
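On the question of re-using a profile between builds: mozregression has `--profile` and `--profile-persistence` options that look suited to this. The sketch below assumes a profile directory that already contains the Gmail login and the notifier extension. The function name and the `MOZREGRESSION` override are hypothetical, and the exact option behaviour should be checked against `mozregression --help` for the installed version.

```shell
#!/usr/bin/env bash
# Sketch only: bisect_with_profile and the MOZREGRESSION variable are
# hypothetical names. --profile and --profile-persistence are
# mozregression options; verify them with `mozregression --help`.

# Usage: bisect_with_profile PROFILE_DIR GOOD BAD
# "clone-first" is meant to copy PROFILE_DIR once and reuse that copy
# for every build, so the login and extension only need setting up once.
bisect_with_profile() {
    local profile_dir=$1 good=$2 bad=$3
    "${MOZREGRESSION:-mozregression}" \
        --good "$good" --bad "$bad" \
        --profile "$profile_dir" \
        --profile-persistence clone-first \
        -a https://mail.google.com
}
```

A call would look like `bisect_with_profile ~/bisect-profile 107 108`, where `~/bisect-profile` is a placeholder path. One caveat: older builds may refuse or mishandle a profile last used by a newer build, so preparing the profile with the oldest build in the range is probably the safer order.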

(Sorry, I didn't look closely enough and thought it was vice versa.)

Does the problem still occur with latest Nightly?
$ ~/.local/bin/mozregression --launch 2023-01-17 -a https://mail.google.com -a https://www.youtube.com/watch?v=LXb3EKWsInQ

This could be a duplicate of bug 1743144 comment 14.

tested version 111.0a1 and the problem still persists (same as the other versions)

Please run latest nightly on terminal with WAYLAND_DEBUG=1 MOZ_LOG="Widget:5 WidgetPopup:5 WidgetWayland:5" env variables and attach the log here (if it crashes).
Thanks.
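A minimal sketch of one way to capture that output in a file. It assumes both libwayland's `WAYLAND_DEBUG` trace and the `MOZ_LOG` output end up on the process's stdout or stderr (as far as I know, both go to stderr by default). The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: run_with_wayland_logging is a hypothetical helper name.

# Usage: run_with_wayland_logging LOGFILE COMMAND [ARGS...]
# Runs COMMAND with the requested environment variables set and writes
# stdout and stderr to LOGFILE. Returns COMMAND's exit status, which is
# expected to be non-zero when Firefox crashes.
run_with_wayland_logging() {
    local logfile=$1
    shift
    WAYLAND_DEBUG=1 \
    MOZ_LOG="Widget:5 WidgetPopup:5 WidgetWayland:5" \
        "$@" >"$logfile" 2>&1
}
```

For example, `run_with_wayland_logging firefox_log ./firefox`, where `./firefox` stands for wherever the Nightly binary was unpacked.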

Blocks: wayland
Flags: needinfo?(hasezoey)
Priority: -- → P3

command:
WAYLAND_DEBUG=1 MOZ_LOG="Widget:5 WidgetPopup:5 WidgetWayland:5" ~/.local/bin/mozregression --launch 2023-01-17 -a https://mail.google.com -a https://www.youtube.com/watch\?v\=LXb3EKWsInQ

log:

 0:01.56 INFO: Using local file: /home/hasezoey/.mozilla/mozregression/persist/2023-01-17--mozilla-central--firefox-111.0a1.en-US.linux-x86_64.tar.bz2
 0:01.56 INFO: Running mozilla-central build for 2023-01-17
 0:08.11 INFO: Launching /tmp/tmp1kr27svm/firefox/firefox
 0:08.11 INFO: Application command: /tmp/tmp1kr27svm/firefox/firefox https://mail.google.com https://www.youtube.com/watch?v=LXb3EKWsInQ -profile /tmp/tmp0b6c_osc.mozrunner
 0:08.12 INFO: application_buildid: 20230117161302
 0:08.12 INFO: application_changeset: 455aa95a34de5e712128d0dfac95366c60d96299
 0:08.12 INFO: application_name: Firefox
 0:08.12 INFO: application_repository: https://hg.mozilla.org/mozilla-central
 0:08.12 INFO: application_version: 111.0a1

 2:23.90 WARNING: Process exited with code 1

or do you mean a different log? if yes, where do i find it?

btw, i looked into about:support but could not find a mention of WAYLAND_DEBUG only MOZ_LOG

Flags: needinfo?(hasezoey)

Use https://nightly.mozilla.org or add -P stdout to the mozregression command.

is it safe to upload the complete log? because it may involve inputting my password for the google account
just asking to be safe

added requested firefox log for 111.0a1, with command:
WAYLAND_DEBUG=1 MOZ_LOG="Widget:5 WidgetPopup:5 WidgetWayland:5" ~/.local/bin/mozregression --launch 2023-01-17 -a https://mail.google.com -a https://www.youtube.com/watch\?v\=LXb3EKWsInQ -P stdout >> firefox_log

Note: this file has been edited; the part where i logged into gmail is cut out, just to be safe
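As an alternative to cutting a section out by hand, key events can be filtered from a `WAYLAND_DEBUG` trace before uploading. This is a sketch under the assumption that typed keys show up as `wl_keyboard@<id>.key(...)` events, following the same line format as the `wl_pointer@24.frame()` entry in this log. The function name is hypothetical, and the filter does not cover text-input protocols, URLs or anything else that might be sensitive.

```shell
#!/usr/bin/env bash
# Sketch only: redact_key_events is a hypothetical helper name.

# Drop every line that records a wl_keyboard key event; all other lines
# (pointer events, surface commits, Gecko log lines) pass through.
# Reads the given files, or stdin when no file is given.
redact_key_events() {
    grep -v -E 'wl_keyboard@[0-9]+\.key\(' "$@"
}
```

Possible usage: `redact_key_events firefox_log > firefox_log.redacted`, followed by a manual skim of the result before attaching it.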

Flags: needinfo?(stransky)
Attachment #9312915 - Attachment description: firefox_111.0a1.log → firefox_111.0a1.log.txt
Attachment #9312915 - Attachment mime type: application/octet-stream → text/plain

Thanks for the log. This part is important:

 1:20.53 INFO: b'[  20613.044] wl_pointer@24.frame()'
 1:20.53 INFO: b'Gdk-Message: 21:53:13.378: Lost connection to Wayland compositor.'
 1:20.54 INFO: b'Exiting due to channel error.'

but it doesn't make much sense to me. If there's a buggy Wayland client, we're supposed to get a clear error from the compositor (like a protocol error: wrong object, missing id, etc.). But we're getting 'Lost connection to Wayland compositor.' without any other notice.

It looks like the Wayland compositor itself crashed or closed the connection, but we have no idea why. Vlad, any idea how to get more info from KWin?

Thanks.
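To check whether anything resembling a protocol error precedes the disconnect, the lines just before the "Lost connection" message can be pulled out of the attached log. A small sketch with a hypothetical function name; it relies on `grep -B`, which GNU and BSD grep provide.

```shell
#!/usr/bin/env bash
# Sketch only: context_before_disconnect is a hypothetical helper name.

# Usage: context_before_disconnect N [FILE...]
# Print each "Lost connection to Wayland compositor" line together with
# the N lines that precede it. Reads stdin when no file is given.
context_before_disconnect() {
    local n=$1
    shift
    grep -B "$n" -F 'Lost connection to Wayland compositor' "$@"
}
```

For example, `context_before_disconnect 30 firefox_111.0a1.log.txt` shows the last 30 trace lines before the connection dropped. In the excerpt above, that context ends with an ordinary `wl_pointer@24.frame()` event rather than an error.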

Flags: needinfo?(stransky) → needinfo?(vlad.zahorodnii)

> It looks like wayland compositor itself crashed / closed connection but we don't have an idea why

i don't know how wayland or kde-plasma-wayland is structured, but when firefox crashes because of the error, no other windows or plasma itself are affected

also, does the profile i posted earlier maybe help (linked again for reference)? https://share.firefox.dev/3GkIsdc
should i maybe re-do the profiler profile for the latest nightly?

Does kwin print anything when firefox dies? Use journalctl --follow --user-unit plasma-kwin_wayland to follow kwin's log output

Flags: needinfo?(vlad.zahorodnii)

(In reply to Vlad Zahorodnii [:zzag] from comment #23)

> Does kwin print anything when firefox dies? Use journalctl --follow --user-unit plasma-kwin_wayland to follow kwin's log output

using firefox 111.0a1 (like the last runs in this thread) and using journalctl --follow --user-unit plasma-kwin_wayland to follow plasma, the output after starting the firefox instance until after crash is:

Jan 22 17:17:39 Meicoo-Manjaro kwin_wayland_wrapper[1400]: error in client communication (pid 5690)

it is the same message and error as in the journalctl log i had already provided initially (filename 3 crashes in a row journalctl)

Flags: needinfo?(vlad.zahorodnii)

Recently Manjaro got an update from KDE Plasma 5.26.4 / Frameworks 5.101.0 to Plasma 5.26.5 / Frameworks 5.102.0. i am now running firefox 109.0.1, and since then i have not seen the issue anymore.

Because the issue appears to be fixed on the KDE side, and was seemingly a KDE-side problem anyway, i will close this.

Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME
Flags: needinfo?(vlad.zahorodnii)