Closed
Bug 1456101
Opened 6 years ago
Closed 6 years ago
Intermittent Linux xserver hang with webrtc screen capture hangs user's desktop
Categories
(Core :: WebRTC, defect, P2)
Tracking
()
VERIFIED
FIXED
mozilla62
People
(Reporter: dfetis, Assigned: ng)
References
(Blocks 1 open bug, )
Details
(Keywords: crash)
Attachments
(3 files)
81.89 KB, text/plain
94.02 KB, text/plain
59 bytes, text/x-review-board-request | dminor: review+ | RyanVM: approval-mozilla-esr60+
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36

Steps to reproduce:
1. Open https://mozilla.github.io/webrtc-landing/gum_test.html with Firefox on Linux (tested with the latest Debian, Ubuntu and Mint).
2. Click the screen or window button.
3. Allow capture in the dialog box.
4. Reload the page and repeat from step 1 several times.

Actual results:
After 5 to 10 iterations, the desktop freezes and only the mouse pointer still responds. X server "Fatal IO error" messages appear in /var/log/syslog. For example, with the Linux Mint 18 Cinnamon distribution we got this:
Gdk-WARNING: t+477,72786s: cinnamon-session: Fatal IO error 11 (Ressource temporairement non disponible) on X server :0.
org.a11y.atspi.Registry[2821]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"

Expected results:
No desktop freeze and no error in syslog.
Updated•6 years ago
|
Severity: normal → critical
Has Regression Range: --- → irrelevant
Has STR: --- → yes
Component: Untriaged → WebRTC
Keywords: crash
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
Assignee | ||
Comment 1•6 years ago
|
||
I can reproduce this on a Linux Mint VM.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•6 years ago
|
Rank: 29
Priority: -- → P3
Comment 2•6 years ago
|
||
(In reply to Nico Grunbaum [:ng] from comment #1)
> I can reproduce this on a Linux Mint VM.
Can you please provide the crash report?
Assignee | ||
Comment 3•6 years ago
|
||
There is no crash report (it isn't crashing). I switched to nightly and I am no longer able to reproduce. I tried it on Ubuntu 17.10, and the latest Linux Mint 18 live CD. Damien, could you see if you can reproduce this issue with Nightly? It can be downloaded here: https://www.mozilla.org/en-US/firefox/channel/desktop/ (scroll down).
Flags: needinfo?(dfetis)
Reporter | ||
Comment 4•6 years ago
|
||
Hi Nico, I can reproduce it with the latest Nightly build (61.0a1 (2018-04-27) (64-bit)) on my Linux Mint 18.3. I captured the getUserMedia logs with NSPR_LOG_MODULES=MediaManager:4,GetUserMedia:4, writing them to a temporary file. This is what I see before the screen freezes:

[21089:MediaManager]: D/MediaManager ChooseCapability(kFitness) for mCapability (Allocate) --
[21089:MediaManager]: D/MediaManager Video device 0 allocated
[21089:Main Thread]: D/MediaManager GetUserMediaStreamRunnable::Run()
[21089:Main Thread]: D/MediaManager SourceListener 0x7fb4db230400 activating audio=(nil) video=0x7fb4e0536660
[21089:MediaManager]: D/MediaManager virtual nsresult mozilla::MediaEngineRemoteVideoSource::SetTrack(const RefPtr<const mozilla::AllocationHandle>&, const RefPtr<mozilla::SourceMediaStream>&, mozilla::TrackID, const PrincipalHandle&)
[21089:MediaManager]: D/MediaManager virtual nsresult mozilla::MediaEngineRemoteVideoSource::Start(const RefPtr<const mozilla::AllocationHandle>&)
[21089:MediaManager]: D/MediaManager started all sources
[21089:Main Thread]: D/MediaManager GetUserMediaStreamRunnable::Run: starting success callback following InitializeAsync()
[21089:Main Thread]: D/MediaManager Returning success for getUserMedia()

Then the screen freezes, but the log continues to be written and the Firefox process does not crash:

[21089:Main Thread]: D/MediaManager SourceListener 0x7fb4db230400 stopping video track 1
[21089:Main Thread]: D/MediaManager SourceListener 0x7fb4db230400 this was the last track stopped
[21089:Main Thread]: D/MediaManager SourceListener 0x7fb4db230400 stopping
[21089:MediaManager]: D/MediaManager virtual nsresult mozilla::MediaEngineRemoteVideoSource::Stop(const RefPtr<const mozilla::AllocationHandle>&)
[21089:Main Thread]: D/MediaManager SourceListener 0x7fb4db230400 StopSharing
[21089:Main Thread]: D/MediaManager SourceListener 0x7fb4db230400 stopping video track 1
[21089:MediaManager]: D/MediaManager virtual nsresult mozilla::MediaEngineRemoteVideoSource::Deallocate(const RefPtr<const mozilla::AllocationHandle>&)
[21089:MediaManager]: D/MediaManager Video device 0 deallocated
[21144:MediaManager]: D/MediaManager GetUserMediaTask::Run()
[21144:MediaManager]: D/MediaManager virtual nsresult mozilla::MediaEngineRemoteVideoSource::Allocate(const mozilla::dom::MediaTrackConstraints&, const mozilla::MediaEnginePrefs&, const nsString&, const mozilla::ipc::PrincipalInfo&, mozilla::AllocationHandle**, const char**)
[21144:MediaManager]: D/MediaManager ChooseCapability(kFitness) for mCapability (Allocate) ++
[21144:MediaManager]: D/MediaManager bool mozilla::MediaEngineRemoteVideoSource::ChooseCapability(const mozilla::NormalizedConstraints&, const mozilla::MediaEnginePrefs&, const nsString&, webrtc::CaptureCapability&, mozilla::DistanceCalculation)
[21144:MediaManager]: D/MediaManager ChooseCapability: prefs: 640x480 @30fps
[21144:MediaManager]: D/MediaManager Constraints: width: { min: -2147483647, max: 2147483647 }
[21144:MediaManager]: D/MediaManager height: { min: -2147483647, max: 2147483647 }
[21144:MediaManager]: D/MediaManager frameRate: { min: -inf, max: inf }
[21144:MediaManager]: D/MediaManager ChooseCapability(kFitness) for mCapability (Allocate) --

After killing Firefox from a virtual console (CTRL+ALT+F2), I can return to the GUI and everything runs normally. So the Firefox process is freezing the window manager rather than crashing it, but I have not looked deeper into what could cause this.
Flags: needinfo?(dfetis)
Assignee | ||
Comment 5•6 years ago
|
||
I was able to reproduce it in Nightly, though it took far more attempts (>50). I attached gdb and got a backtrace of the threads in the parent process. There may be a deadlock between thread 59 and thread 1. Thread 59 is in libxcb and thread 1 is in libX11. I am not an X11 expert, but I know that multithreaded access to libX11 can be tricky, and I am suspicious of the mixed use of the two libraries.
Updated•6 years ago
|
Blocks: Screensharing
Assignee | ||
Comment 6•6 years ago
|
||
Damien, if possible could you get it to hang again and run `sudo gdb --batch -ex "thread apply all bt" -p YOUR_FIREFOX_PARENT_PID | tee hang_backtrace.log` where YOUR_FIREFOX_PARENT_PID is the PID of the Firefox parent process? Then add that file as an attachment here. Thanks for taking the time to report and help diagnose this.
Flags: needinfo?(dfetis)
Reporter | ||
Comment 7•6 years ago
|
||
Nico, I ran your gdb command to capture the Firefox thread backtraces during the freeze and have attached the log to this bug.
Flags: needinfo?(dfetis)
Comment 8•6 years ago
|
||
[Tracking Requested - why for this release]: Intermittently hangs the Linux user's desktop. The workaround is to open a virtual console and terminate the Firefox parent process. Not a regression AFAIK.
Assignee: nobody → na-g
Rank: 29 → 13
status-firefox60:
--- → affected
status-firefox61:
--- → affected
status-firefox62:
--- → affected
status-firefox-esr52:
--- → affected
status-firefox-esr60:
--- → affected
status-thunderbird_esr52:
--- → affected
tracking-firefox61:
--- → ?
tracking-firefox62:
--- → ?
Priority: P3 → P2
Summary: Linux xserver crash with webrtc screen capture → Intermittent Linux xserver crash with webrtc screen capture
Version: Trunk → 50 Branch
Updated•6 years ago
|
Summary: Intermittent Linux xserver crash with webrtc screen capture → Intermittent Linux xserver crash with webrtc screen capture hangs user's desktop
Comment 9•6 years ago
|
||
I don't think we need to track this if it goes back all the way to Fx50, but we'd certainly consider backporting a low-risk fix should one be available.
Updated•6 years ago
|
Attachment #8973217 -
Attachment mime type: text/x-log → text/plain
Comment 10•6 years ago
|
||
main thread: gdk_x11_device_core_window_at_position() grabs the Xserver, which "disables processing of requests and close downs on all other connections".

video capture thread: From XOpenDisplay(), _XConnectXCB() holds _Xglobal_lock during the call to xcb_connect_to_display_with_auth_info(). _XConnectXCB() is poll()ing via read_setup() from xcb_connect_to_fd() for a response on the new connection to the X server. The server will not respond until the main thread releases the grab.

main thread: gdk_x11_device_core_window_at_position() triggers XSetErrorHandler(), which waits for the video thread to release _Xglobal_lock.
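
For readers who want to see the shape of this cycle in isolation, here is a minimal sketch in standard C++. It is an illustration only and calls no Xlib, XCB, or GDK code: `FakeXState`, `server_grab`, `xglobal_lock`, and `BothThreadsStall` are hypothetical names, and the capture thread's wait for a server reply is modeled, as an assumption, by a timed attempt to take the same mutex that stands for the grab. Timed lock attempts are used so the sketch reports the stall instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical stand-ins for the two resources described above.
// Nothing here calls Xlib, XCB, or GDK.
struct FakeXState {
  std::timed_mutex server_grab;   // models the X server grab held by the main thread
  std::timed_mutex xglobal_lock;  // models libX11's _Xglobal_lock
};

// Returns true when each thread times out waiting for what the other holds.
bool BothThreadsStall(std::chrono::milliseconds timeout) {
  assert(timeout.count() > 0);
  FakeXState x;
  std::promise<void> grab_taken, lock_taken, main_done, capture_done;
  std::shared_future<void> grab_taken_f = grab_taken.get_future().share();
  std::shared_future<void> lock_taken_f = lock_taken.get_future().share();
  std::shared_future<void> main_done_f = main_done.get_future().share();
  std::shared_future<void> capture_done_f = capture_done.get_future().share();

  // "Main thread": grabs the server, then needs the global lock.
  std::future<bool> main_thread = std::async(std::launch::async, [&] {
    std::lock_guard<std::timed_mutex> grab(x.server_grab);
    grab_taken.set_value();
    lock_taken_f.wait();  // make sure the other side already holds its lock
    bool stalled = !x.xglobal_lock.try_lock_for(timeout);
    if (!stalled) x.xglobal_lock.unlock();
    main_done.set_value();
    capture_done_f.wait();  // keep the grab until the other side finished waiting
    return stalled;
  });

  // "Capture thread": holds the global lock, then waits on the grabbed server.
  std::future<bool> capture_thread = std::async(std::launch::async, [&] {
    std::lock_guard<std::timed_mutex> lock(x.xglobal_lock);
    lock_taken.set_value();
    grab_taken_f.wait();  // make sure the grab is already in place
    bool stalled = !x.server_grab.try_lock_for(timeout);
    if (!stalled) x.server_grab.unlock();
    capture_done.set_value();
    main_done_f.wait();  // keep the lock until the other side finished waiting
    return stalled;
  });

  bool main_stalled = main_thread.get();
  bool capture_stalled = capture_thread.get();
  return main_stalled && capture_stalled;
}
```

Calling `BothThreadsStall(std::chrono::milliseconds(50))` makes each side give up after 50 ms. In the real hang there is no timeout, so neither thread can make progress, and because the server grab also blocks every other X client, the whole desktop appears frozen.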
Comment 11•6 years ago
|
||
The best fix would be to avoid using gdk_display_get_window_at_pointer() in nsWindow.cpp.
Depends on: 510411
Updated•6 years ago
|
Summary: Intermittent Linux xserver crash with webrtc screen capture hangs user's desktop → Intermittent Linux xserver hang with webrtc screen capture hangs user's desktop
Comment hidden (mozreview-request) |
Assignee | ||
Updated•6 years ago
|
Attachment #8985877 -
Flags: review?(dminor)
Comment 13•6 years ago
|
||
mozreview-review |
Comment on attachment 8985877 [details] Bug 1456101 - ensure X11 DesktopCapture module is created on main thread
https://reviewboard.mozilla.org/r/251384/#review257926
LGTM. With the sync dispatch this seems safe.
Attachment #8985877 -
Flags: review?(dminor) → review+
Assignee | ||
Comment 14•6 years ago
|
||
While fixing bug 510411 would probably fix most occurrences of this race, the only way to ensure that a race doesn't occur is to dispatch this to the main thread.
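
As an illustration of the idea (not the landed patch, and not the Gecko dispatch API), the sketch below shows a synchronous dispatch in standard C++: a worker thread hands a creation task to the thread that owns the event loop and blocks until it has run there. `MainThreadQueue`, `DispatchSync`, and `CapturerCreatedOnLoopThread` are hypothetical names introduced for this example, and returning a thread id stands in for creating the capture module.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for an event loop owned by the main thread.
class MainThreadQueue {
 public:
  // Queues `task` for the loop thread and blocks the caller until it has run.
  template <typename F>
  auto DispatchSync(F task) -> decltype(task()) {
    using Result = decltype(task());
    auto wrapped =
        std::make_shared<std::packaged_task<Result()>>(std::move(task));
    std::future<Result> result = wrapped->get_future();
    {
      std::lock_guard<std::mutex> guard(mutex_);
      tasks_.push([wrapped] { (*wrapped)(); });
    }
    cv_.notify_one();
    return result.get();
  }

  // Asks the loop to exit once the queue is empty.
  void Stop() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      stopped_ = true;
    }
    cv_.notify_one();
  }

  // Runs queued tasks on the calling thread until Stop() is called.
  void RunUntilStopped() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stopped and nothing left to run
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool stopped_ = false;
};

// A worker thread requests "creation"; the work itself runs on the loop thread.
bool CapturerCreatedOnLoopThread() {
  MainThreadQueue queue;
  std::thread::id created_on;
  std::thread capture_thread([&] {
    created_on = queue.DispatchSync([] { return std::this_thread::get_id(); });
    queue.Stop();
  });
  queue.RunUntilStopped();  // the calling thread plays the main thread
  capture_thread.join();
  return created_on == std::this_thread::get_id();
}
```

The point of the design is ordering: if the X connection is only ever opened on the thread that also takes the server grab, the two steps run one after the other and cannot interleave. A synchronous dispatch like this is only safe when the main thread never blocks waiting on the dispatching thread, which is presumably what the review comment about the sync dispatch was checking.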
Assignee | ||
Comment 15•6 years ago
|
||
This is fairly low risk and limited to Linux where the bug takes down the user's entire desktop. That said, we are in the last half of the soft code freeze. Liz, do you think this is appropriate to land?
Flags: needinfo?(lhenry)
Comment 16•6 years ago
|
||
That seems reasonable, please do land it and the fix should end up in the 62.0b2 build by the end of the week.
Flags: needinfo?(lhenry)
Comment 17•6 years ago
|
||
Pushed by na-g@nostrum.com: https://hg.mozilla.org/integration/autoland/rev/fb01c7ab313c ensure X11 DesktopCapture module is created on main thread r=dminor
Comment 18•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fb01c7ab313c
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Updated•6 years ago
|
Assignee | ||
Comment 20•6 years ago
|
||
Comment on attachment 8985877 [details] Bug 1456101 - ensure X11 DesktopCapture module is created on main thread
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: this has a major impact on Google Hangouts and other sites which use screen sharing
User impact if declined: sites that use gUM screen sharing may cause the user's entire desktop to freeze
Fix Landed on Version: 62
Risk to taking this patch (and alternatives if risky): low; furthermore, even if this introduces a crash, that would be much preferred to the current behavior of taking down the user's desktop session
String or UUID changes made by this patch: none
Flags: needinfo?(na-g)
Attachment #8985877 -
Flags: approval-mozilla-esr60?
Updated•6 years ago
|
Flags: qe-verify+
Comment 21•6 years ago
|
||
Comment on attachment 8985877 [details] Bug 1456101 - ensure X11 DesktopCapture module is created on main thread
Fixes screen freezes for users using screen sharing. Approved for ESR 60.2.
Attachment #8985877 -
Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Comment 22•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr60/rev/2bbea07a6d67
Comment 23•6 years ago
|
||
I reproduced this bug on an affected build, 62.0b1 (buildID=20180619022742). I verified the fix on 62.0 (buildID=20180827144429) and 60.2.0esr (buildID=20180828172101), using the STR from comment 0. This was tested on Ubuntu 16.04 x64.
Status: RESOLVED → VERIFIED
Flags: qe-verify+