Closed Bug 1590984 Opened 5 years ago Closed 5 years ago

Crash in [@ raise | abort | __libc_message | __fortify_fail_abort | __fortify_fail | __chk_fail | __fdelt_chk | webrtc::videocapturemodule::DeviceInfoLinux::EventCheck]

Categories

(Core :: WebRTC: Audio/Video, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox-esr68 --- fixed
firefox70 --- wontfix
firefox71 --- wontfix
firefox72 --- fixed

People

(Reporter: gsvelto, Assigned: gsvelto)

References

()

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-abe63a7e-7862-4e4d-b2f7-313140191023.

Top 10 frames of crashing thread:

0 libc-2.27.so raise /build/glibc-OTsEL5/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51
1 libc-2.27.so abort /build/glibc-OTsEL5/glibc-2.27/stdlib/abort.c:79
2 libc-2.27.so __libc_message /build/glibc-OTsEL5/glibc-2.27/libio/../sysdeps/posix/libc_fatal.c:181
3 libc-2.27.so __fortify_fail_abort /build/glibc-OTsEL5/glibc-2.27/debug/fortify_fail.c:33
4 libc-2.27.so __fortify_fail /build/glibc-OTsEL5/glibc-2.27/debug/fortify_fail.c:44
5 libc-2.27.so __chk_fail /build/glibc-OTsEL5/glibc-2.27/debug/chk_fail.c:28
6 libc-2.27.so __fdelt_chk /build/glibc-OTsEL5/glibc-2.27/debug/fdelt_chk.c:25
7 libxul.so webrtc::videocapturemodule::DeviceInfoLinux::EventCheck /build/firefox-WlAST4/firefox-69.0.2+build1/media/webrtc/trunk/webrtc/modules/video_capture/linux/device_info_linux.cc:92
8 libxul.so webrtc::videocapturemodule::DeviceInfoLinux::ProcessInotifyEvents /build/firefox-WlAST4/firefox-69.0.2+build1/media/webrtc/trunk/webrtc/modules/video_capture/linux/device_info_linux.cc:133
9 libxul.so webrtc::videocapturemodule::DeviceInfoLinux::InotifyProcess /build/firefox-WlAST4/firefox-69.0.2+build1/media/webrtc/trunk/webrtc/modules/video_capture/linux/device_info_linux.cc:166

This crash is happening in the Ubuntu-packaged version of Firefox. It seems to affect 64-bit users only. I'll file a bug on Ubuntu's tracker and link it here.

I filed a bug on Ubuntu's tracker. I've also dug a little bit further in the crashes and this is not related to WebRTC per-se. The most visible signatures are in WebRTC code but they're not the only ones. This is caused by checks in libc that check if the file descriptor values passed to the FD_SET, FD_CLR and FD_ISSET macros are less than the FD_SETSIZE value. All code calling select() both directly and indirectly seems to be affected. In WebRTC we're receiving file descriptors from inotify_init(), is it possible that this will return values larger than FD_SETSIZE?

Switching to poll() would probably fix the issue here but there's a few things that are odd about this bug: for starters, why is it only happening on Ubuntu/Debian and not on our builds? Also, this is happening in code that calls select() outside of Firefox and those crashes are also happening only on Ubuntu/Debian builds, not ours.

I did some further digging and this is happening all over the place but mostly in code we don't control. It's always select()'s fault though. Daniel, this code is different from upstream WebRTC (which is also using select() but in other places), is there a reason for this? I can cook up a patch to switch this code to poll() instead but I wonder if it's the right thing to do or if this should be fixed upstream first.

Flags: needinfo?(dminor)

(In reply to Gabriele Svelto [:gsvelto] from comment #2)

I did some further digging and this is happening all over the place but mostly in code we don't control. It's always select()'s fault though. Daniel, this code is different from upstream WebRTC (which is also using select() but in other places), is there a reason for this? I can cook up a patch to switch this code to poll() instead but I wonder if it's the right thing to do or if this should be fixed upstream first.

Hi Gabriele, this code is our local modification to webrtc.org that we have never merged upstream. As such, feel free to rewrite it to use poll() if that will fix things. I'm not the original author, I just happened to do the last upstream merge, so I can't comment on why select was chosen rather than poll. :jesup might know if you want to follow up with him.

One thing to be careful of is that this code is handling cameras being plugged and unplugged which is not something we have coverage for in automation, so you'll have to do some manual testing of your changes. The STR from https://bugzilla.mozilla.org/show_bug.cgi?id=1581193#c0 might be helpful here.

Flags: needinfo?(dminor)
Component: WebRTC → WebRTC: Audio/Video
Priority: -- → P3

(In reply to Dan Minor [:dminor] from comment #3)

Hi Gabriele, this code is our local modification to webrtc.org that we have never merged upstream. As such, feel free to rewrite it to use poll() if that will fix things. I'm not the original author, I just happened to do the last upstream merge, so I can't comment on why select was chosen rather than poll. :jesup might know if you want to follow up with him.

One thing to be careful of is that this code is handling cameras being plugged and unplugged which is not something we have coverage for in automation, so you'll have to do some manual testing of your changes. The STR from https://bugzilla.mozilla.org/show_bug.cgi?id=1581193#c0 might be helpful here.

Thanks, I'll try to cook up a patch tomorrow. I must have a USB camera sitting around somewhere so testing shouldn't be a problem.

Taking this as I've got a WIP patch. It just needs some good testing.

Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
Crash Signature: [@ raise | abort | __libc_message | __fortify_fail_abort | __fortify_fail | __chk_fail | __fdelt_chk | webrtc::videocapturemodule::DeviceInfoLinux::EventCheck] [@ raise | abort | __libc_message | __fortify_fail_abort | __fortify_fail | __chk_fail | __fde… → [@ __fortify_fail | __chk_fail | __fdelt_chk | ProcessInotifyEvents] [@ raise | abort | __libc_message | __fortify_fail | __chk_fail | __fdelt_chk | webrtc::videocapturemodule::DeviceInfoLinux::ProcessInotifyEvents] [@ raise | abort | __libc_message | _…
Blocks: 1591869

The use of select() was leading to crashes when the file descriptor value was
larger than FD_SETSIZE. Recent versions of glibc have checks in the FD_CLR(),
FD_SET() and FD_ISSET() macros that will abort() the program instead of doing
an out-of-bounds access. poll() doesn't have limitations on the file
descriptor values and provides behavior that is otherwise identical to
select() thus solving the problem.

Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/59fb6760bb67 Use poll() instead of select() in WebRTC code r=drno
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Hi Gabriele, is qa needed here? And if so, could you please provide us some steps? Thanks!

Flags: needinfo?(gsvelto)

No testing is needed, thanks Catalin.

Flags: needinfo?(gsvelto)

There's quite a few crashes coming from ESR68 on Debian. The patch is not large and low-risk, maybe we could uplift it. I'll see if it applies cleanly there.

Looks like this grafts cleanly to ESR68 - did you want to nominate it for uplift?

Flags: needinfo?(gsvelto)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)

Looks like this grafts cleanly to ESR68 - did you want to nominate it for uplift?

Yes, let's do it.

Flags: needinfo?(gsvelto)

Comment on attachment 9104623 [details]
Bug 1590984 - Use poll() instead of select() in WebRTC code r=drno

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: This fixes a crash that can easily occur when using WebRTC.
  • User impact if declined: Firefox may crash when using WebRTC; this will happen independently of the user since it's triggered by the file descriptor numbers used by the WebRTC devices (camera, microphone, etc...).
  • Fix Landed on Version: 72
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The fix is small and almost mechanical and the patch has been in nightly and then beta for two months now without causing regressions.
  • String or UUID changes made by this patch: None
Attachment #9104623 - Flags: approval-mozilla-esr68?

Comment on attachment 9104623 [details]
Bug 1590984 - Use poll() instead of select() in WebRTC code r=drno

Fixes a Linux-only WebRTC crash. Approved for 68.4esr.

Attachment #9104623 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+
No longer depends on: 1646904
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: