Open
Bug 1053757
Opened 11 years ago
Updated 3 years ago
[Linux] filehandles to resources (including /dev/video) not closed when restarting from crashreport
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
NEW
People
(Reporter: bmaris, Unassigned)
Details
Attachments
(1 file: 2.75 KB, text/plain)
Reproduced on Ubuntu 14.04 32bit using latest Nightly 34.0a1 (buildID: 20140813030201).
STR:
1. Start Firefox
2. Install crashme addon: http://ted.mielczarek.org/mozilla/crashme.html
3. Wait for OpenH264 addon to install. (see about:addons)
4. Visit http://mozilla.github.io/webrtc-landing/pc_test.html and start a call.
5. Crash Firefox using one of the options from crashme addon.
6. After submitting the crash, open http://mozilla.github.io/webrtc-landing/pc_test.html again and try to make a call.
Expected result: After Firefox crashes the call is interrupted.
Actual result: After Firefox crashes the call is still on (light from camera is on). If I try another call I get the message 'Failure callback: "Starting video failed"'.
Notes:
1. Possible regression, will investigate further.
2. It only reproduces on Linux (I used Ubuntu 14.04 32bit)
Comment 1•11 years ago
After using crashme, please do a "ps wax | grep firefox" to verify that firefox isn't still running (in debug linux builds, it will wait 5 minutes for a debugger to attach before exiting).
Comment 2•11 years ago
All the camera access happens from the main process, not the GMP process, right?
Comment 3•11 years ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)
> All the camera access happens from the main process, not the GMP process,
> right?
Right. GMP plugin-container has no access.
Note that crashing the plugin doesn't stop that tab from capturing the camera; the STR aren't clear as to what "open pc_test.html again and try to make a call" means (same tab? different tab?). A first test would be to crash the plugin, submit, then hit reload for the tab, and see if the active capture indicator goes away.
Flags: needinfo?(bogdan.maris)
Reporter | ||
Comment 4•11 years ago
(In reply to Randell Jesup [:jesup] from comment #1)
> After using crashme, please do a "ps wax | grep firefox" to verify that
> firefox isn't still running (in debug linux builds, it will wait 5 minutes
> for a debugger to attach before exiting).
If I run 'ps wax | grep firefox' I get this:
11477 pts/0 Sl 0:00 /home/bogdanmaris/Documents/Latest Nightly/crashreporter /home/bogdanmaris/.mozilla/firefox/zqlqevcq.Test WEBRTC/minidumps/0df78b79-29ea-a687-119cefaa-19290738.dmp
11487 pts/1 S+ 0:00 grep --color=auto firefox
I ran the command while the crash reporter was up. If I hit Restart, Firefox opens pc_test.html in the same tab. If I open a new tab, load pc_test.html and start a call, I get 'Failure callback: "Starting video failed"' (https://db.tt/kF5DwqZA). If I go to the first tab, where pc_test.html is already loaded, and start a call there as well, the global indicator appears but the camera shows nothing (https://db.tt/lHKXp1gK).
(In reply to Randell Jesup [:jesup] from comment #3)
> First test would be to
> crash the plugin, submit, then hit reload for the tab, and see if the active
> capture indicator goes away.
I used media.gmp.plugin.crash to crash the plugin; after submitting the crash, the indicator goes away, and if I then reload the tab I can make another call.
Note that I did not even start an h264 video; starting an h264 video has the same result.
Flags: needinfo?(bogdan.maris) → needinfo?(rjesup)
Comment 5•11 years ago
This appears to be an issue with the CrashReporter's Restart feature; it's not closing all the resources used by the original process (on Linux).
Reproduces on Fedora 19 with a crashreporter-enabled m-c opt build.
Component: WebRTC → Breakpad Integration
Flags: needinfo?(rjesup) → needinfo?(ted)
Product: Core → Toolkit
Comment 6•11 years ago
The firefox process does a fork and exec to run the crash reporter:
http://hg.mozilla.org/mozilla-central/annotate/0753f7b93ab7/toolkit/crashreporter/nsExceptionHandler.cpp#l825
and then calls _exit(1). The "restart" button in the crash reporter just re-launches the Firefox binary.
I don't see how anything the crashreporter binary could do would be a problem here. What resources are we holding on to that aren't being released on _exit?
Flags: needinfo?(ted)
Comment 7•11 years ago
Anthony, is this something you might want to track for Loop?
Flags: needinfo?(anthony.s.hughes)
Comment 8•11 years ago
CCing Maire so she is aware of this bug for Loop. I'm going to suggest we block MVP on this.
Comment 9•11 years ago
Did some more testing:
1) It's not an OpenH264 issue. It happens in normal VP8 calls
2) It's not a Loop issue. It happens with plain getUserMedia pages, no peerconnections in sight
3) I'm 99% certain this isn't new (unless crashreporter changed a lot recently), and may go back as far as 22.
4) It strongly appears the issue is a failure to release /dev/video when you hit "restart"
5) I believe this may be causing other files to be left open
Before restart:
ls -l /proc/NNNNN/fd | grep video ->
lrwx------ 1 jesup jesup 64 Sep 12 17:30 72 -> /dev/video0
After restart, without re-opening anything (FF home page shown):
ls -l /proc/XXXXX/fd | grep video ->
lrwx------ 1 jesup jesup 64 Sep 12 17:31 72 -> /dev/video0
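The same check can be scripted rather than done by hand with ls. What follows is a minimal sketch, assuming a Linux /proc filesystem; the helper name count_fds_with_prefix is hypothetical and is not part of the crash reporter or of Firefox. It walks /proc/<pid>/fd and reports every descriptor whose link target starts with a given prefix.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: count the descriptors of process `pid` ("self" or a
 * numeric pid string) whose link target starts with `prefix`.
 * Returns -1 if /proc/<pid>/fd cannot be read. */
int count_fds_with_prefix(const char *pid, const char *prefix) {
    char dir_path[64];
    snprintf(dir_path, sizeof dir_path, "/proc/%s/fd", pid);

    DIR *dir = opendir(dir_path);
    if (dir == NULL) {
        return -1;
    }

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.') {
            continue;  /* skip "." and ".." */
        }
        char link_path[512];
        char target[4096];
        snprintf(link_path, sizeof link_path, "%s/%s", dir_path, entry->d_name);
        ssize_t len = readlink(link_path, target, sizeof target - 1);
        if (len < 0) {
            continue;  /* descriptor went away while we were scanning */
        }
        target[len] = '\0';  /* readlink does not NUL-terminate */
        if (strncmp(target, prefix, strlen(prefix)) == 0) {
            printf("fd %s -> %s\n", entry->d_name, target);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

Calling count_fds_with_prefix with the pid of the crashreporter process and the prefix "/dev/video" would be the scripted equivalent of the listing above; reading another user's /proc/<pid>/fd needs the same permissions that ls does.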
This should not block anything. This should get looked at from the crashreporter side; this may be causing other problems for people who only restart the browser when it crashes.
No longer blocks: loop_mvp
Flags: needinfo?(bogdan.maris) → needinfo?(ted)
Summary: [Linux] webRTC call still on after Firefox crash → [Linux] filehandles to resources (including /dev/video) not closed when restarting from crashreport
Comment 10•11 years ago
As I stated in comment 6: by the time you click "Restart" the Firefox process should be *dead*, we call _exit(1). The crash reporter client re-launches a new instance of Firefox, but it should have no impact on the old one.
What is holding on to /dev/video? Does it somehow persist past fork/exec?
Flags: needinfo?(ted)
Comment 11•11 years ago
CrashReporter is inheriting at least some open files from Firefox: not only /dev/video0 but also, it appears, jprof-log (I use --enable-jprof), though there are two fds to it.
The default is for fds to be inherited across fork/exec/execve.
I'll note that none of these are opened with O_CLOEXEC/F_DUPFD_CLOEXEC/FD_CLOEXEC.
Note also that Linux mqs are inherited.
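Whether a given descriptor will leak into an exec'd child can be checked by reading its descriptor flags. This is only a sketch using standard POSIX fcntl; the function name survives_exec is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: report whether `fd` would stay open across exec.
 * Returns 1 if FD_CLOEXEC is clear (the fd is inherited), 0 if FD_CLOEXEC is
 * set (the kernel closes it on exec), and -1 if `fd` is not an open
 * descriptor. */
int survives_exec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) {
        return -1;
    }
    return (flags & FD_CLOEXEC) ? 0 : 1;
}
```

A descriptor from a plain open() reports 1 here, which matches the inherited /dev/video0 seen in the crash reporter; one opened with O_CLOEXEC reports 0.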
Comment 12•11 years ago
Comment 13•11 years ago
I guess I never realized that, but that's terrible. The supported way seems to be "use O_CLOEXEC everywhere". We could fix this particular issue by doing that when opening video devices.
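For reference, a rough sketch of what the per-descriptor fix looks like with plain POSIX calls. The helper names open_cloexec and set_cloexec are hypothetical, and where the video device is actually opened in the tree is not shown in this bug, so this only illustrates the flag handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>

/* Hypothetical helper: open `path` so the descriptor is closed automatically
 * on exec. Setting the flag at open() time leaves no window in which another
 * thread could fork+exec and leak the fd. */
int open_cloexec(const char *path, int flags) {
    return open(path, flags | O_CLOEXEC);
}

/* Hypothetical helper: retrofit close-on-exec onto a descriptor that was
 * opened by code we cannot change. Returns 0 on success, -1 on error. */
int set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```

The fcntl variant is racy in a multithreaded process, since a fork in another thread can happen between open and F_SETFD; passing O_CLOEXEC to open is preferable where the open call can be changed.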
To fix the general case I guess we'd have to add some code to iterate and close all open fds before we exec, which doesn't seem to be easy.
Comment 14•11 years ago
This will probably work (but I don't know if it's always safe?):
int m = getdtablesize();
for (int i = 3; i < m; i++) {
  close(i);
}
This would need to use the sys_ wrappers from linux_syscall_support.h, so it'd have to use sys_getrlimit instead of getdtablesize.
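A sketch of the getrlimit-based variant follows, written against plain libc for readability; in the exception handler it would go through the sys_ wrappers as noted above. The function name close_fds_from and the fallback limit of 1024 are assumptions for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: close every descriptor numbered `lowest_fd` or higher.
 * Meant to run in the child between fork() and exec(), so the parent's
 * descriptors are untouched. */
void close_fds_from(int lowest_fd) {
    struct rlimit rl;
    long max_fd = 1024;  /* assumed fallback if the limit is unknown or unlimited */

    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY) {
        max_fd = (long)rl.rlim_cur;
    }
    for (long fd = lowest_fd; fd < max_fd; fd++) {
        close((int)fd);  /* EBADF for unused slots is expected and ignored */
    }
}
```

Two caveats: the soft limit can be very large on some systems, which makes the loop slow, and a descriptor opened before the limit was lowered may sit above rlim_cur and be missed.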
Updated•3 years ago
Severity: normal → S3