Closed
Bug 904784
Opened 11 years ago
Closed 11 years ago
Local shared active media does not stop recording on tab close
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(firefox23 disabled, firefox24+ fixed, firefox25+ verified, firefox26 verified, firefox-esr24 wontfix, fennec24+)
People
(Reporter: aaronmt, Assigned: jesup)
References
Details
(Keywords: reproducible, Whiteboard: [getUserMedia][android-gum?])
Attachments
(9 files, 1 obsolete file)
4.22 KB,
text/plain
|
Details | |
13.21 KB,
text/plain
|
Details | |
9.30 KB,
text/plain
|
Details | |
7.91 KB,
text/plain
|
Details | |
11.67 KB,
patch
|
Details | Diff | Splinter Review | |
6.07 KB,
patch
|
benjamin
:
review-
|
Details | Diff | Splinter Review |
3.32 KB,
patch
|
mwu
:
review+
|
Details | Diff | Splinter Review |
7.09 KB,
patch
|
mwu
:
review+
bajaj
:
approval-mozilla-aurora+
bajaj
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
3.36 KB,
patch
|
jesup
:
review+
|
Details | Diff | Splinter Review |
Akin to bug 885783 which was a followup from bug 88136, I can reproduce a situation where our local shared active media notification is stuck and will not dismiss on tab close. Unlike last time, the media lock is not held. Jason was also able to reproduce this.
This is likely a race condition, close the tab quickly after sharing.
Steps:
i) http://mozilla.github.io/webrtc-landing/gum_test.html
ii) Share 'audio and video', close tab (quickly)
--
Nightly (08/13)
Samsung Galaxy SIV (Android 4.3)
Updated•11 years ago
|
Blocks: android-webrtc
Reporter | ||
Comment 1•11 years ago
|
||
Comment 2•11 years ago
|
||
Not blocking only because this requires timing to reproduce this. A user can definitely hit this, but it's not a common use case.
Whiteboard: [getUserMedia][android-gum-]
Updated•11 years ago
|
tracking-fennec: ? → +
Comment 3•11 years ago
|
||
Please renom if the product team feels this is a webRTC ship blocker.
tracking-firefox24:
? → ---
Reporter | ||
Comment 4•11 years ago
|
||
New video URL; thus far I've hit this only via a race. I will renom if I can this any other method.
http://www.youtube.com/watch?v=8_VVeT15lhY
Reporter | ||
Comment 5•11 years ago
|
||
Able to reproduce this in normal conditions
Requires 2 phones:
i) Have both phones join a room at http://talky.io/<room>
ii) Accept and share a camera and microphone
iii) After a short while (~20 seconds), close the active tab on one device
See the notification on the device with the device closed still remaining.
tracking-fennec: + → ?
Summary: Local shared active media notification does not dismiss on tab close (race) → Local shared active media notification does not dismiss on tab close
Reporter | ||
Updated•11 years ago
|
Whiteboard: [getUserMedia][android-gum-] → [getUserMedia][android-gum?]
Reporter | ||
Comment 6•11 years ago
|
||
It looks like the larger problem here is that the media is still active after tab close.
I captured output into a log with appropriate logging and on tab close, after my logging of audio shutdown in this case I can continue to see preview frame length logging while idle with no tabs open.
Comment 7•11 years ago
|
||
(In reply to Aaron Train [:aaronmt] from comment #6)
> Created attachment 792256 [details]
> Raw log (Nightly 08/19)
>
> It looks like the larger problem here is that the media is still active
> after tab close.
>
> I captured output into a log with appropriate logging and on tab close,
> after my logging of audio shutdown in this case I can continue to see
> preview frame length logging while idle with no tabs open.
Now that sounds actually quite worse - we're persisting a getUserMedia session after shutdown. Could there be a leak here potentially?
Reporter | ||
Comment 8•11 years ago
|
||
Another log, actions shown
Visited my home-screen
> I/ActivityManager( 517): START u0 {act=android.intent.action.MAIN cat=[android.intent.category.HOME] flg=0x10200000 cmp=com.android.launcher/com.android.launcher2.Launcher} from pid 517
Launched my camera successfully
> I/ActivityManager( 517): START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=com.google.android.gallery3d/com.android.camera.CameraLauncher bnds=[40,785][200,985]} from pid 874
Re-launched Fennec, idle on about:home, no tabs open
> I/ActivityManager( 517): START u0 {act=android.intent.action.MAIN cat=[android.intent.category.HOME] flg=0x10200000 cmp=com.android.launcher/com.android.launcher2.Launcher} from pid 517
Immediately my console continues to flood with WebRTC preview frame length messages (again, zero tabs open; idle on about:home)
> V/WEBRTC-JC(19252): preview frame length 497664 context-2126770176
Reporter | ||
Comment 9•11 years ago
|
||
Tapping the notification, does not dismiss the notification (no tabs open, sitting on about:home)
Reporter | ||
Comment 10•11 years ago
|
||
On further attempt to initiate a new gUM session in a brand new tab with this corrupt situation, no hardware drop-down is displayed on attempt to share, nor is any preview frame is displayed
Demo page used to resolve the situation: http://mozilla.github.io/webrtc-landing/gum_test.html
This is in-line with what this users also reported: https://groups.google.com/forum/#!topic/mozilla.dev.media/SxDpgWm8zao
Reporter | ||
Comment 11•11 years ago
|
||
The current work-around is to force-stop (in Android Application Settings) to wipe the lock the browser has on local media.
Summary: Local shared active media notification does not dismiss on tab close → Local shared active media notification does not dismiss on tab close; possible gUM leak
Reporter | ||
Comment 12•11 years ago
|
||
Alternative STR that works for me on my Nexus 4 (Android 4.3)
i) http://mozilla.github.io/webrtc-landing/gum_test.html
ii) Share Audio & Video
iii) Hit the device home-button
iv) Restore the browser, close the active tab
Comment 13•11 years ago
|
||
(In reply to Aaron Train [:aaronmt] from comment #12)
> Alternative STR that works for me on my Nexus 4 (Android 4.3)
>
> i) http://mozilla.github.io/webrtc-landing/gum_test.html
> ii) Share Audio & Video
> iii) Hit the device home-button
> iv) Restore the browser, close the active tab
That doesn't reproduce the bug for me on a Galaxy Nexus on Android 4.2.
Reporter | ||
Comment 14•11 years ago
|
||
Another log, right after closing the active tab and seeing the notification remain
Updated•11 years ago
|
Assignee: nobody → bnicholson
tracking-fennec: ? → 24+
Updated•11 years ago
|
tracking-firefox24:
--- → ?
Updated•11 years ago
|
tracking-firefox24:
? → ---
Comment 15•11 years ago
|
||
Tracking since this came up in the product meeting as a possible 24 fix, not sure if it's worth taking at this point (edge casey).
tracking-firefox24:
--- → ?
Comment 16•11 years ago
|
||
Using the STR from comment 12, I traced the remove messages, and I found that "Notification:Hide" is never sent from WebrtcUI.js (notify() itself isn't called when this bug happens). I'm not too familiar with this code; what should trigger these notify() events?
Flags: needinfo?(wjohnston)
Comment 17•11 years ago
|
||
Should fire whenever we get a recording-device-events message with the data set to "shutdown". i.e. here's one: http://mxr.mozilla.org/mozilla-central/source/dom/camera/DOMCameraControl.cpp#322.
Or this call in MediaManager should also be firing:
http://mxr.mozilla.org/mozilla-central/source/dom/camera/DOMCameraControl.cpp#322
I'm not sure which path we should be on for closing a tab with an active session though.
From reading this, I think(?) there are two bugs here. Maybe we should spilt them up to avoid confusion.
1.) The race where you quickly close the tab before the notification is closed
2.) The bug in comments 5, 6, and 7 where the lock on media isn't being closed ever?
Flags: needinfo?(wjohnston)
Comment 18•11 years ago
|
||
As Aaron said in comment 6, I'm seeing these capture events in logcat even after the tab is closed. In my situation, it seems that these events are flooding Gecko and heavily delaying the callbacks from executing. During this time, Gecko isn't responsive at all; opening a new tab does nothing, the about:home favicon doesn't load, etc. After following the STR from comment 12, if I wait a long time (around 5-10 minutes), I do eventually see the media events stop. At that point, the notification disappears and Gecko becomes responsive again.
It looks like this is the path taken for a tab close:
OnNavigation calls Invalidate: http://mxr.mozilla.org/mozilla-central/source/dom/media/MediaManager.cpp#1255
Invalidate dispatches a MEDIA_STOP Runnable: http://mxr.mozilla.org/mozilla-central/source/dom/media/MediaManager.cpp#1574
This dispatches a STOPPING media event: http://mxr.mozilla.org/mozilla-central/source/dom/media/MediaManager.h#320
Which finally triggers the "recording-device-events" notification: http://mxr.mozilla.org/mozilla-central/source/dom/media/MediaManager.cpp#1634
After the STR from comment 12, I found that even MediaManager#OnNavigation isn't called while Gecko is hosed.
Comment 19•11 years ago
|
||
Tracking for Fx24 and would take a low risk patch if available for Fx24 but definitely not a show stopper for WebRTC ship in Fx24.
tracking-firefox25:
--- → ?
Assignee | ||
Comment 20•11 years ago
|
||
Brian's path for shutdown in MediaManager is correct: if we're not getting OnNavigation, we won't ever start the process of killing the capture.
Note that STOPPING is only sent once we've actually stopped the media (and this is done on the MediaThread).
Normally, OnNavigation is called from Navigator::OnNavigation; it's inside #ifdef MOZ_MEDIA_NAVIGATOR
Comment 21•11 years ago
|
||
Correction: I'm actually using Aaron's steps from comment 5, not comment 12. The STR from comment 12 haven't worked for me using my Droid RAZR.
Upon closing the tab for talky.io, I'm seeing three different possible outcomes:
1) The tab closes and the notification disappears, as expected.
2) The tab closes, the notification remains, and Gecko is frozen for several minutes as described in comment 18. I haven't actually hit this since yesterday.
3) The tab closes, the notification remains, and Gecko is still responsive.
I've been running into #3 a lot today, and what I've found is that OnNavigation is called, we get to Invalidate, and a MEDIA_STOP runnable is dispatched here: http://mxr.mozilla.org/mozilla-central/source/dom/media/MediaManager.cpp#1578. But the runnable's Run() callback (http://mxr.mozilla.org/mozilla-central/source/dom/media/MediaManager.h#257) is never executed.
I'm not really sure how to proceed from here -- I don't think I'm familiar enough with the code (or GDB) to debug why the queue is blocked.
Updated•11 years ago
|
Comment 22•11 years ago
|
||
After reproducing this somewhat reliably with minimal steps (just share audio+video and wait), I think this bug is a major problem. As has already been pointed out, this isn't a notification issue; the media itself never stops streaming after the tab is closed.
That means:
1) The notification hangs around after the tab is closed (this is actually the smallest issue IMO)
2) The media continues to stream, draining the user's battery and raising the privacy concerns of recording when it's not expected or wanted
3) To top it all off, Firefox can't even be closed using the normal method (swiping from the recent apps list) since we're using a foreground notification, keeping Firefox alive forever. The only way out is to force close Firefox as Aaron mentioned in comment 11, which many users won't know how to do.
Summary: Local shared active media notification does not dismiss on tab close; possible gUM leak → Local shared active media does not stop recording on tab close
Comment 23•11 years ago
|
||
Yikes. That's a whole lot worse than we initially expected. I'm going to followup in regards to this bug and webrtc crasher over email to discuss if we should still ship on 24 with these issues present.
Reporter | ||
Comment 24•11 years ago
|
||
This is also trivial to reproduce with Jason's script @ https://bugzilla.mozilla.org/attachment.cgi?id=799865
Severity: normal → major
Comment 25•11 years ago
|
||
Monica - Can you comment here to indicate if from a privacy perspective that you are okay or not okay with shipping with this webrtc bug? See comment 22 for context on the privacy concern.
Flags: needinfo?(mmc)
Comment 26•11 years ago
|
||
I agree with Brian. That the media continues to stream after the tab is closed violates the privacy principles of no surprises and user control (http://www.mozilla.org/en-US/privacy/).
Flags: needinfo?(mmc)
Comment 27•11 years ago
|
||
(In reply to Brian Nicholson (:bnicholson) from comment #22)
> After reproducing this somewhat reliably with minimal steps (just share
> audio+video and wait), I think this bug is a major problem. As has already
> been pointed out, this isn't a notification issue; the media itself never
> stops streaming after the tab is closed.
>
> That means:
> 1) The notification hangs around after the tab is closed (this is actually
> the smallest issue IMO)
> 2) The media continues to stream, draining the user's battery and raising
> the privacy concerns of recording when it's not expected or wanted
> 3) To top it all off, Firefox can't even be closed using the normal method
> (swiping from the recent apps list) since we're using a foreground
> notification, keeping Firefox alive forever. The only way out is to force
> close Firefox as Aaron mentioned in comment 11, which many users won't know
> how to do.
I've been able to reproduce this as well on a Galaxy S4 - please call/txt (phonebook) if you need my help.
Assignee | ||
Comment 28•11 years ago
|
||
(In reply to Monica Chew [:mmc] (please use needinfo) from comment #26)
> I agree with Brian. That the media continues to stream after the tab is
> closed violates the privacy principles of no surprises and user control
The camera remains on, however, the application that was getting the data is gone, and origin restrictions mean no one else should be able to get at it (as normal). The only issue for the user will be that the camera is indicated as on (and it may drain their battery) - neither of which are good, but there's no actual privacy leak so far as I know.
I'm actively trying to narrow down exactly what's happening. it sounds like something has hung the MediaThread so it never processes the STOP.
Assignee | ||
Comment 29•11 years ago
|
||
bnicholson has been able to repro it and get me logs from MediaThread with some debugs forced on. It seems highly timing dependent; he needs to clear all the app data/profile before running the STR to get it to repro on the N4. I'm still totally unable to repro it on a Galaxy S4 or a Nexus 10 (using either STR).
Audio is hanging in shutdown on his N4. It's hanging in mVoeBase->StopSend() somewhere (webrtc.org code). This hangs the thread and it never gets to shut down audio or video capture. (Also note per above that capture continues, but nothing will be consuming the audio/video streams.)
If I had to guess, it's a deadlock between _callbackCritSect (grabbed in Channel::StopSend() - which does very little under lock - and _shared->crit_sect() (grabbed in VoEBaseImpl::StopSend(), which calls Channel::StopSend()). but that's just a guess right now, and may well be wrong until I get to either debug it locally or get deeper into the callstack. I'll have bnicholson see if he can repro it on an N4 Debug build, which may let us verify if we're deadlocked and if so where.
One contemplated idea is to add a deadman switch to MediaThread, that calls MOZ_CRASH() if MediaThread takes too long to process a command (to stop). This would guarantee we don't end up hanging around no matter what causes the thread to hang (and it's a much safer/easier/guaranteed solution), and much easier to feel confident about.
Comment 30•11 years ago
|
||
A watchdog here seems like a useful debugging aid but I think we should actually
fix the underlying problem before ship.
Assignee | ||
Comment 31•11 years ago
|
||
Try: https://tbpl.mozilla.org/?tree=Try&rev=d6da23f71804 This is a deadman switch to avoid any potential issues if anything locks up starting or stopping media. Tested on Linux and Android (though I can't repro this bug, so if others who can would try this I'd appreciate it). I'll post an easier-to-read diff (I had to move a class, so the patch looks bigger than it is)
Assignee | ||
Comment 32•11 years ago
|
||
easier-to-read patch that assumes the class was already moved
Assignee | ||
Comment 33•11 years ago
|
||
FYI, bnicholson verified it deadmans correctly when this problem is hit on his N4. He's going to try to get me a post-failure thread dump so I can identify the likely deadlock.
Assignee | ||
Updated•11 years ago
|
Attachment #800796 -
Flags: review?(benjamin)
Assignee | ||
Comment 34•11 years ago
|
||
bnicholson has
a) gotten me stacks that show MediaManager (MediaThread) is in ~AudioRecord(), and it's waiting on the delivery thread to exit. That thread may be deadlocked somewhere in the webrtc.org code against the shutdown code on some critical section locks.
b) he verified it does happen in Beta, and that the task is hung in the same place (~AudioRecord()). This also means it can't be an regression from bug 884365.
We're trying to get better stacks to identify the mutexes involved
Assignee | ||
Comment 35•11 years ago
|
||
Slightly ugly root-cause fix attempt that adds a second lock for callback<->recordthread communication, so the callback never touches crit_sect_, and StopRecording() doesn't hold callback_crit_sect while destroying the openlsles objects. Working on a cleaner alternative.
Assignee | ||
Comment 36•11 years ago
|
||
alternate patch that drops the crit_sect lock while releasing the opensles objects
Updated•11 years ago
|
Assignee: bnicholson → rjesup
Comment 37•11 years ago
|
||
Comment on attachment 800968 [details] [diff] [review]
use a separate critical section for the recording callback
Review of attachment 800968 [details] [diff] [review]:
-----------------------------------------------------------------
Do we need to hold the new lock in StartRecording?
Makes sense overall, though messy due to adding a new lock.
Reporter | ||
Comment 38•11 years ago
|
||
Using this try-build [1], this works for me. Unable to reproduce with the second patch posted. Great stuff!
[1] https://tbpl.mozilla.org/?tree=Try&rev=d713d5895ebe
Assignee | ||
Comment 39•11 years ago
|
||
Add a few more pedantic callback_lock grabs to ensure is_recording_ is accessed under lock always
Assignee | ||
Updated•11 years ago
|
Attachment #800968 -
Attachment is obsolete: true
Assignee | ||
Updated•11 years ago
|
Attachment #801071 -
Attachment description: use a separate critical section for the recording callback → Version 1: use a separate critical section for the recording callback
Attachment #801071 -
Flags: review?(mwu)
Assignee | ||
Updated•11 years ago
|
Attachment #800980 -
Attachment description: Avoid audio shutdown deadlock in opensles by dropping lock while releasing opensles → Version 2: Avoid audio shutdown deadlock in opensles by dropping lock while releasing opensles
Attachment #800980 -
Flags: review?(mwu)
Updated•11 years ago
|
Attachment #800980 -
Flags: review?(mwu) → review+
Updated•11 years ago
|
Attachment #801071 -
Flags: review?(mwu) → review+
Comment 40•11 years ago
|
||
I prefer the 2nd approach of dropping the lock while releasing opensles, but the 1st approach is fine too. Pick whichever one seems less risky to you.
Assignee | ||
Comment 41•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/69e78915fc92
Landed this one because while both were r+'d, and mwu liked #2 and blassey liked #1 (though both were ok with both patches), I think #1 is less prone to errors or unexpected results (though the cases that would cause that are almost certainly not hittable in our codebase, like calling StopRecording() from a second thread while one thread was already in the middle (unlocked) part of StopRecording().
Target Milestone: --- → Firefox 26
Assignee | ||
Comment 42•11 years ago
|
||
Comment on attachment 801071 [details] [diff] [review]
Version 1: use a separate critical section for the recording callback
pre-emptive uplift request per release-coordination meeting - patch is on inbound
[Approval Request Comment] (for Aurora and Beta)
Bug caused by (feature/regressing bug #): N/A
User impact if declined: Certain devices or OS versions may experience random failures to shut down media capture, leading to the camera/mic being locked and burning battery.
Testing completed (on m-c, etc.): Tested by bnicholson on a local build. aaronmt tested the alternative patch with no problems - different patch, but shows that we correctly identified the cause of the lockups.
Risk to taking this patch (and alternatives if risky): Risk is low, and most likely failure mode would be a mutex deadlock (which is what we had before).
String or IDL/UUID changes made by this patch: none
Attachment #801071 -
Flags: approval-mozilla-beta?
Attachment #801071 -
Flags: approval-mozilla-aurora?
Comment 43•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•11 years ago
|
Status: RESOLVED → VERIFIED
Comment 44•11 years ago
|
||
Comment on attachment 801071 [details] [diff] [review]
Version 1: use a separate critical section for the recording callback
Approving the low-risk patch for uplift as this issue is a shipping blocker for webRTC on Android.
Please make sure to land asap to get this into tomorrow's beta.
Request to QA to verify this on a couple of devices post mozilla-beta landing.
Attachment #801071 -
Flags: approval-mozilla-beta?
Attachment #801071 -
Flags: approval-mozilla-beta+
Attachment #801071 -
Flags: approval-mozilla-aurora?
Attachment #801071 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 45•11 years ago
|
||
Reporter | ||
Comment 46•11 years ago
|
||
I am still *on occasion* hitting this bug albeit not as frequent as prior only on mozilla-beta (24.0b10) and mozilla-aurora (09/10) thus far. I am testing on mozilla-central (09/10) currently and forcefully trying to reproduce this but no luck yet.
I am using my LG Nexus 4 (Android 4.3) and visiting http://talky.io/123, after sharing my camera and microphone, I close the tab and I see the original issue as reported and as seen in my console output with the continuation of preview frame logging. I am hitting this not as frequent as before.
See video: http://www.youtube.com/watch?v=jurvzQTD9Pc
This leads me to believe that it might be possible to question wether there is there any possible code variance in the associated files between mozilla-central and mozilla-beta here that could cause contention or any potential mishap in the uplift?
Comment 47•11 years ago
|
||
Comment on attachment 800796 [details] [diff] [review]
Implement MediaThread start/stop capture deadman switch
I'm finding it very hard to review this because the class decl moved, and I'm not sure what is existing code and what's new. Is it possible to split it up so that moving the class decl is in one rubberstamp patch and the actual function change is separate?
Attachment #800796 -
Flags: review?(benjamin)
Assignee | ||
Comment 48•11 years ago
|
||
Comment on attachment 800797 [details] [diff] [review]
deadman.patch
this is the patch that assumes the class has been moved first
Attachment #800797 -
Flags: review?(benjamin)
Comment 49•11 years ago
|
||
Comment on attachment 800797 [details] [diff] [review]
deadman.patch
The major problem with this patch is the lack of guards for things like process sleep. It at least requires that we reset or cancel the timer if the computer goes to sleep.
I'm also a little concerned about what happens if the computer clock changes. I'm not sure what clock nsITimer uses to schedule its timeouts, but if it's a system clock, this will not survive clock resets and it needs to be double-checked against the results of mozilla::TimeStamp.
Attachment #800797 -
Flags: review?(benjamin) → review-
Comment 50•11 years ago
|
||
(In reply to Aaron Train [:aaronmt] from comment #46)
> I am still *on occasion* hitting this bug albeit not as frequent as prior
can you help confirm the frequency here ? Is it 1/10, 1/15..etc ?
> only on mozilla-beta (24.0b10) and mozilla-aurora (09/10) thus far. I am
> testing on mozilla-central (09/10) currently and forcefully trying to
> reproduce this but no luck yet.
>
> I am using my LG Nexus 4 (Android 4.3) and visiting http://talky.io/123,
> after sharing my camera and microphone, I close the tab and I see the
> original issue as reported and as seen in my console output with the
> continuation of preview frame logging. I am hitting this not as frequent as
> before.
>
> See video: http://www.youtube.com/watch?v=jurvzQTD9Pc
>
> This leads me to believe that it might be possible to question wether there
> is there any possible code variance in the associated files between
> mozilla-central and mozilla-beta here that could cause contention or any
> potential mishap in the uplift?
Jesup, any ideas here ?
Reporter | ||
Comment 51•11 years ago
|
||
(In reply to bhavana bajaj [:bajaj] from comment #50)
> (In reply to Aaron Train [:aaronmt] from comment #46)
> > I am still *on occasion* hitting this bug albeit not as frequent as prior
>
> can you help confirm the frequency here ? Is it 1/10, 1/15..etc ?
Well, I hate hard numbers but I just did a 1/8 attempt on Firefox 24 Beta 10.
Assignee | ||
Comment 52•11 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #49)
> Comment on attachment 800797 [details] [diff] [review]
> deadman.patch
>
> The major problem with this patch is the lack of guards for things like
> process sleep. It at least requires that we reset or cancel the timer if the
> computer goes to sleep.
TimerThread recomputes all timeouts after a sleep/suspend: See TimerThread::DoAfterSleep() and it doesn't fire them during a sleep/suspend.
>
> I'm also a little concerned about what happens if the computer clock
> changes. I'm not sure what clock nsITimer uses to schedule its timeouts, but
> if it's a system clock, this will not survive clock resets and it needs to
> be double-checked against the results of mozilla::TimeStamp.
It uses PR_WaitCondVar(), which devolves down to pthread_cond_timedwait() at the core. From the manpage:
For cases when the system clock is advanced discontinuously by an operator,
it is expected that implementations process any timed wait expiring at an
intervening time as if that time had actually occurred.
This would imply a clock change could cause all our timer to fire early (or very late, though that may be self-correcting by the insertion of new events). However, when TimerThread wakes up, it compares TimeStamp::Now() to the calculated TimeStamp for the timer to fire, and if we're not past that time, it will sleep again. So a clock change forward would cause TimerThread to wake up early, but not fire the timer early, assuming TimeStamp::Now() is monotonic.
Reporter | ||
Comment 53•11 years ago
|
||
From #mobile testing out Randell's Try builds for the same issue here on m-b
* https://tbpl.mozilla.org/?tree=Try&rev=686e87689507
** Reproducible. Fail-safe crashes as expected https://crash-stats.mozilla.com/report/index/1a240b13-f689-4828-a9a1-84cfa2130910
* https://tbpl.mozilla.org/?tree=Try&rev=36f3fdb22427
** This works for me. I'm unable to reproduce the issues reported with this Try (m-b) build. This looks to be the silver-bullet.
Assignee | ||
Comment 54•11 years ago
|
||
upstream patch from webrtc.org
Assignee | ||
Comment 55•11 years ago
|
||
Comment on attachment 802453 [details] [diff] [review]
Upstream deadlock fix for issue 2201 (rev 4589)
r+ from me and upstream (fishman)
Attachment #802453 -
Flags: review+
Assignee | ||
Comment 56•11 years ago
|
||
FYI, see comment 52 which I believe answers your comments.
That said, I don't think we need the deadman if we have a verified root-cause fix
Flags: needinfo?(benjamin)
Comment 57•11 years ago
|
||
hrm, ok. In that case I don't think we should use the deadman. Thanks for the investigation.
Flags: needinfo?(benjamin)
Comment 58•11 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #56)
> FYI, see comment 52 which I believe answers your comments.
>
> That said, I don't think we need the deadman if we have a verified
> root-cause fix
Can you please help with the risk-evaluation of this upstream root-cause fix that is planning to be applied in a new beta that may be spun for this ? Also has this had any baketime/testing yet on other channel ? How confident are we with this not introducing new regressions at this point ?
Assignee | ||
Comment 59•11 years ago
|
||
On both as beta->release merge has happened - backout beta patch, then reland original patch on top of bug 899159
https://hg.mozilla.org/releases/mozilla-release/rev/aaa392740eea
https://hg.mozilla.org/releases/mozilla-release/rev/111306ce572f
https://hg.mozilla.org/releases/mozilla-beta/rev/cb5b1d8e602a
https://hg.mozilla.org/releases/mozilla-beta/rev/8c2e54e91eca
Assignee | ||
Comment 60•11 years ago
|
||
Comment 61•11 years ago
|
||
Merged to esr24 as well.
https://hg.mozilla.org/releases/mozilla-esr24/rev/cb5b1d8e602a
https://hg.mozilla.org/releases/mozilla-esr24/rev/8c2e54e91eca
status-firefox-esr24:
--- → fixed
Reporter | ||
Comment 62•11 years ago
|
||
verified fixed on android 24.0 beta 10 build-candidate #2
Comment 63•11 years ago
|
||
Backed out from esr24 because it's mobile-only and shouldn't have been merged there.
https://hg.mozilla.org/releases/mozilla-esr24/rev/c9b5ea714522
Comment 64•11 years ago
|
||
Verified fixed on
Build: Firefox for Android 25 Beta 2
Device: Samsung Galaxy Nexus
OS: Android 4.1.1
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•