Closed
Bug 856848
Opened 12 years ago
Closed 12 years ago
Move PCMedia ctor to after thread acquisition in PCImpl
Categories
(Core :: WebRTC: Signaling, defect)
Tracking
()
VERIFIED
FIXED
mozilla23
People
(Reporter: ekr, Assigned: abr)
References
Details
(Keywords: crash, testcase, Whiteboard: [WebRTC][blocking-webrtc+])
Attachments
(3 files, 1 obsolete file)
6.16 KB,
text/plain
|
Details | |
9.78 KB,
patch
|
abr
:
review+
abr
:
review+
lsblakk
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
9.63 KB,
patch
|
ehugg
:
review+
|
Details | Diff | Splinter Review |
No description provided.
Reporter | ||
Comment 1•12 years ago
|
||
* * *
Bug 856433. Fix thread crash--fix
Reporter | ||
Comment 2•12 years ago
|
||
Comment on attachment 732100 [details] [diff] [review]
Fix thread crash
Review of attachment 732100 [details] [diff] [review]:
-----------------------------------------------------------------
carrying forward r+ from ehugg when this was bug 856433
Attachment #732100 -
Flags: review+
Comment 3•12 years ago
|
||
Blocking+ as bug 856433 blocked as well
Whiteboard: [WebRTC][blocking-webrtc+]
Updated•12 years ago
|
Assignee: nobody → ekr
Priority: -- → P1
Updated•12 years ago
|
Whiteboard: [WebRTC][blocking-webrtc+] → [WebRTC][blocking-webrtc+][webrtc-uplift]
Comment 4•12 years ago
|
||
Can we land the patch on this bug when we land TURN support (Bug 786235)?
Flags: needinfo?(ekr)
Assignee | ||
Updated•12 years ago
|
Assignee: ekr → adam
Assignee | ||
Comment 5•12 years ago
|
||
The patch applies cleanly, albeit with offsets, and passes unit, mochi, and crash tests. However, when running with the test case from bug 860143, I have once (and only once, out of about a dozen attempts) had it assert when attempting to dispatch to the STS thread.
Notably, in examining the STS thread, you'll see:
mThread = {
mRawPtr = 0x0
},
Which I take to mean that the STS thread is gone at this point.
I think the solution here is to make RUN_ON_THREAD resilient to NS_ERROR_NOT_INITIALIZED.
Assignee | ||
Comment 6•12 years ago
|
||
It's probably worth noting that, the one time it did trigger, I was doing repeated reloads of the test case, and then (rather quickly after a reload) shut down the browser with Command-Q. I suspect that's relevant.
Assignee | ||
Comment 7•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #732100 -
Attachment is obsolete: true
Assignee | ||
Comment 8•12 years ago
|
||
Comment on attachment 735967 [details] [diff] [review]
Fix thread crash
ekr -- All I need a review on is the change in runnable_utils.h; the rest is unchanged since ehugg's review. See Comment 5 for rationale.
Attachment #735967 -
Flags: review?(ekr)
Assignee | ||
Comment 9•12 years ago
|
||
Comment on attachment 735967 [details] [diff] [review]
Fix thread crash
Carrying forward ehugg's r+
Attachment #735967 -
Flags: review+
Assignee | ||
Updated•12 years ago
|
Flags: needinfo?(ekr)
Assignee | ||
Comment 10•12 years ago
|
||
Excerpts from an IRC discussion on the RUN_ON_THREAD changes:
[10:49am] ekr: I'm not sold that this patch is right
[10:49am] abr: The change to RUN_ON_THREAD?
[10:49am] ekr: So, we want these things to happen.
[10:50am] ekr: and it's not clear to me it's cool to have them just fail
...
[10:50am] abr: Okay. And if the thread is gone, should we just run them on the current thread?
[10:50am] ekr: I am not sure.
[10:50am] ekr: it's a real problem
[10:50am] ekr: The whole shutdown thing is a nightmare
[10:50am] abr: In practice, the only way the threads we're currently using this for (main and sts) could be gone is if (1) the system is shutting down, or (2) the system is messed up beyond description.
[10:51am] ekr: yeah
[10:51am] ekr: So, is it better to have a silent fail or an assert?
[10:51am] ekr: I mean, I bet we are assuming that these succeed
[10:51am] abr: In (1), failing to run this stuff might leak resources. I guess that's bad from an orangefactor perspective.
[10:51am] ekr: so it could easily be creating some sort of "can't happen"
[10:51am] abr: In (2), of course, it doesn't matter.
[10:52am] ekr: But say we have a dispatch for an operation that's supposed to happen and we depend on that
[10:52am] abr: I would think a silent fail? The ensure_success will emit a logging message.
[10:52am] ekr: it might
[10:52am] ekr: i mean, it will emit a logging message
[10:52am] ekr: But suppose it creates some later crash
[10:53am] abr: Are you assuming (1) or (2) above?
[10:53am] ekr: 1
[10:53am] abr: I'm thinking through what that kind of crash might look like.
[10:53am] ekr: not sure...
...
[10:59am] abr: Okay. So, from a practical perspective, I see a handful of choices. (1) Land this patch as-is, which might let some resources hit the floor in a shutdown situation; (2) Remove the RUN_ON_THREAD changes, which may (and probably will) lead to intermittent oranges; (3) Change the patch to run the code on current thread if the target thread is gone (which will require modification to several thread-check asserts); or (4) Try to ferret out and fix the root cause of the STS thread going away before the PC is torn down (which, now that I think about it, is going to involve the relative shutdown orders of GC/CC, Main, and STS).
[11:01am] abr: Actually, that's ineresting. The real sequencing problem here is that STS is torn down before GC/CC finishes its job.
...
[11:06am] abr: Yeah, none of those 4 seem ideal.
[11:06am] ekr: I guess your patch is probably the best choice, but I really don't love it
[11:08am] abr: Understood. I'll add something to Bug 829591 describing this situation, and maybe when we fix that issue (if doing so is even possible), we can take this into account and avoid it happening at all.
Assignee | ||
Comment 11•12 years ago
|
||
Comment on attachment 735967 [details] [diff] [review]
Fix thread crash
r+ from ekr via IRC
Attachment #735967 -
Flags: review?(ekr) → review+
Assignee | ||
Comment 12•12 years ago
|
||
Comment 13•12 years ago
|
||
Assignee | ||
Comment 14•12 years ago
|
||
re-trying after replacing ASSERT_TRUE with EXPECT_TRUE in constructor:
https://tbpl.mozilla.org/?tree=Try&rev=47b937880eca
Comment 15•12 years ago
|
||
Targeted to ship for Fx22, tracking to get on rel mgmt radar.
tracking-firefox22:
--- → ?
Assignee | ||
Comment 16•12 years ago
|
||
That push worked just fine, so here we go again:
https://hg.mozilla.org/integration/mozilla-inbound/rev/70f7ccdb5a98
Updated•12 years ago
|
Flags: in-testsuite?
Updated•12 years ago
|
Assignee | ||
Comment 18•12 years ago
|
||
> Group: qualcomm-confidential
Um... What?
Assignee | ||
Updated•12 years ago
|
Flags: needinfo?(akeybl)
Comment 19•12 years ago
|
||
That was probably unintended. Removing.
Group: qualcomm-confidential
Flags: needinfo?(akeybl)
Comment 20•12 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #19)
> That was probably unintended. Removing.
Yep, unintentional thanks.
Updated•12 years ago
|
Comment 21•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite? → in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Updated•12 years ago
|
Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift] → [WebRTC][blocking-webrtc+][webrtc-uplift][qa-]
Comment 22•12 years ago
|
||
Was verified and confirmed that this patch fixes the crash in the dupe.
Status: RESOLVED → VERIFIED
Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift][qa-] → [WebRTC][blocking-webrtc+][webrtc-uplift]
Comment 23•12 years ago
|
||
Comment on attachment 735967 [details] [diff] [review]
Fix thread crash
[Approval Request Comment]
Bug caused by (feature/regressing bug #): N/A
User impact if declined: intermittent (null de-ref) crashes and debug MOZ_ASSERT crashes
Testing completed (on m-c, etc.): On m-c. ALso used to verify that several other crash bugs reported were actually duplicates of this one. (Not all are marked as dups of this, as some were other bugs that triggered the RUN_ON_THREAD MOZ_ASSERT() as a side-effect.
Risk to taking this patch (and alternatives if risky): Low; moves initialization of some important thread pointers to occur before they can used for Dispatch()es
String or IDL/UUID changes made by this patch: none
Attachment #735967 -
Flags: approval-mozilla-aurora?
Comment 24•12 years ago
|
||
Triage Comment:
Since we've left the other untracked TURN bugs un-approved on this round, we'll hold off here just for now, it can be approved if/when bug 786235 and bug 855769 are landed.
Reporter | ||
Comment 25•12 years ago
|
||
This is not a TURN bug, it's just behind TURN in the commit history.
Assignee | ||
Comment 26•12 years ago
|
||
Eric is 100% correct. This problem can and has shown up in a variety of non-TURN scenarios (most notably, if you simply replace "turn" with "stun," the problem remains). Unless we uplift this patch prior to shipping, we're going to end up with a situation that makes it very easy to crash the browser from javascript, both accidentally and maliciously.
Comment 27•12 years ago
|
||
Comment on attachment 735967 [details] [diff] [review]
Fix thread crash
Thanks for clarifying, we can take this uplift then since it will not impact whether we enable TURN support in FF22.
Attachment #735967 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 28•12 years ago
|
||
status-firefox23:
--- → fixed
Comment 29•12 years ago
|
||
Up on Aurora, removing webrtc-uplift
Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift] → [WebRTC][blocking-webrtc+]
Comment 30•12 years ago
|
||
Backed out for bustage. Going to leave WebRTC uplifts for the pros from now on...
https://hg.mozilla.org/releases/mozilla-aurora/rev/ed6f79077395
https://tbpl.mozilla.org/php/getParsedLog.php?id=21929064&tree=Mozilla-Aurora
Updated•12 years ago
|
Priority: P1 → --
Whiteboard: [WebRTC][blocking-webrtc+] → [WebRTC][blocking-webrtc+][webrtc-uplift]
Assignee | ||
Comment 31•12 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #30)
> Backed out for bustage. Going to leave WebRTC uplifts for the pros from now
> on...
Even though this isn't a TURN bug, it does need to land on top of the TURN patches (or be ported to Aurora without those patches, which I would not recommend).
Comment 32•12 years ago
|
||
Comment 33•12 years ago
|
||
Comment on attachment 739874 [details] [diff] [review]
Fix thread crash (patch ported to Aurora)
This is the m-c patch ported to Aurora; the only non-mechanical merge change is changing addStunServer to addServer.
With the signaling_unittest patch from bug 856644 (which we should also uplift), all signaling unittests pass.
spraying the r? request because I'm not sure who's available.
Attachment #739874 -
Flags: review?(ethanhugg)
Attachment #739874 -
Flags: review?(ekr)
Attachment #739874 -
Flags: review?(adam)
Comment 34•12 years ago
|
||
Comment on attachment 739874 [details] [diff] [review]
Fix thread crash (patch ported to Aurora)
Review of attachment 739874 [details] [diff] [review]:
-----------------------------------------------------------------
r+ based on comparing it to previously reviewed M-C version.
Attachment #739874 -
Flags: review?(ethanhugg) → review+
Comment 35•12 years ago
|
||
Comment on attachment 739874 [details] [diff] [review]
Fix thread crash (patch ported to Aurora)
Carrying forward a=lsblakk on the un-bitrot for Aurora though I can't mark it.
Attachment #739874 -
Flags: review?(ekr)
Attachment #739874 -
Flags: review?(adam)
Comment 36•12 years ago
|
||
Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift] → [WebRTC][blocking-webrtc+]
Comment 37•12 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #22)
> Was verified and confirmed that this patch fixes the crash in the dupe.
Jason, is it safe to mark the status flags verified?
Comment 38•12 years ago
|
||
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #37)
> (In reply to Jason Smith [:jsmith] from comment #22)
> > Was verified and confirmed that this patch fixes the crash in the dupe.
>
> Jason, is it safe to mark the status flags verified?
For 23, yeah.
You need to log in
before you can comment on or make changes to this bug.
Description
•