Closed Bug 856848 Opened 12 years ago Closed 12 years ago

Move PCMedia ctor to after thread acquisition in PCImpl

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla23

Tracking Flags:

Tracking

Status

firefox22

fixed

firefox23

---

verified

People

(Reporter: ekr, Assigned: abr)

References

Details

(Keywords: crash, testcase, Whiteboard: [WebRTC][blocking-webrtc+])

Attachments

(3 files, 1 obsolete file)

Fix thread crash 12 years ago Eric Rescorla (:ekr) 8.00 KB, patch	ekr : review+	Details \| Diff \| Splinter Review
Stack Trace (and some diagnostics) for Intermittent Assert 12 years ago Adam Roach [:abr] 6.16 KB, text/plain		Details
Fix thread crash 12 years ago Adam Roach [:abr] 9.78 KB, patch	abr : review+ abr : review+ lsblakk : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
Fix thread crash (patch ported to Aurora) 12 years ago Randell Jesup [:jesup] (needinfo me) 9.63 KB, patch	ehugg : review+	Details \| Diff \| Splinter Review

Eric Rescorla (:ekr)

Reporter

Description

•

12 years ago

No description provided.

Eric Rescorla (:ekr)

Reporter

Comment 1

•

12 years ago

Attached patch Fix thread crash (obsolete) — Details — Splinter Review

* * * Bug 856433. Fix thread crash--fix

Eric Rescorla (:ekr)

Reporter

Comment 2

•

12 years ago

Comment on attachment 732100 [details] [diff] [review] Fix thread crash Review of attachment 732100 [details] [diff] [review]: ----------------------------------------------------------------- carrying forward r+ from ehugg when this was bug 856433

Attachment #732100 - Flags: review+

Jason Smith [:jsmith]

Comment 3

•

12 years ago

Blocking+ as bug 856433 blocked as well

Whiteboard: [WebRTC][blocking-webrtc+]

Maire Reavy [:mreavy]

Updated

•

12 years ago

Assignee: nobody → ekr

Priority: -- → P1

Maire Reavy [:mreavy]

Updated

•

12 years ago

Whiteboard: [WebRTC][blocking-webrtc+] → [WebRTC][blocking-webrtc+][webrtc-uplift]

Maire Reavy [:mreavy]

Comment 4

•

12 years ago

Can we land the patch on this bug when we land TURN support (Bug 786235)?

Flags: needinfo?(ekr)

Adam Roach [:abr]

Assignee

Updated

•

12 years ago

Blocks: 860143

Adam Roach [:abr]

Assignee

Updated

•

12 years ago

Assignee: ekr → adam

Adam Roach [:abr]

Assignee

Comment 5

•

12 years ago

Attached file Stack Trace (and some diagnostics) for Intermittent Assert — Details

The patch applies cleanly, albeit with offsets, and passes unit, mochi, and crash tests. However, when running with the test case from bug 860143, I have once (and only once, out of about a dozen attempts) had it assert when attempting to dispatch to the STS thread. Notably, in examining the STS thread, you'll see: mThread = { mRawPtr = 0x0 }, Which I take to mean that the STS thread is gone at this point. I think the solution here is to make RUN_ON_THREAD resilient to NS_ERROR_NOT_INITIALIZED.

Adam Roach [:abr]

Assignee

Comment 6

•

12 years ago

It's probably worth noting that, the one time it did trigger, I was doing repeated reloads of the test case, and then (rather quickly after a reload) shut down the browser with Command-Q. I suspect that's relevant.

Adam Roach [:abr]

Assignee

Comment 7

•

12 years ago

Attached patch Fix thread crash — Details — Splinter Review

Adam Roach [:abr]

Assignee

Updated

•

12 years ago

Attachment #732100 - Attachment is obsolete: true

Adam Roach [:abr]

Assignee

Comment 8

•

12 years ago

Comment on attachment 735967 [details] [diff] [review] Fix thread crash ekr -- All I need a review on is the change in runnable_utils.h; the rest is unchanged since ehugg's review. See Comment 5 for rationale.

Attachment #735967 - Flags: review?(ekr)

Adam Roach [:abr]

Assignee

Comment 9

•

12 years ago

Comment on attachment 735967 [details] [diff] [review] Fix thread crash Carrying forward ehugg's r+

Attachment #735967 - Flags: review+

Adam Roach [:abr]

Assignee

Updated

•

12 years ago

Flags: needinfo?(ekr)

Adam Roach [:abr]

Assignee

Comment 10

•

12 years ago

Excerpts from an IRC discussion on the RUN_ON_THREAD changes: [10:49am] ekr: I'm not sold that this patch is right [10:49am] abr: The change to RUN_ON_THREAD? [10:49am] ekr: So, we want these things to happen. [10:50am] ekr: and it's not clear to me it's cool to have them just fail ... [10:50am] abr: Okay. And if the thread is gone, should we just run them on the current thread? [10:50am] ekr: I am not sure. [10:50am] ekr: it's a real problem [10:50am] ekr: The whole shutdown thing is a nightmare [10:50am] abr: In practice, the only way the threads we're currently using this for (main and sts) could be gone is if (1) the system is shutting down, or (2) the system is messed up beyond description. [10:51am] ekr: yeah [10:51am] ekr: So, is it better to have a silent fail or an assert? [10:51am] ekr: I mean, I bet we are assuming that these succeed [10:51am] abr: In (1), failing to run this stuff might leak resources. I guess that's bad from an orangefactor perspective. [10:51am] ekr: so it could easily be creating some sort of "can't happen" [10:51am] abr: In (2), of course, it doesn't matter. [10:52am] ekr: But say we have a dispatch for an operation that's supposed to happen and we depend on that [10:52am] abr: I would think a silent fail? The ensure_success will emit a logging message. [10:52am] ekr: it might [10:52am] ekr: i mean, it will emit a logging message [10:52am] ekr: But suppose it creates some later crash [10:53am] abr: Are you assuming (1) or (2) above? [10:53am] ekr: 1 [10:53am] abr: I'm thinking through what that kind of crash might look like. [10:53am] ekr: not sure... ... [10:59am] abr: Okay. So, from a practical perspective, I see a handful of choices. (1) Land this patch as-is, which might let some resources hit the floor in a shutdown situation; (2) Remove the RUN_ON_THREAD changes, which may (and probably will) lead to intermittent oranges; (3) Change the patch to run the code on current thread if the target thread is gone (which will require modification to several thread-check asserts); or (4) Try to ferret out and fix the root cause of the STS thread going away before the PC is torn down (which, now that I think about it, is going to involve the relative shutdown orders of GC/CC, Main, and STS). [11:01am] abr: Actually, that's ineresting. The real sequencing problem here is that STS is torn down before GC/CC finishes its job. ... [11:06am] abr: Yeah, none of those 4 seem ideal. [11:06am] ekr: I guess your patch is probably the best choice, but I really don't love it [11:08am] abr: Understood. I'll add something to Bug 829591 describing this situation, and maybe when we fix that issue (if doing so is even possible), we can take this into account and avoid it happening at all.

Adam Roach [:abr]

Assignee

Comment 11

•

12 years ago

Comment on attachment 735967 [details] [diff] [review] Fix thread crash r+ from ekr via IRC

Attachment #735967 - Flags: review?(ekr) → review+

Adam Roach [:abr]

Assignee

Comment 12

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/f556245d6da1

Ryan VanderMeulen [:RyanVM]

Comment 13

•

12 years ago

Backed out for bustage. https://hg.mozilla.org/integration/mozilla-inbound/rev/f23121d10690 https://tbpl.mozilla.org/php/getParsedLog.php?id=21693422&tree=Mozilla-Inbound

Adam Roach [:abr]

Assignee

Comment 14

•

12 years ago

re-trying after replacing ASSERT_TRUE with EXPECT_TRUE in constructor: https://tbpl.mozilla.org/?tree=Try&rev=47b937880eca

Jason Smith [:jsmith]

Comment 15

•

12 years ago

Targeted to ship for Fx22, tracking to get on rel mgmt radar.

tracking-firefox22: --- → ?

Adam Roach [:abr]

Assignee

Comment 16

•

12 years ago

That push worked just fine, so here we go again: https://hg.mozilla.org/integration/mozilla-inbound/rev/70f7ccdb5a98

Jason Smith [:jsmith]

Updated

•

12 years ago

Flags: in-testsuite?

Alex Keybl [:akeybl]

Updated

•

12 years ago

Group: qualcomm-confidential

status-firefox22: --- → affected

tracking-firefox22: ? → +

Adam Roach [:abr]

Assignee

Comment 18

•

12 years ago

> Group: qualcomm-confidential Um... What?

Adam Roach [:abr]

Assignee

Updated

•

12 years ago

Flags: needinfo?(akeybl)

Jason Smith [:jsmith]

Comment 19

•

12 years ago

That was probably unintended. Removing.

Group: qualcomm-confidential

Flags: needinfo?(akeybl)

Alex Keybl [:akeybl]

Comment 20

•

12 years ago

(In reply to Jason Smith [:jsmith] from comment #19) > That was probably unintended. Removing. Yep, unintentional thanks.

Jason Smith [:jsmith]

Updated

•

12 years ago

Severity: normal → critical

Keywords: crash, testcase

Ryan VanderMeulen [:RyanVM]

Comment 21

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/70f7ccdb5a98

Status: NEW → RESOLVED

Closed: 12 years ago

Flags: in-testsuite? → in-testsuite+

Resolution: --- → FIXED

Target Milestone: --- → mozilla23

Jason Smith [:jsmith]

Updated

•

12 years ago

Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift] → [WebRTC][blocking-webrtc+][webrtc-uplift][qa-]

Jason Smith [:jsmith]

Comment 22

•

12 years ago

Was verified and confirmed that this patch fixes the crash in the dupe.

Status: RESOLVED → VERIFIED

Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift][qa-] → [WebRTC][blocking-webrtc+][webrtc-uplift]

Randell Jesup [:jesup] (needinfo me)

Updated

•

12 years ago

Blocks: 861190

Randell Jesup [:jesup] (needinfo me)

Comment 23

•

12 years ago

Comment on attachment 735967 [details] [diff] [review] Fix thread crash [Approval Request Comment] Bug caused by (feature/regressing bug #): N/A User impact if declined: intermittent (null de-ref) crashes and debug MOZ_ASSERT crashes Testing completed (on m-c, etc.): On m-c. ALso used to verify that several other crash bugs reported were actually duplicates of this one. (Not all are marked as dups of this, as some were other bugs that triggered the RUN_ON_THREAD MOZ_ASSERT() as a side-effect. Risk to taking this patch (and alternatives if risky): Low; moves initialization of some important thread pointers to occur before they can used for Dispatch()es String or IDL/UUID changes made by this patch: none

Attachment #735967 - Flags: approval-mozilla-aurora?

Lukas Blakk [:lsblakk] use ?needinfo

Comment 24

•

12 years ago

Triage Comment: Since we've left the other untracked TURN bugs un-approved on this round, we'll hold off here just for now, it can be approved if/when bug 786235 and bug 855769 are landed.

Eric Rescorla (:ekr)

Reporter

Comment 25

•

12 years ago

This is not a TURN bug, it's just behind TURN in the commit history.

Adam Roach [:abr]

Assignee

Comment 26

•

12 years ago

Eric is 100% correct. This problem can and has shown up in a variety of non-TURN scenarios (most notably, if you simply replace "turn" with "stun," the problem remains). Unless we uplift this patch prior to shipping, we're going to end up with a situation that makes it very easy to crash the browser from javascript, both accidentally and maliciously.

Lukas Blakk [:lsblakk] use ?needinfo

Comment 27

•

12 years ago

Comment on attachment 735967 [details] [diff] [review] Fix thread crash Thanks for clarifying, we can take this uplift then since it will not impact whether we enable TURN support in FF22.

Attachment #735967 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Ryan VanderMeulen [:RyanVM]

Comment 28

•

12 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/673b1f899184

status-firefox22: affected → fixed

status-firefox23: --- → fixed

Maire Reavy [:mreavy]

Comment 29

•

12 years ago

Up on Aurora, removing webrtc-uplift

Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift] → [WebRTC][blocking-webrtc+]

Ryan VanderMeulen [:RyanVM]

Comment 30

•

12 years ago

Backed out for bustage. Going to leave WebRTC uplifts for the pros from now on... https://hg.mozilla.org/releases/mozilla-aurora/rev/ed6f79077395 https://tbpl.mozilla.org/php/getParsedLog.php?id=21929064&tree=Mozilla-Aurora

status-firefox22: fixed → affected

Maire Reavy [:mreavy]

Updated

•

12 years ago

Priority: P1 → --

Whiteboard: [WebRTC][blocking-webrtc+] → [WebRTC][blocking-webrtc+][webrtc-uplift]

Adam Roach [:abr]

Assignee

Comment 31

•

12 years ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #30) > Backed out for bustage. Going to leave WebRTC uplifts for the pros from now > on... Even though this isn't a TURN bug, it does need to land on top of the TURN patches (or be ported to Aurora without those patches, which I would not recommend).

Randell Jesup [:jesup] (needinfo me)

Comment 32

•

12 years ago

Attached patch Fix thread crash (patch ported to Aurora) — Details — Splinter Review

Randell Jesup [:jesup] (needinfo me)

Comment 33

•

12 years ago

Comment on attachment 739874 [details] [diff] [review] Fix thread crash (patch ported to Aurora) This is the m-c patch ported to Aurora; the only non-mechanical merge change is changing addStunServer to addServer. With the signaling_unittest patch from bug 856644 (which we should also uplift), all signaling unittests pass. spraying the r? request because I'm not sure who's available.

Attachment #739874 - Flags: review?(ethanhugg)

Attachment #739874 - Flags: review?(ekr)

Attachment #739874 - Flags: review?(adam)

Ethan Hugg [:ehugg]

Comment 34

•

12 years ago

Comment on attachment 739874 [details] [diff] [review] Fix thread crash (patch ported to Aurora) Review of attachment 739874 [details] [diff] [review]: ----------------------------------------------------------------- r+ based on comparing it to previously reviewed M-C version.

Attachment #739874 - Flags: review?(ethanhugg) → review+

Randell Jesup [:jesup] (needinfo me)

Comment 35

•

12 years ago

Comment on attachment 739874 [details] [diff] [review] Fix thread crash (patch ported to Aurora) Carrying forward a=lsblakk on the un-bitrot for Aurora though I can't mark it.

Attachment #739874 - Flags: review?(ekr)

Attachment #739874 - Flags: review?(adam)

Randell Jesup [:jesup] (needinfo me)

Comment 36

•

12 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/a582bade4e53

status-firefox22: affected → fixed

Whiteboard: [WebRTC][blocking-webrtc+][webrtc-uplift] → [WebRTC][blocking-webrtc+]

u279076

Comment 37

•

12 years ago

(In reply to Jason Smith [:jsmith] from comment #22) > Was verified and confirmed that this patch fixes the crash in the dupe. Jason, is it safe to mark the status flags verified?

Jason Smith [:jsmith]

Comment 38

•

12 years ago

(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #37) > (In reply to Jason Smith [:jsmith] from comment #22) > > Was verified and confirmed that this patch fixes the crash in the dupe. > > Jason, is it safe to mark the status flags verified? For 23, yeah.

status-firefox23: fixed → verified

You need to log in before you can comment on or make changes to this bug.