Closed
Bug 799419
Opened 12 years ago
Closed 12 years ago
WebRTC Assertion failure: mod != NULL, at pk11slot.c:1766
Categories
(Core :: WebRTC: Networking, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla19
Tracking | Status | |
---|---|---|
firefox15 | --- | unaffected |
firefox16 | --- | unaffected |
firefox17 | --- | unaffected |
firefox18 | - | affected |
firefox19 | --- | fixed |
firefox-esr10 | --- | unaffected |
People
(Reporter: posidron, Assigned: jesup)
References
Details
(Keywords: crash, testcase, Whiteboard: [WebRTC], [blocking-webrtc+] [qa-])
Attachments
(3 files, 4 obsolete files)
594 bytes,
text/html
|
Details | |
1.85 KB,
patch
|
jesup
:
review+
briansmith
:
review+
|
Details | Diff | Splinter Review |
1.59 KB,
patch
|
jesup
:
review+
|
Details | Diff | Splinter Review |
The testcase crashes with an m-c optimized -O1 build.
changeset: 109677:87f3548e638e (tip)
Reporter | ||
Updated•12 years ago
|
status-firefox-esr10:
--- → unaffected
status-firefox15:
--- → unaffected
status-firefox16:
--- → unaffected
status-firefox17:
--- → unaffected
status-firefox18:
--- → affected
tracking-firefox18:
--- → ?
Comment 1•12 years ago
|
||
Does it also crash with {video: true, fake: true}?
Reporter | ||
Comment 2•12 years ago
|
||
Yup
Assignee | ||
Comment 3•12 years ago
|
||
Also crashes with a debug build
The stack backtrace is:
#2 0x00007ffff7ac3e53 in PR_Assert (s=0x7ffff72348f4 "mod != NULL", file=0x7ffff7234818 "pk11slot.c", ln=1766) at ../../../../../nsprpub/pr/src/io/prlog.c:554
#3 0x00007ffff71450e2 in PK11_GetInternalSlot () at pk11slot.c:1766
#4 0x00007ffff43929e9 in mozilla::nr_crypto_nss_random_bytes (buf=0x7fffffff4930 "%K\027\365\377\177", len=4) at nricectx.cpp:93
#5 0x00007ffff437873d in nr_ice_random_string (str=0x7fffffff49e0 "", len=8) at ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_ctx.c:439
#6 0x00007ffff4377af5 in nr_ice_ctx_create (label=0x7fffe8c741a8 "PC:a07cadf4ff7f0000", flags=5, ctxp=0x7fffe71acb38)
at ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_ctx.c:209
#7 0x00007ffff4393715 in mozilla::NrIceCtx::Create (name="PC:a07cadf4ff7f0000", offerer=true, set_interface_priorities=true) at nricectx.cpp:291
#8 0x00007ffff43ed1fd in sipcc::PeerConnectionImpl::Initialize (this=0x7fffd71f1d50, aObserver=0x7fffd8cfed80, aWindow=0x7fffd9764000, aThread=0x7ffff7d3d230)
at ../../../../../media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp:411
Assignee | ||
Comment 4•12 years ago
|
||
Pretty sure this is a "called nss without being nss properly initialized" issue.
Assignee | ||
Updated•12 years ago
|
Whiteboard: [WebRTC] → [WebRTC], [blocking-webrtc+]
Comment 5•12 years ago
|
||
Yes. You will recall that bsmith has been hassling us about making sure that NSS is available in PC? This is it.
Bsmith: can you point us to the incantations we need to catch NSS startup and NSS shutdown?
Comment 6•12 years ago
|
||
To initialize PSM, which is responsible for initializing NSS:
http://mxr.mozilla.org/mozilla-central/source/toolkit/identity/IdentityCryptoService.cpp?rev=4af7c8a2159f#220
For shutdown, you need to observe the "profile-change-net-teardown" event, and tear down all network connections and stop using NSS and PSM functionality. After that point, you should always fail to create any new WebRTC connections when requested.
if you need to use NSS beyond the "profile-change-net-teardown" event, then you need to implement the nsNSSShutdownObject protocol, which is documented in https://mxr.mozilla.org/mozilla/source/security/manager/ssl/src/nsNSSShutDown.h#197
Comment 7•12 years ago
|
||
You should look at how nsHttpHandler deals with profile-change-net-teardown and profile-change-net-restore.
Handling those events in a similar way will also get you support for offline mode "for free".
Also, I see that DataChannel.cpp is already observing "profile-change-net-teardown" but not "profile-change-net-restore".
Updated•12 years ago
|
Updated•12 years ago
|
Priority: -- → P1
Comment 8•12 years ago
|
||
Not tracking this for FF18,considering there is no movement on the bug and as WebRTC lands as an experimental feature in FF18 , not a blocker for release.
Updated•12 years ago
|
Flags: in-testsuite?
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → rjesup
Assignee | ||
Comment 9•12 years ago
|
||
Work-in-process patch - Doesn't actually disable network access in the transportlayerfoo classes. For some reason I haven't figured out, ::Observe() is never getting called (I verified AddObserver was called) on linux debug (where I'm testing) when closing the window, Quitting, or going offline and back.
Assignee | ||
Updated•12 years ago
|
Attachment #682807 -
Flags: feedback?(mcmanus)
Attachment #682807 -
Flags: feedback?(ekr)
Attachment #682807 -
Flags: feedback?(bsmith)
Comment 10•12 years ago
|
||
I'm not sure this is the angle I would take. A few comments:
There are a lot of places in WebRTC that use NSS and/or STS so I think individually disabling all of them is problematic. Instead, what I would do is catch the event in a single place and then destroy all the PCs. This will shut down all use of NSS and the STS without having to have catches in a whole bunch of places. Certainly there's no reason why we couldn't force NSS *startup* in PC.
I'm especially concerned about having TransportFlow do this, because it has no current dependency on either of these services. If we *must* have elements of mtransport monitor this stuff (and again, I think this is a bad idea), then that should be limited to those which use NSS and/or STS.
Comment 11•12 years ago
|
||
Brian,
Can you elaborate on the conditions when "profile-change-net-teardown" fires? Is it both offline and change of profile?
My view is that when the profile changes that's like a differnet user and we should tear down the call. Maybe not so much with network failures.
Regardless, I suggest that for now we just fix the startup problem which we know is causing crashes and is simple and leave the shutdown problem for later.
Comment 12•12 years ago
|
||
See https://developer.mozilla.org/en-US/docs/Observer_Notifications
and https://wiki.mozilla.org/XPCOM_Shutdown
and https://groups.google.com/forum/#!topic/mozilla.dev.platform/cMQ0-ICOW-w
We don't actually support profile switching anymore. so net-teardown should only occur during shutdown and maybe when entering offline mode. ("Work Offline" is in the developer menu, instead of the main menu, so I'm guessing that it is now intended as a "simulate what happens when there is no network connection" feature at this point.)
I agree with Eric that you should have a single place that observes the event and tears down all connections, instead of handling shutdown in multiple places. If you do that in net-teardown, then you will still have NSS alive throughout, as NSS is shut down in profile-change-teardown, which occurs after net-teardown. Just make sure you don't allow any data channels to be created between net-teardown and net-restore. (On shutdown, you will never get a net-restore, but I guess you may get one if somebody turns off Offline Mode after turning it on. Test it out and see.)
Comment 13•12 years ago
|
||
(In reply to Brian Smith (:bsmith) from comment #12)
> ("Work Offline" is in the developer menu, instead of the main menu, so I'm
> guessing that it is now intended as a "simulate what happens when there is
> no network connection" feature at this point.)
Ignore the above sentence.
> Just make sure you don't allow any data channels
s/data channels/PeerConnections/.
Put another way, you must ensure that after net-teardown that you don't use NSS anymore unless/until you get a net-restore.
Assignee | ||
Comment 14•12 years ago
|
||
ok, revised heavily to Close() all peerconnections on teardown or incipient offline status ('work offline'). Note that for this, I've made it synchronous close() because once we return from the observer it will continue on with shutdown.
Assignee | ||
Updated•12 years ago
|
Attachment #682807 -
Attachment is obsolete: true
Attachment #682807 -
Flags: feedback?(mcmanus)
Attachment #682807 -
Flags: feedback?(ekr)
Attachment #682807 -
Flags: feedback?(bsmith)
Assignee | ||
Comment 15•12 years ago
|
||
Comment on attachment 682857 [details] [diff] [review]
Watch network (tear)down events and kill PeerConnections; ensure init of NSS
Note: I realize I need to rev the UUID in IPeerConnection.idl as well
Attachment #682857 -
Flags: review?(ekr)
Attachment #682857 -
Flags: review?(bugs)
Attachment #682857 -
Flags: review?(bsmith)
Comment 16•12 years ago
|
||
Comment 17•12 years ago
|
||
Comment on attachment 682857 [details] [diff] [review]
Watch network (tear)down events and kill PeerConnections; ensure init of NSS
Review of attachment 682857 [details] [diff] [review]:
-----------------------------------------------------------------
I would suggest solving this bug with a patch like the one I attached and then file a separate bug to address the shutdown problem.
::: media/mtransport/transportlayerdtls.cpp
@@ +437,5 @@
> if (verification_mode_ == VERIFY_UNSET) {
> MOZ_MTLOG(PR_LOG_ERROR, "Can't start DTLS without specifying a verification mode");
> return false;
> }
>
This doesn't happen early enough. If you look at the stack trace you will see that the error is happening in PCImpl::Initialize(), long before any DTLS structures are fired up. That's because both ICE and PC itself use NSS.
I believe a better approach would be to allow mtransport to assume NSS is available and instead to set it up at the top of PeerConnectionImpl. I've attached a patch that does that.
::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +964,5 @@
> // Note that no media calls may be made after this point
> // because we have removed the pointer.
> // Forget the reference so that we can transfer it to
> // SelfDestruct().
> RUN_ON_THREAD(mThread, WrapRunnable(mMedia.forget().get(),
I believe that using NS_DISPATCH_SYNC here will result in a regression on bug 811183, which resulted from synchronously closing.
Attachment #682857 -
Flags: review?(ekr) → review-
Assignee | ||
Comment 18•12 years ago
|
||
Comment on attachment 682888 [details] [diff] [review]
Force NSS startup
Good by me. All uses of NSS via transport/ice/etc are hung off PeerConnection, so this is good. If mtransport/etc is ever used independently of PeerConnection, then we'd have to move or copy this down into that code.
Attachment #682888 -
Flags: review+
Assignee | ||
Comment 19•12 years ago
|
||
Comment on attachment 682888 [details] [diff] [review]
Force NSS startup
Should be an easy review
Attachment #682888 -
Flags: review?(bsmith)
Assignee | ||
Comment 20•12 years ago
|
||
Comment on attachment 682857 [details] [diff] [review]
Watch network (tear)down events and kill PeerConnections; ensure init of NSS
Now in the dependent bug
Attachment #682857 -
Attachment is obsolete: true
Attachment #682857 -
Flags: review?(bugs)
Attachment #682857 -
Flags: review?(bsmith)
Comment 21•12 years ago
|
||
Comment on attachment 682888 [details] [diff] [review]
Force NSS startup
Review of attachment 682888 [details] [diff] [review]:
-----------------------------------------------------------------
::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +370,5 @@
> +#ifdef MOZILLA_INTERNAL_API
> + // This code interferes with the C++ unit test startup code.
> + nsCOMPtr<nsISupports> nssDummy = do_GetService("@mozilla.org/psm;1", &res);
> + NS_ENSURE_SUCCESS(res, res);
> +#endif
IMO, the C++ unit tests should be changed to initialize NSS the same way Gecko does, so that this is a non-issue. Not in this patch though.
Use the name "rv" for nsresults or "nrv" when there is another rv variable in the same scope.
Attachment #682888 -
Flags: review?(bsmith) → review+
Assignee | ||
Comment 22•12 years ago
|
||
Target Milestone: --- → mozilla19
Comment 23•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 24•12 years ago
|
||
Jesse, which type of test could be used here or don't we have a chance at all to get this tested? Or even not worth the time?
Updated•12 years ago
|
Flags: needinfo?(jruderman)
Comment 25•12 years ago
|
||
The testcase looks like it would work as a crashtest.
Flags: needinfo?(jruderman)
Updated•12 years ago
|
Whiteboard: [WebRTC], [blocking-webrtc+] → [WebRTC], [blocking-webrtc+] [qa-]
Comment 26•12 years ago
|
||
This is not fixed and the testcase still crashes the browser. In this case I have tested with the recent nightly build Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:20.0) Gecko/20.0 Firefox/20.0 ID:20121125030729
Crash report: bp-1c2c040f-1bcd-467e-bb30-e269c2121126
0 @0x10b6ecb80
1 CMIOUnits CMIOUnits@0x63e4b
2 CMIOUnits CMIOUnits@0x63d95
3 VDC VDC@0x16a73
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 27•12 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #26)
> This is not fixed and the testcase still crashes the browser. In this case I
> have tested with the recent nightly build Mozilla/5.0 (Macintosh; Intel Mac
> OS X 10.7; rv:20.0) Gecko/20.0 Firefox/20.0 ID:20121125030729
>
> Crash report: bp-1c2c040f-1bcd-467e-bb30-e269c2121126
>
> 0 @0x10b6ecb80
> 1 CMIOUnits CMIOUnits@0x63e4b
> 2 CMIOUnits CMIOUnits@0x63d95
> 3 VDC VDC@0x16a73
Hmm...the stack doesn't tell us much. Is this the same bug with the same stack or a different bug? I recall there being some other bug open that was tracking a crash that would happen after this patch was applied as a followup.
Comment 28•12 years ago
|
||
Make probably more sense to file it as a new bug. I will do it right away.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 30•12 years ago
|
||
Updated•12 years ago
|
Attachment #688161 -
Attachment is obsolete: true
Comment 31•12 years ago
|
||
Updated•12 years ago
|
Attachment #688162 -
Attachment is obsolete: true
Comment 32•12 years ago
|
||
Comment 33•12 years ago
|
||
Comment on attachment 688164 [details] [diff] [review]
Crashtest v3
Here's another crashtest that basically ports the test case into a crashtest with fake video instead.
Attachment #688164 -
Flags: review?(rjesup)
Assignee | ||
Updated•12 years ago
|
Attachment #688164 -
Flags: review?(rjesup) → review+
Updated•12 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 34•12 years ago
|
||
Comment on attachment 688164 [details] [diff] [review]
Crashtest v3
https://hg.mozilla.org/integration/mozilla-inbound/rev/2d2e54c90ae0
Assignee | ||
Updated•12 years ago
|
Keywords: checkin-needed
Comment 35•12 years ago
|
||
Updated•12 years ago
|
Flags: in-testsuite? → in-testsuite+
Updated•12 years ago
|
status-firefox19:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•