Closed Bug 986762 Opened 6 years ago Closed 6 years ago

Intermittent test_dataChannel_basicAudioVideo.html | application crashed [@ mozilla::DataChannelConnection::~DataChannelConnection()]

Categories

(Core :: WebRTC, defect, critical)

Platform: x86, Linux
Priority: Not set

Tracking

RESOLVED FIXED
mozilla31
Tracking Status
firefox28 --- wontfix
firefox29 + fixed
firefox30 --- fixed
firefox31 --- fixed
firefox-esr24 --- unaffected
b2g-v1.3 ? affected
b2g-v1.3T --- affected
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: philor, Assigned: jesup)

References

Details

(Keywords: crash, intermittent-failure, Whiteboard: [webrtc-uplift])

Attachments

(2 files, 3 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=36531919&tree=Mozilla-Inbound
Ubuntu VM 12.04 mozilla-inbound opt test mochitest-3 on 2014-03-21 17:14:50 PDT for push aa1886be6aec
slave: tst-linux32-spot-748

17:20:16     INFO -  -1531647168[9fa91e40]: [GSM Task|fsm_sm] fsm.c:157: SIPCC-GSM_DBG_PTR: FSM 10  : fsm_get_fcb_by_call_id_and_type    : fcb= (nil)
17:20:16     INFO -  -1219873024[b721a240]: [main|PeerConnectionImpl] PeerConnectionImpl.cpp:1621: CloseInt: Destroying DataChannelConnection 0xa16e9f00 for c6034164134ebc06
17:20:16     INFO -  -1219873024[b721a240]: Destroying DataChannelConnection a16e9f00
17:20:16     INFO -  -1219873024[b721a240]: Closing all channels (connection a16e9f00)
17:20:16     INFO -  -1219873024[b721a240]: Deregistered a16e9f00 from the SCTP stack.
17:20:18  WARNING -  TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_dataChannel_basicAudioVideo.html | application terminated with exit code 11
17:20:18     INFO -  INFO | runtests.py | Application ran for: 0:02:47.388968
17:20:18     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpK5yOvZpidlog
17:20:18     INFO -  ==> process 2381 launched child process 2428
17:20:18     INFO -  ==> process 2381 launched child process 2454
17:20:18     INFO -  ==> process 2381 launched child process 2477
17:20:18     INFO -  ==> process 2381 launched child process 2493
17:20:18     INFO -  ==> process 2381 launched child process 2553
17:20:18     INFO -  INFO | zombiecheck | Checking for orphan process with PID: 2428
17:20:18     INFO -  INFO | zombiecheck | Checking for orphan process with PID: 2454
17:20:18     INFO -  INFO | zombiecheck | Checking for orphan process with PID: 2477
17:20:18     INFO -  INFO | zombiecheck | Checking for orphan process with PID: 2493
17:20:18     INFO -  INFO | zombiecheck | Checking for orphan process with PID: 2553
17:20:18     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux/1395444434/firefox-31.0a1.en-US.linux-i686.crashreporter-symbols.zip
17:20:46  WARNING -  PROCESS-CRASH | /tests/dom/media/tests/mochitest/test_dataChannel_basicAudioVideo.html | application crashed [@ mozilla::DataChannelConnection::~DataChannelConnection()]
17:20:46     INFO -  Crash dump filename: /tmp/tmpZEpRIk/minidumps/6b5469cc-979b-089b-3c3a3b60-7641dad7.dmp
17:20:46     INFO -  Operating system: Linux
17:20:46     INFO -                    0.0.0 Linux 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686
17:20:46     INFO -  CPU: x86
17:20:46     INFO -       GenuineIntel family 6 model 26 stepping 5
17:20:46     INFO -       1 CPU
17:20:46     INFO -  Crash reason:  SIGSEGV
17:20:46     INFO -  Crash address: 0x0
17:20:46     INFO -  Thread 63 (crashed)
17:20:46     INFO -   0  libxul.so!mozilla::DataChannelConnection::~DataChannelConnection() [DataChannel.cpp:aa1886be6aec : 223 + 0x0]
17:20:46     INFO -      eip = 0xb3cf3de1   esp = 0x9209b010   ebp = 0x9209b038   ebx = 0xb6e885a8
17:20:46     INFO -      esi = 0xa0e47700   edi = 0x9209b0f8   eax = 0x99844f9c   ecx = 0x00000000
17:20:46     INFO -      edx = 0x99844f9c   efl = 0x00010297
17:20:46     INFO -      Found by: given as instruction pointer in context
17:20:46     INFO -   1  libxul.so!mozilla::DataChannelConnection::~DataChannelConnection() [DataChannel.cpp:aa1886be6aec : 240 + 0x8]
17:20:46     INFO -      eip = 0xb3cf3f78   esp = 0x9209b040   ebp = 0x9209b058   ebx = 0xb6e885a8
17:20:46     INFO -      esi = 0xa0e47700   edi = 0x9209b0f8
17:20:46     INFO -      Found by: call frame info
17:20:46     INFO -   2  libxul.so!mozilla::DataChannelConnection::Release() [DataChannel.cpp:aa1886be6aec : 295 + 0xb]
17:20:46     INFO -      eip = 0xb3cef080   esp = 0x9209b060   ebp = 0x9209b088   ebx = 0xb6e885a8
17:20:46     INFO -      esi = 0xa0e47700   edi = 0x9209b0f8
17:20:46     INFO -      Found by: call frame info
17:20:46     INFO -   3  libxul.so!nsRefPtr<mozilla::DataChannelConnection>::~nsRefPtr() [nsAutoPtr.h:aa1886be6aec : 894 + 0x8]
17:20:46     INFO -      eip = 0xb3cef4ae   esp = 0x9209b090   ebp = 0x9209b0a8   ebx = 0xb6e885a8
17:20:46     INFO -      esi = 0xa1766850   edi = 0x9209b0f8
17:20:46     INFO -      Found by: call frame info
17:20:46     INFO -   4  libxul.so!mozilla::DataChannelConnection::ReadBlob(already_AddRefed<mozilla::DataChannelConnection>, unsigned short, nsIInputStream*) [DataChannel.cpp:aa1886be6aec : 2355 + 0xa]
17:20:46     INFO -      eip = 0xb3cefcb3   esp = 0x9209b0b0   ebp = 0x9209b128   ebx = 0xb6e885a8
17:20:46     INFO -      esi = 0xa1766850   edi = 0x9209b0f8
17:20:46     INFO -      Found by: call frame info
17:20:46     INFO -   5  libxul.so!mozilla::ReadBlobRunnable::Run() [DataChannel.cpp:aa1886be6aec : 2289 + 0x4]
17:20:46     INFO -      eip = 0xb3cefd04   esp = 0x9209b130   ebp = 0x9209b158   ebx = 0xb6e885a8
17:20:46     INFO -      esi = 0xaac368c0   edi = 0x00000000
17:20:46     INFO -      Found by: call frame info
Attachment #8395299 - Attachment is obsolete: true
Also fixes a missing check of the return value of Available(), which can fail if the file is read-protected, for example.
Attachment #8395322 - Attachment is obsolete: true
Attachment #8395325 - Flags: review?(khuey)
This is a safe MOZ_CRASH due to using ASSERT_WEBRTC() for critical thread-safety asserts in DataChannels, which asserts even in opt/release builds.
Note: this failure is very hard to hit (either 1 or 2 failures since Feb 1) since the blob read has to fail *and* all other references to the DataChannelConnection go away while this runnable is in flight.
I would suggest not fixing this for 1.3, given how hard it is to hit (almost impossible) and that it fails with a safe crash.
Comment on attachment 8395325 [details] [diff] [review]
don't release DataChannelConnection on transient thread on readblob failure

>+    // We must release DataChannelConnection on MainThread to avoid issues (bug 876167)
>+    DataChannelConnection *connection;
>+    nsRefPtr<DataChannelConnection> forgettable(aThis);
>+    forgettable.forget(&connection);
>+    NS_ProxyRelease(mainThread, connection);
NS_ProxyRelease(mainThread, aThis.take());
should work.
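
For readers unfamiliar with the pattern under review, here is a minimal standalone sketch of proxying a release back to the owning thread. It is not the patch and uses no Gecko APIs: `OwnerQueue`, `Connection`, `proxyRelease`, and `destroyedOnOwnerThread` are hypothetical names, `std::shared_ptr` stands in for `nsRefPtr`, and a hand-drained queue stands in for the main-thread event loop. The sketch assumes C++14 and a build with thread support (for example `-pthread`).

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for the owning ("main") thread's event queue.
class OwnerQueue {
public:
  void post(std::function<void()> task) {
    std::lock_guard<std::mutex> lock(mutex_);
    tasks_.push(std::move(task));
  }

  // Runs every pending task on the calling thread.
  void drain() {
    for (;;) {
      std::function<void()> task;
      {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return;
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

private:
  std::mutex mutex_;
  std::queue<std::function<void()>> tasks_;
};

// Hypothetical stand-in for an object that must be destroyed on its owner thread.
class Connection {
public:
  Connection(std::thread::id owner, std::atomic<bool>& destroyedOnOwner)
      : owner_(owner), destroyedOnOwner_(destroyedOnOwner) {}

  // Records whether destruction happened on the owner thread.
  ~Connection() {
    destroyedOnOwner_ = (std::this_thread::get_id() == owner_);
  }

private:
  std::thread::id owner_;
  std::atomic<bool>& destroyedOnOwner_;
};

// Hands the caller's reference to the owner thread instead of dropping it here.
void proxyRelease(OwnerQueue& queue, std::shared_ptr<Connection> conn) {
  assert(conn && "expected a live reference");
  queue.post([held = std::move(conn)]() mutable { held.reset(); });
}

// Returns true if the Connection destructor ran on the calling (owner) thread.
bool destroyedOnOwnerThread(bool useProxy) {
  std::atomic<bool> destroyedOnOwner{false};
  OwnerQueue queue;
  auto conn = std::make_shared<Connection>(std::this_thread::get_id(),
                                           destroyedOnOwner);

  // The worker holds the only remaining reference, mirroring the case where
  // all other holders went away while the runnable was in flight.
  std::thread worker([&queue, useProxy, held = std::move(conn)]() mutable {
    if (useProxy) {
      proxyRelease(queue, std::move(held));
    } else {
      held.reset();  // destructor runs on the worker thread
    }
  });
  worker.join();
  queue.drain();  // with the proxy, the destructor runs here
  return destroyedOnOwner;
}
```

In this sketch `destroyedOnOwnerThread(true)` reports destruction on the owner thread, while `destroyedOnOwnerThread(false)` reports destruction on the worker, which is the situation the crash stack above shows.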
Attachment #8395325 - Flags: review?(khuey) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/82b3e6f5e0ae
With switch to .take()
Target Milestone: --- → mozilla31
https://hg.mozilla.org/mozilla-central/rev/82b3e6f5e0ae
Assignee: nobody → rjesup
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Comment on attachment 8395325 [details] [diff] [review]
don't release DataChannelConnection on transient thread on readblob failure

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 892630

User impact if declined: Crashes (MOZ_CRASH) in quite rare instances; rare tbpl oranges

Testing completed (on m-c, etc.): On m-c, green Try, manual testing

Risk to taking this patch (and alternatives if risky): Very low risk; moves a release to MainThread, and avoids a possible minor problem if you try to send an unreadable blob (which normally will still cleanly fail anyways).

String or IDL/UUID changes made by this patch: None
Attachment #8395325 - Flags: approval-mozilla-aurora?
Attachment #8395325 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8395325 [details] [diff] [review]
don't release DataChannelConnection on transient thread on readblob failure

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 892630

User impact if declined: Rare TBPL oranges, possible but very unlikely safe crash in the field (basically when trying to send a read-protected file).

Testing completed (on m-c, etc.): on m-c and aurora

Risk to taking this patch (and alternatives if risky): very low.

String or IDL/UUID changes made by this patch: none
Attachment #8395325 - Flags: approval-mozilla-beta?
Attachment #8395325 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Marking for 1.3? to resolve whether this should be taken for 1.3. In comment 6 I suggest it not be, since it's an extremely hard-to-hit, totally safe crash (MOZ_CRASH). The only repeatable way to provoke it is to try to send a read-only file, which I suspect would rarely, if ever, be hit on FxOS.

If it's not, we should mark 1.3/1.3T as wontfix.
.take() -> .get() and add the needed include for Beta
Attachment #8398632 - Attachment is obsolete: true
Comment on attachment 8398633 [details] [diff] [review]
Don't release DataChannelConnection on transient thread on readblob failure. (beta)

Kyle: this needed tweaking since your already_AddRefed changes aren't on Beta.

For completeness, asking for beta approval again.  Local build is happy.
Attachment #8398633 - Flags: review?(khuey)
Attachment #8398633 - Flags: approval-mozilla-beta?
FYI, I am waiting for the review before accepting the uplift.
Attachment #8398633 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1030372