Closed Bug 1052169 Opened 8 years ago Closed 8 years ago

webrtc::ViEEncoder crashes while WebRTC session is closing


(Core :: WebRTC, defect)

Gonk (Firefox OS)
Not set



blocking-b2g 2.0+
Tracking Status
firefox32 --- wontfix
firefox33 --- wontfix
firefox34 --- fixed
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed


(Reporter: diego, Assigned: jesup)




(Keywords: crash, Whiteboard: [caf priority: p2][CR 705426][b2g-crash])


(8 files, 2 obsolete files)

I see two crashes (attached) repeated several times while closing a device-to-device WebRTC session.


1. Set up WebRTC session on phone 1 and phone 2.
2. Close session on phone 1.
3. Attempt to close session on phone 2.
4. App crashes!
[Blocking Requested - why for this release]: This is a major stability issue on one of the main features for V2.0
blocking-b2g: --- → 2.0?
Flags: needinfo?(rjesup)
Severity: normal → critical
Keywords: crash
Whiteboard: [b2g-crash]
I presume these were 2.0 builds; were they running on a KK base?  Were they using H.264?  Opt or debug?

"Set up WebRTC session on phone 1 and phone 2" - session using?  apptrtc?  Loop app?

"2. Close session on phone 1.
3. Attempt to close session on phone 2.
4. App crashes!"

Implies to me that it's the loop app.  Is there a timing component?

Flags: needinfo?(rjesup)
Also, if it's H.264, can you try with VP8?
Flags: needinfo?(dwilson)
blocking-b2g: 2.0? → 2.0+
One more question: is this a (recent?) regression?  Since I haven't seen this before or heard it reported; I'm guessing it's a regression (unless it's a timing bug we hadn't hit before)
Assignee: nobody → rjesup
Attached file Logcat during crashes (obsolete) —
Flags: needinfo?(dwilson)
These crashes happened on a KK 8x10 build with H.264 enabled (no debug).

I'm using my own homegrown test app based on SimpleWebRTC library [1]. It's not very different from the sample app in the project itself.

I want to say it's a recent regression, but I would have to do some extra testing to confirm.
This is only reproducible with H.264. It works fine when I configure for VP8.
Also, this is not reproducible on with the H.264 config in comment 10
Whiteboard: [b2g-crash] → [CR 705426][b2g-crash]
Whiteboard: [CR 705426][b2g-crash] → [caf priority: p2][CR 705426][b2g-crash]
As per discussion with Randell, this may be related to bug 1047442
QA Whiteboard: [2.0-signoff-need+]
This will need a new test case to be covered.
QA Whiteboard: [2.0-signoff-need+] → [2.0-signoff-need?][QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(srapanan)
Test case added in moztrap:
QA Whiteboard: [2.0-signoff-need?][QAnalyst-Triage?] → [2.0-signoff-need?][QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(srapanan)
Flags: in-moztrap+
I tested attachment 8466234 [details] [diff] [review] in bug 1047442. It didn't help :(
Added "NSPR_LOG_MODULES=signaling:5,webrtc_trace:65535 WEBRTC_TRACE_FILE=nspr" for extra logging as per Randell's request
Attachment #8471659 - Attachment is obsolete: true
Flags: needinfo?(rjesup)
Attachment #8473883 - Attachment description: logcat.txtLogcat during crashes v2 → Logcat during crashes v2
Diego: give this a try.  It appears it would resolve the problem I saw in the last log you uploaded; I'm not sure it would cover every crash you've seen.  Also, I modified the logging so that you can turn on signaling:5 and get all the OMXH264 logs without zillions of audio logs
QA Whiteboard: [2.0-signoff-need?][QAnalyst-Triage+] → [2.0-signoff-need+][QAnalyst-Triage+]
I see the same two crashes with the WIP patch in attachment 8474173 [details] [diff] [review]. I attached the log.
Comment on attachment 8474924 [details] [diff] [review]
Proxy thread destruction to avoid recursing event loops within OMX H264 shutdown

Review of attachment 8474924 [details] [diff] [review]:

20 calls with no crash with this patch!
Attachment #8474924 - Flags: feedback+
Comment on attachment 8474924 [details] [diff] [review]
Proxy thread destruction to avoid recursing event loops within OMX H264 shutdown

Note: Clone of the patch we did for GMP plugins to proxy the thread destruction.  The Conduit code didn't expect to recurse the event loop when shutting down a ViE (communications) channel (causing other conduits it interacts with to shut down while we're in mid-shutdown).
Attachment #8474924 - Flags: review?(benjamin)
Flags: needinfo?(rjesup)
Cleaned up - boot and suspenders patch to deal with async returns of decode buffers
Attachment #8474173 - Attachment is obsolete: true
Comment on attachment 8475341 [details] [diff] [review]
Ignore decode-complete callbacks during OMX decoder shutdown WIP

I'll take a review from whomever feels good doing so.  This is a pretty simple "safety" patch to deal with delayed frame decodes after we've started to shut down.
Attachment #8475341 - Flags: review?(jolin)
Attachment #8475341 - Flags: review?(cpearce)
QA Whiteboard: [2.0-signoff-need+][QAnalyst-Triage+] → [2.0-signoff-need+][QAnalyst-Triage+][lead-review+]
Attachment #8475341 - Flags: review?(jolin) → review+
Whiteboard: [caf priority: p2][CR 705426][b2g-crash] → [caf priority: p2][CR 705426][b2g-crash][webrtc-uplift]
Target Milestone: --- → mozilla34
Closed: 8 years ago
Resolution: --- → FIXED
Forgot to mark this leave-open.  We actually need the other patch that's up for review to bsmedberg to fix the crash found by Diego.  Once that lands, we can close this bug.
Resolution: FIXED → ---
Attachment #8474924 - Flags: review?(benjamin) → review+
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
Whiteboard: [caf priority: p2][CR 705426][b2g-crash][webrtc-uplift] → [caf priority: p2][CR 705426][b2g-crash]
You need to log in before you can comment on or make changes to this bug.