webrtc::ViEEncoder crashes while WebRTC session is closing

RESOLVED FIXED in Firefox 34, Firefox OS v2.0

Status

()

Core
WebRTC
--
critical
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: diego, Assigned: jesup)

Tracking

({crash})

unspecified
mozilla34
ARM
Gonk (Firefox OS)
crash
Points:
---
Bug Flags:
in-moztrap +

Firefox Tracking Flags

(blocking-b2g:2.0+, firefox32 wontfix, firefox33 wontfix, firefox34 fixed, b2g-v2.0 fixed, b2g-v2.1 fixed)

Details

(Whiteboard: [caf priority: p2][CR 705426][b2g-crash], URL)

Attachments

(8 attachments, 2 obsolete attachments)

(Reporter)

Description

3 years ago
I see two crashes (attached) repeated several times while closing a device-to-device WebRTC session.

STR:

1. Set up WebRTC session on phone 1 and phone 2.
2. Close session on phone 1.
3. Attempt to close session on phone 2.
4. App crashes!
(Reporter)

Comment 1

3 years ago
Created attachment 8471165 [details]
webrtc::ViEEncoder::SendStatistics() crash minidump
(Reporter)

Comment 2

3 years ago
Created attachment 8471167 [details]
webrtc::ViEEncoder::SendStatistics() crash extra
(Reporter)

Comment 3

3 years ago
Created attachment 8471170 [details]
webrtc::ViEEncoder::~ViEEncoder() crash minidump
(Reporter)

Comment 4

3 years ago
Created attachment 8471171 [details]
webrtc::ViEEncoder::~ViEEncoder() crash extra
(Reporter)

Comment 5

3 years ago
[Blocking Requested - why for this release]: This is a major stability issue on one of the main features for V2.0
blocking-b2g: --- → 2.0?
Flags: needinfo?(rjesup)
(Reporter)

Updated

3 years ago
Blocks: 1041241

Updated

3 years ago
Severity: normal → critical
Keywords: crash
Whiteboard: [b2g-crash]
(Assignee)

Comment 6

3 years ago
I presume these were 2.0 builds; were they running on a KK base?  Were they using H.264?  Opt or debug?

"Set up WebRTC session on phone 1 and phone 2" - session using?  talky.io?  apptrtc?  Loop app?

"2. Close session on phone 1.
3. Attempt to close session on phone 2.
4. App crashes!"

Implies to me that it's the loop app.  Is there a timing component?

Thanks
Flags: needinfo?(rjesup)
(Assignee)

Comment 7

3 years ago
Also, if it's H.264, can you try with VP8?
Flags: needinfo?(dwilson)

Updated

3 years ago
blocking-b2g: 2.0? → 2.0+
(Assignee)

Comment 8

3 years ago
One more question: is this a (recent?) regression?  Since I haven't seen this before or heard it reported; I'm guessing it's a regression (unless it's a timing bug we hadn't hit before)
Assignee: nobody → rjesup
(Reporter)

Comment 9

3 years ago
Created attachment 8471659 [details]
Logcat during crashes
Flags: needinfo?(dwilson)
(Reporter)

Comment 10

3 years ago
These crashes happened on a KK 8x10 build with H.264 enabled (no debug).

I'm using my own homegrown test app based on SimpleWebRTC library [1]. It's not very different from the sample app in the project itself.

I want to say it's a recent regression, but I would have to do some extra testing to confirm.

https://github.com/HenrikJoreteg/SimpleWebRTC
(Reporter)

Comment 11

3 years ago
This is only reproducible with H.264. It works fine when I configure for VP8.
(Reporter)

Comment 12

3 years ago
Also, this is not reproducible on talky.io with the H.264 config in comment 10

Updated

3 years ago
Whiteboard: [b2g-crash] → [CR 705426][b2g-crash]

Updated

3 years ago
Whiteboard: [CR 705426][b2g-crash] → [caf priority: p2][CR 705426][b2g-crash]
(Reporter)

Comment 13

3 years ago
As per discussion with Randell, this may be related to bug 1047442

Updated

3 years ago
QA Whiteboard: [2.0-signoff-need+]

Comment 14

3 years ago
This will need a new test case to be covered.
QA Whiteboard: [2.0-signoff-need+] → [2.0-signoff-need?][QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(srapanan)
Test case added in moztrap:

https://moztrap.mozilla.org/manage/case/14330/
QA Whiteboard: [2.0-signoff-need?][QAnalyst-Triage?] → [2.0-signoff-need?][QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(srapanan)
Flags: in-moztrap+
(Reporter)

Comment 16

3 years ago
I tested attachment 8466234 [details] [diff] [review] in bug 1047442. It didn't help :(
(Reporter)

Comment 17

3 years ago
Created attachment 8473883 [details]
Logcat during crashes v2

Added "NSPR_LOG_MODULES=signaling:5,webrtc_trace:65535 WEBRTC_TRACE_FILE=nspr" for extra logging as per Randell's request
Attachment #8471659 - Attachment is obsolete: true
Flags: needinfo?(rjesup)
(Reporter)

Updated

3 years ago
Attachment #8473883 - Attachment description: logcat.txtLogcat during crashes v2 → Logcat during crashes v2
(Assignee)

Comment 18

3 years ago
Created attachment 8474173 [details] [diff] [review]
Ignore decode-complete callbacks during OMX decoder shutdown WIP

Diego: give this a try.  It appears it would resolve the problem I saw in the last log you uploaded; I'm not sure it would cover every crash you've seen.  Also, I modified the logging so that you can turn on signaling:5 and get all the OMXH264 logs without zillions of audio logs

Updated

3 years ago
QA Whiteboard: [2.0-signoff-need?][QAnalyst-Triage+] → [2.0-signoff-need+][QAnalyst-Triage+]
(Reporter)

Comment 19

3 years ago
Created attachment 8474626 [details]
Logcat with patch in attachment 8474173 [details] [diff] [review]
(Reporter)

Comment 20

3 years ago
I see the same two crashes with the WIP patch in attachment 8474173 [details] [diff] [review]. I attached the log.
(Assignee)

Comment 21

3 years ago
Created attachment 8474924 [details] [diff] [review]
Proxy thread destruction to avoid recursing event loops within OMX H264 shutdown

Diego: try this.....
(Reporter)

Comment 22

3 years ago
Comment on attachment 8474924 [details] [diff] [review]
Proxy thread destruction to avoid recursing event loops within OMX H264 shutdown

Review of attachment 8474924 [details] [diff] [review]:
-----------------------------------------------------------------

20 calls with no crash with this patch!
Attachment #8474924 - Flags: feedback+
(Assignee)

Comment 23

3 years ago
Comment on attachment 8474924 [details] [diff] [review]
Proxy thread destruction to avoid recursing event loops within OMX H264 shutdown

Note: Clone of the patch we did for GMP plugins to proxy the thread destruction.  The Conduit code didn't expect to recurse the event loop when shutting down a ViE (communications) channel (causing other conduits it interacts with to shut down while we're in mid-shutdown).
Attachment #8474924 - Flags: review?(benjamin)
Flags: needinfo?(rjesup)
(Assignee)

Comment 24

3 years ago
Created attachment 8475341 [details] [diff] [review]
Ignore decode-complete callbacks during OMX decoder shutdown WIP

Cleaned up - boot and suspenders patch to deal with async returns of decode buffers
(Assignee)

Updated

3 years ago
Attachment #8474173 - Attachment is obsolete: true
(Assignee)

Comment 25

3 years ago
Comment on attachment 8475341 [details] [diff] [review]
Ignore decode-complete callbacks during OMX decoder shutdown WIP

I'll take a review from whomever feels good doing so.  This is a pretty simple "safety" patch to deal with delayed frame decodes after we've started to shut down.
Attachment #8475341 - Flags: review?(jolin)
Attachment #8475341 - Flags: review?(cpearce)

Updated

3 years ago
QA Whiteboard: [2.0-signoff-need+][QAnalyst-Triage+] → [2.0-signoff-need+][QAnalyst-Triage+][lead-review+]
Attachment #8475341 - Flags: review?(jolin) → review+
Attachment #8475341 - Flags: review?(cpearce)
(Assignee)

Comment 26

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/4bb05704044d
Whiteboard: [caf priority: p2][CR 705426][b2g-crash] → [caf priority: p2][CR 705426][b2g-crash][webrtc-uplift]
Target Milestone: --- → mozilla34
https://hg.mozilla.org/mozilla-central/rev/4bb05704044d
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Forgot to mark this leave-open.  We actually need the other patch that's up for review to bsmedberg to fix the crash found by Diego.  Once that lands, we can close this bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Updated

3 years ago
Attachment #8474924 - Flags: review?(benjamin) → review+
(Assignee)

Comment 29

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/89e3e198baad
status-b2g-v2.0: --- → affected
status-firefox34: --- → affected
https://hg.mozilla.org/mozilla-central/rev/89e3e198baad
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Resolution: --- → FIXED
https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/82b04a3ec0f4
https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/7faa4942d230
status-b2g-v2.0: affected → fixed
status-b2g-v2.1: --- → fixed
status-firefox32: --- → wontfix
status-firefox33: --- → wontfix
status-firefox34: affected → fixed
(Assignee)

Updated

3 years ago
Whiteboard: [caf priority: p2][CR 705426][b2g-crash][webrtc-uplift] → [caf priority: p2][CR 705426][b2g-crash]
You need to log in before you can comment on or make changes to this bug.