1827021 - Crash in [@ mozilla::JsepTrackNegotiatedDetails::GetEncodingCount] on poison value

Andrew McCreight [:mccr8]

Reporter

Description

•

2 years ago

Crash report: https://crash-stats.mozilla.org/report/index/42828703-f8b6-4f3f-b24b-5b8bb0230403

Reason: SIGSEGV / SEGV_MAPERR

Top 10 frames of crashing thread:

0  libxul.so  std::__ndk1::vector<mozilla::UniquePtr<mozilla::JsepTrackEncoding, mozilla::DefaultDelete<mozilla::JsepTrackEncoding> >, std::__ndk1::allocator<mozilla::UniquePtr<mozilla::JsepTrackEncoding, mozilla::DefaultDelete<mozilla::JsepTrackEncoding> > > >::size const  /builds/worker/fetches/android-ndk/sources/cxx-stl/llvm-libc++/include/vector:656
0  libxul.so  mozilla::JsepTrackNegotiatedDetails::GetEncodingCount const  dom/media/webrtc/jsep/JsepTrack.h:42
0  libxul.so  mozilla::dom::RTCRtpTransceiver::GetNegotiatedRecvCodecs const  dom/media/webrtc/jsapi/RTCRtpTransceiver.cpp:708
0  libxul.so  mozilla::PeerConnectionImpl::GetCodecStats  dom/media/webrtc/jsapi/PeerConnectionImpl.cpp:3157
1  libxul.so  mozilla::PeerConnectionImpl::GetStats  dom/media/webrtc/jsapi/PeerConnectionImpl.cpp:3268
2  libxul.so  mozilla::PeerConnectionCtx::EverySecondTelemetryCallback_m  dom/media/webrtc/jsapi/PeerConnectionCtx.cpp:396
3  libxul.so  mozilla::detail::VariantImplementation<unsigned char,   mfbt/Variant.h
3  libxul.so  mozilla::detail::VariantImplementation<unsigned char,   mfbt/Variant.h:318
3  libxul.so  mozilla::detail::VariantImplementation<unsigned char,   mfbt/Variant.h:318
3  libxul.so  mozilla::detail::VariantImplementation<unsigned char,   mfbt/Variant.h:318

Looking back at the last 6 months, I only see these crashes on 110 and 111, even when I scan the protosignature for JsepTrackNegotiatedDetails, so I don't know if it might be a regression. There are 26 crashes with this signature on poison values in the last 6 months.

Jim Mathies [:jimm]

Updated

•

2 years ago

Blocks: webrtc-triage

Jim Mathies [:jimm]

Updated

•

1 year ago

Flags: needinfo?(docfaraday)

Jim Mathies [:jimm]

Comment 1

•

1 year ago

Byron will triage this once he's back.

Byron Campen [:bwc]

Assignee

Comment 2

•

1 year ago

It looks like this was happening every 2-3 days on average until 112 was released last week. The bug might already be gone, but I'll look at this again later to see.

Andrew McCreight [:mccr8]

Reporter

Comment 3

•

1 year ago

The rollout of 112 was halted very quickly due to a cookie regression, and only unfrozen with the release of 112.0.1 today, so I think it is too early to say anything about what is happening on 112 in release, unfortunately.

Byron Campen [:bwc]

Assignee

Comment 4

•

1 year ago

So, looking at this, it is not at all clear to me how we're ending up there. I'm pretty sure that the only way to observe this is for there to be a RefPtr<JsepTransceiver> holding onto a freed pointer. While JsepTransceiver has a threadsafe refcount (that it does not need), I guess there's some chance that a RefPtr<JsepTransceiver> itself is being used on more than one thread, which could lead to a bad pointer inside. Giving JsepTransceiver a non-threadsafe refcount is probably warranted, especially if it will help us detect a rare misuse of the RefPtr.

https://treeherder.mozilla.org/jobs?repo=try&revision=b91022146b9ad833d5aedd3435242908f5614ce4
https://treeherder.mozilla.org/jobs?repo=try&revision=dfbe8876eb7e00313e581b4f94494dd960214a6d

Byron Campen [:bwc]

Assignee

Comment 5

•

1 year ago

We might actually be able to ditch the refcount on JsepTransceiver entirely, now that I think about it. I'll look closer at that tomorrow.

Flags: needinfo?(docfaraday)

Byron Campen [:bwc]

Assignee

Updated

•

1 year ago

Depends on: 1829667

Andrew McCreight [:mccr8]

Reporter

Comment 6

•

1 year ago

I found another little cluster of Jsep Android crashes. There are 16 like this in the last month.
I searched for: Fenix, "proto signature" contains SyncToJsep, "address" contains e5e5.

bp-8e25cd70-a942-4385-9a7f-71f830230421
bp-3da18505-c4db-4478-914d-b32900230423
bp-51381306-0101-468d-912b-a88c60230424

The signature is useless ([@ arena_dalloc | Allocator<T>::free | free | delete ]) and then there's a zillion frames of library code, but then there's
JsepTrack::UpdateStreamIds()
RTCRtpSender::SyncToJsep()
RTCRtpTransceiver::SyncToJsep()
mozilla::PeerConnectionImpl::SyncToJsep()

The RTCRtpSender::SyncToJsep() line it is crashing on is aJsepTransceiver.mSendTrack.UpdateStreamIds(streamIds);.

Jim Mathies [:jimm]

Updated

•

1 year ago

No longer depends on: 1829667

Jim Mathies [:jimm]

Updated

•

1 year ago

Flags: needinfo?(docfaraday)

Jim Mathies [:jimm]

Updated

•

1 year ago

Assignee: nobody → docfaraday

Byron Campen [:bwc]

Assignee

Comment 7

•

1 year ago

As far as I can tell, this has never happened on 113.

Status: NEW → RESOLVED

Closed: 1 year ago

Flags: needinfo?(docfaraday)

Resolution: --- → WORKSFORME

Jim Mathies [:jimm]

Updated

•

1 year ago

No longer blocks: webrtc-triage

Byron Campen [:bwc]

Assignee

Comment 8

•

1 year ago

Finally saw this on 113:

https://crash-stats.mozilla.org/report/index/28259ba1-46c0-4d4d-84ab-5ce350230510

Status: RESOLVED → REOPENED

Resolution: WORKSFORME → ---

Jim Mathies [:jimm]

Comment 9

•

1 year ago

•

Edited

There may be a fix or the signature morphs in 114.

Andrew McCreight [:mccr8]

Reporter

Comment 10

•

1 year ago

I searched for Fenix crashes with major version > 113, and I didn't see any crashes with either SyncToJsep or JsepTrackNegotiatedDetails in the proto signature. (On 113, doing that I see 2 crashes on poison values.)

Andrew McCreight [:mccr8]

Reporter

Updated

•

1 year ago

status-firefox113: --- → wontfix

Jim Mathies [:jimm]

Comment 11

•

1 year ago

Low volume, so we'll wait for 114 to roll out before we close this.

Jim Mathies [:jimm]

Updated

•

1 year ago

status-firefox114: --- → fixed

status-firefox115: --- → fixed

Daniel Veditz [:dveditz]

Updated

•

1 year ago

Whiteboard: [adv-main114+r]

Daniel Veditz [:dveditz]

Comment 12

•

1 year ago

Because of the low volume (and perhaps the devices involved) we have never seen this crash on a nightly or beta version. The fact we haven't seen any 114 crashes means nothing yet.

Since we didn't check in any fixes and don't even have suspicions of a patch that might have fixed it, calling this "fixed" anywhere is the wrong thing to do for a security bug (it triggers advisories, people will ask for the patch.... a mess). We don't have "worksforme" as a release-specific status so the next best choice would be "unaffected". If, that is, we had enough evidence that the crash was gone, which we don't yet.

status-firefox114: fixed → ?

status-firefox115: fixed → ?

status-firefox-esr102: --- → unaffected

Keywords: regression

Whiteboard: [adv-main114+r]

Version: unspecified → Firefox 110

Jim Mathies [:jimm]

Comment 13

•

1 year ago

(In reply to Daniel Veditz [:dveditz] from comment #12)

Because of the low volume (and perhaps the devices involved) we have never seen this crash on a nightly or beta version. The fact we haven't seen any 114 crashes means nothing yet.

Since we didn't check in any fixes and don't even have suspicions of a patch that might have fixed it, calling this "fixed" anywhere is the wrong thing to do for a security bug (it triggers advisories, people will ask for the patch.... a mess). We don't have "worksforme" as a release-specific status so the next best choice would be "unaffected". If, that is, we had enough evidence that the crash was gone, which we don't yet.

We did do work here in a related bug (in the see also list). That's why we think this might be fixed.

Andrew McCreight [:mccr8]

Reporter

Comment 14

•

1 year ago

The "see also" is bug 1827022, which was marked WORKSFORME after nothing landed. Should this bug have a "depends on" or "see also" for bug 1829667? You removed the depends on 2023-04-25 and it wasn't clear to me why. Thanks.

Flags: needinfo?(jmathies)

Jim Mathies [:jimm]

Updated

•

1 year ago

Depends on: 1829667

Flags: needinfo?(jmathies)

Jim Mathies [:jimm]

Updated

•

1 year ago

Status: REOPENED → RESOLVED

Closed: 1 year ago → 1 year ago

Resolution: --- → FIXED

Ryan VanderMeulen [:RyanVM]

Updated

•

1 year ago

Group: media-core-security → core-security-release

status-firefox114: ? → fixed

status-firefox115: ? → ---

Target Milestone: --- → 114 Branch

Daniel Veditz [:dveditz]

Updated

•

11 months ago

Group: core-security-release