Closed Bug 1363563 Opened 3 years ago Closed 3 years ago

Firefox crashes when using WebRTC and renegotiating with a new offer that has less RTP header extensions than the previous one.

Categories

(Core :: WebRTC: Networking, defect, P1)

53 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- wontfix
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: saeed, Assigned: drno)

References

Details

(Keywords: crash, Whiteboard: regression)

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36

Steps to reproduce:

I was not able to come up with a clean isolated repro of this that I can share yet, but I will try to later. We noticed Firefox would crash when another participant joined the WebRTC call and did not support the 'toffset' RTP header extension.

In such a case we renegotiate the existing peer connection by setting a new offer SDP as the remote description.  The original remote offer SDP would have both 'abs-send-time' (with id=1) and 'toffset' (with id=2).  The new remote offer we use after a participant joins a call would only contain 'abs-send-time' (with id=1) and cause Firefox to fail at the below assertion and crash.

We tracked this down to a possible issue with WebrtcVideoConduit::AddLocalRTPExtensions(), but not entirely sure.

https://dxr.mozilla.org/mozilla-central/source/media/webrtc/signaling/src/media-conduit/VideoConduit.cpp#260

It seems that extList will always continue to grow, and the values become corrupt since once remove_if() is called, the new extensions are added to the old pre-remove_if end of the list.


Actual results:

The following assertion failure causes Firefox to crash:

#
# Fatal error in /home/worker/workspace/build/src/media/webrtc/trunk/webrtc/video/video_receive_stream.cc, line 213
# Check failed: 0 == vie_channel_->SetReceiveAbsoluteSendTimeStatus(true, id) (0 vs. -1)
# 
#
Component: Untriaged → WebRTC: Networking
Keywords: crash
Product: Firefox → Core
Status: UNCONFIRMED → NEW
Rank: 19
Ever confirmed: true
Priority: -- → P1
FYI we believe the issue is the call to std::remove_if not being passed to std::erase, which leaves the last few elements in extList in an undefined state.  If this happens to leave multiple entries with the same ID (from the following insert call), then the lower levels abort later when the mapping of ID to extension clobbers itself.
Thanks for tracking this so far down already.

Indeed the missing erase() looks like trouble. I'm working on test to verify the theory.
Assignee: nobody → drno
Looks like the test was not crashing on any OS.

Can you open about:crashes on a Firefox instance which has crashed like described and put the crash link in here?
Flags: needinfo?(saeed)
Thanks for taking a look Nils.

https://crash-stats.mozilla.com/report/index/4ff3cf1f-2914-4b29-be15-bfeef0170511
Flags: needinfo?(saeed)
This is somewhat strange, because while the crash log I just sent you seems to talk about failing on an assert on line 279 of video_receive_stream.cc, message printed to my stdout/err seems to talk about a different assert from a few lines above:

#
# Fatal error in /home/worker/workspace/build/src/media/webrtc/trunk/webrtc/video/video_receive_stream.cc, line 213
# Check failed: 0 == vie_channel_->SetReceiveAbsoluteSendTimeStatus(true, id) (0 vs. -1)
# 
#
To be perfectly honest, I'm not sure why we have an _Add_LocalRTPExtensions function. Why is it not _Set_LocalRTPExtensions? If an RTP extension is negotiated off, we ought to remove it, right?
Comment on attachment 8866647 [details]
Bug 1363563: remove and erase existing header extensions.

https://reviewboard.mozilla.org/r/138252/#review141630

Looks good to me.
Attachment #8866647 - Flags: review?(mfroman) → review+
(In reply to Byron Campen [:bwc] from comment #8)
> To be perfectly honest, I'm not sure why we have an _Add_LocalRTPExtensions
> function. Why is it not _Set_LocalRTPExtensions? If an RTP extension is
> negotiated off, we ought to remove it, right?

Indeed. I also don't understand why the code tries to remove extension, just to add it back in immediately - as if removing it from the vector would result in some other actions, but it does not as far as I can tell.

My only concern with a _Set_ instead would be, that if we start re-using a config, we might actually leave extensions turned on. But the current code can't handle that either...
(In reply to Saeed Jahed from comment #7)
> This is somewhat strange, because while the crash log I just sent you seems
> to talk about failing on an assert on line 279 of video_receive_stream.cc,
> message printed to my stdout/err seems to talk about a different assert from
> a few lines above:
> 
> #
> # Fatal error in
> /home/worker/workspace/build/src/media/webrtc/trunk/webrtc/video/
> video_receive_stream.cc, line 213
> # Check failed: 0 == vie_channel_->SetReceiveAbsoluteSendTimeStatus(true,
> id) (0 vs. -1)
> # 
> #

That is indeed strange.

As I had no luck in trying to repro hitting that assert with a mochitest, can you download a build from here https://treeherder.mozilla.org/#/jobs?repo=try&revision=98f8bef987b3 - these builds include the erase() call. So if you are right, then this should fix the problem. If the problem is somewhere else we will hopefully know then as well.

You have to click on the green B for your OS, and then search in the lower left corner for the download link. If you let me know the target OS I can also extract the direct link and post it here.
Flags: needinfo?(saeed)
> As I had no luck in trying to repro hitting that assert with a mochitest,
> can you download a build from here
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=98f8bef987b3 - these
> builds include the erase() call. So if you are right, then this should fix
> the problem. If the problem is somewhere else we will hopefully know then as
> well.
> 
> You have to click on the green B for your OS, and then search in the lower
> left corner for the download link. If you let me know the target OS I can
> also extract the direct link and post it here.

I don't see a Linux build there, and it seems like I can't repro the issue on my Mac (Firefox 54.0b5 (64-bit)) to begin with.
Flags: needinfo?(saeed)
(In reply to Saeed Jahed from comment #12)
> I don't see a Linux build there, and it seems like I can't repro the issue
> on my Mac (Firefox 54.0b5 (64-bit)) to begin with.

That might explain why I was not able to repro then on my Mac :-)

Our treeherder UI is not really intuitive.. This should be the link to the Linux 64bit optimized build: https://queue.taskcluster.net/v1/task/ToO0fp0iRJeIsQ242-eNKw/runs/0/artifacts/public/build/target.tar.bz2

Would be great if you could report back if it fixes the issue for you.
Flags: needinfo?(saeed)
Ah! Sorry, I should have mentioned that my crash was on my Linux :)

That link didn't quite work for me (libXt.so.6: cannot open shared object file), but I tried target.tar.bz2 from the green B under Linux x64 opt and that worked:
https://queue.taskcluster.net/v1/task/WJ9rTOTiSE6pirhAC4HBkA/runs/0/artifacts/public/build/target.tar.bz2

It no longer crashes!  So that seems to have fixed it.
Flags: needinfo?(saeed)
(In reply to Saeed Jahed from comment #14)
> That link didn't quite work for me (libXt.so.6: cannot open shared object
> file), but I tried target.tar.bz2 from the green B under Linux x64 opt and
> that worked:
> https://queue.taskcluster.net/v1/task/WJ9rTOTiSE6pirhAC4HBkA/runs/0/
> artifacts/public/build/target.tar.bz2

Sorry I gave you accidentally the Linux 32bit link instead of the 64bit one.
 
> It no longer crashes!  So that seems to have fixed it.

Awesome. Thanks for helping with the verification!
Pushed by drno@ohlmeier.org:
https://hg.mozilla.org/integration/autoland/rev/ba7f87ca199a
remove and erase existing header extensions. r=mjf
https://hg.mozilla.org/mozilla-central/rev/ba7f87ca199a
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment on attachment 8866647 [details]
Bug 1363563: remove and erase existing header extensions.

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1250356
[User impact if declined]: Can cause crashes in WebRTC calls.
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: It has been verified by the reporter in a try build.
[Needs manual test from QE? If yes, steps to reproduce]: No, as we were unable to repro it locally.
[List of other uplifts needed for the feature/fix]: N/A
[Is the change risky?]: Not really
[Why is the change risky/not risky?]: Because it uses a std operation to cleanup a vector, which was left in a dirty state before.
[String changes made/needed]: N/A
Attachment #8866647 - Flags: approval-mozilla-beta?
Blocks: 1344557
Whiteboard: regression
Comment on attachment 8866647 [details]
Bug 1363563: remove and erase existing header extensions.

Fix a WebRtc crash and was verified. Beta54+. Should be in 54 beta 8.
Attachment #8866647 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1250356
Version: 54 Branch → 53 Branch
You need to log in before you can comment on or make changes to this bug.