crash in nr_ice_component_process_incoming_check while trying to receive an incoming call in Firefox Loop

Status

RESOLVED WORKSFORME

Product: Core
Component: WebRTC: Networking
Priority: P1
Severity: critical
Rank: 10
Opened: 3 years ago
Closed: 2 years ago

People

(Reporter: Martijn Wargers (dead), Assigned: drno)

Tracking

Keywords: crash, csectype-uaf, sec-high
Version: Trunk
Hardware: All
OS: Mac OS X
Points: ---

Firefox Tracking Flags

(firefox37 affected, firefox38 affected, firefox39 affected, firefox-esr38 affected)


(Reporter)

Description

3 years ago
I got this crash once while I was waiting to receive a call from someone via Firefox Loop. This was on a really slow, buggy network.

This bug was filed from the Socorro interface and is report bp-4f95173d-ae0a-49a3-9e94-645132150321.
=============================================================
0  XUL  nr_ice_component_process_incoming_check  media/mtransport/third_party/nICEr/src/ice/ice_component.c
1  XUL  nr_ice_component_stun_server_cb  media/mtransport/third_party/nICEr/src/ice/ice_component.c
2  XUL  nr_stun_server_process_request  media/mtransport/third_party/nICEr/src/stun/stun_server_ctx.c
3  XUL  nr_ice_socket_readable_cb  media/mtransport/third_party/nICEr/src/ice/ice_socket.c
4  XUL  _ZThn176_N7mozilla8NrSocket13OnSocketReadyEP10PRFileDescs  media/mtransport/nr_socket_prsock.cpp
5  XUL  nsSocketTransportService::DoPollIteration(bool)  netwerk/base/src/nsSocketTransportService2.cpp
6  XUL  nsSocketTransportService::Run()  netwerk/base/src/nsSocketTransportService2.cpp
7  XUL  _ZThn24_N24nsSocketTransportService3RunEv  obj-firefox/x86_64/netwerk/base/src/Unified_cpp_netwerk_base_src2.cpp
8  XUL  nsThread::ProcessNextEvent(bool, bool*)  xpcom/threads/nsThread.cpp
9  XUL  NS_ProcessNextEvent(nsIThread*, bool)  xpcom/glue/nsThreadUtils.cpp
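For orientation, here is a minimal sketch of the kind of hazard this stack would imply if the component had already been freed when the readable callback fired. All names here (sketch_socket, sketch_component, the callback) are hypothetical simplifications, not the real nICEr definitions:

#include <stdlib.h>

/* Hypothetical, simplified shapes -- not the real nICEr structs. */
typedef struct sketch_component_ {
    int state;
} sketch_component;

typedef struct sketch_socket_ {
    /* The socket keeps a raw back-pointer to its component. If the
       component is freed without tearing down this registration, the
       next readable event dereferences freed memory. */
    sketch_component *component;
} sketch_socket;

/* Analogous to frames 3-4 of the stack: runs on the socket transport
   thread when a STUN packet arrives. */
static void sketch_socket_readable_cb(sketch_socket *sock)
{
    sock->component->state++;   /* UAF if the component is gone */
}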
(Assignee)

Comment 1

3 years ago
Interesting. I'm wondering if the content of the STUN message throws us off.

Comment 2

3 years ago
So, I'm seeing this crash (or similar) happen more than once in crash-stats:

https://crash-stats.mozilla.com/search/?signature=~nr_ice_component_process_incoming_check&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports
Group: core-security

Comment 3

3 years ago
Looking at these other crashes, and the code, I'm pretty sure this is a UAF.

Comment 4

3 years ago
There's the same crash on 38 in crash-stats. I don't see anything on 39, though that may just be luck.
status-firefox37: --- → affected
status-firefox-esr38: --- → affected

Updated

3 years ago
status-firefox38: --- → affected
Not too many crashes. A couple are clearly UAF:
bp-40978097-b0f9-405a-b26b-611da2150323
bp-5e9c3355-1bf3-421f-b0f6-be9a92150321

A bunch are null or near-null; a handful have wild addresses which could also be a UAF with the freed space reallocated before this code got to it, or could be a different bug entirely (seems to be the same line of code though).
https://crash-stats.mozilla.com/search/?signature=~nr_ice_component_process_incoming_check&date=%3E2015-01-01&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports

50 crashes since 1/1

One 39 failure:
https://crash-stats.mozilla.com/report/index/44949191-383f-43f3-a433-907b92150313
First failure 1/27
https://crash-stats.mozilla.com/report/index/60510c6f-9a08-4bbd-93b2-029322150310
This one clearly has a UAF address (0xa5a5, etc.)
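For anyone triaging similar reports, this is why a junk-filling allocator makes stale reads recognizable. A minimal runnable sketch, assuming a 0xa5 fill byte as in the addresses above (mozjemalloc's actual fill values depend on build configuration):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Pretend the allocator junk-filled a freed block with 0xa5. */
    unsigned char junk[sizeof(void *)];
    memset(junk, 0xa5, sizeof junk);

    /* A pointer field loaded out of that block comes back as
       0xa5a5a5a5a5a5a5a5 -- the pattern seen in these minidumps. */
    void *stale;
    memcpy(&stale, junk, sizeof stale);
    printf("stale pointer value: %p\n", stale);
    return 0;
}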

Updated

3 years ago
Keywords: csectype-uaf
https://crash-stats.mozilla.com/report/index/842e4806-0fbd-4fd0-853d-2746d2150127
has a more complete stack trace than most

Updated

3 years ago
status-firefox39: --- → affected
(Assignee)

Updated

3 years ago
Assignee: nobody → drno
(Assignee)

Comment 9

3 years ago
hg log shows only bugs 1091242 and 1109841 touching nICEr code since 36. I'll take a look at the changes in there.
This may well be influenced by changes outside nICEr (lifetime changes).

Comment 11

3 years ago
I see crashes in 33, 34, perhaps even earlier by opening the date range wider
https://crash-stats.mozilla.com/report/index/050ab19e-4906-4aca-a5a6-878a42141111

https://crash-stats.mozilla.com/search/?signature=~nr_ice_component_process_incoming_check&date=%3E2014-08-01&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports

Comment 12

3 years ago
(In reply to Randell Jesup [:jesup] from comment #11)
> I see crashes in 33, 34, perhaps even earlier by opening the date range wider
> https://crash-stats.mozilla.com/report/index/050ab19e-4906-4aca-a5a6-
> 878a42141111
> 
> https://crash-stats.mozilla.com/search/
> ?signature=~nr_ice_component_process_incoming_check&date=%3E2014-08-
> 01&_facets=signature&_columns=date&_columns=signature&_columns=product&_colum
> ns=version&_columns=build_id&_columns=platform#crash-reports

The reports from before 37 all seem to have signs of stack corruption: the top of the stack (nr_ice_component_process_incoming_check) doesn't fit the rest of the stack at all, while the rest of the stack looks more or less consistent. Probably not the same bug.
Comment 13

3 years ago
Right; the first real failure was 12/27, in 37; the earliest failures are all spurious given the stacks.
Keywords: sec-high
(Assignee)

Comment 14

3 years ago
I did a manual backtrace from the places where the code calls free on a component or stream, but these only seem to get called in sane cases where either 1) the whole peer context or 2) the ICE context gets freed.
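To make the two sane paths concrete, a rough sketch of the lifetime relationship being described; names are illustrative, not the real nICEr API:

#include <stdlib.h>

typedef struct sketch_component_ { int dummy; } sketch_component;

/* In the sane cases this only runs while the owning peer context
   (case 1) or the ICE context (case 2) is itself being torn down. */
static void sketch_component_destroy(sketch_component **cp)
{
    free(*cp);
    *cp = NULL;   /* A copy of the old pointer held by an already
                     queued socket-readable event would not see this
                     NULL -- that is the window a UAF would need. */
}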

Martijn, did you create a room/conversation and wait for the other side to click on the URL you sent them? Or were you making a direct call (without a room) to someone?
Flags: needinfo?(martijn.martijn)
(Reporter)

Comment 15

3 years ago
Yes, IIRC, I did create a room/conversation and was waiting for the other side. The LAN we were on was really slow and buggy when this crash happened.
Flags: needinfo?(martijn.martijn)
Group: media-core-security
Comment 16

3 years ago
This is an aging sec-high-rated issue. Have we seen this issue again?
Flags: needinfo?(drno)

Comment 17

3 years ago
I see this happening once on 39 on crash stats this year:

https://crash-stats.mozilla.com/api/SuperSearch/?signature=~nr_ice_component_process_incoming_check&date=%3E01%2F01%2F2015&version=39.0a1&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform

I do not see it happening on either 40 or 41. It was also much more rare on 38 than it was on 37. So it seems that the frequency of this crash has been going down significantly, but we haven't yet managed to kill it.
(Assignee)

Comment 18

3 years ago
Byron and I did some code analysis back when the bug came in, but we were not able to find any sensible way this could happen. Sorry for not updating the bug back then.
Byron has hopefully answered the question. We currently don't really have an idea how to reproduce this issue, and I'm not sure how to proceed.
Flags: needinfo?(drno)

Updated

3 years ago
backlog: --- → webRTC+
Rank: 10
Priority: -- → P1
Comment 19

3 years ago
> I do not see it happening on either 40 or 41. It was also much more rare on
> 38 than it was on 37. So it seems that the frequency of this crash has been
> going down significantly, but we haven't yet managed to kill it.

Since June 1, we have 1 crash in 39.0b3.  That's the only 39 build with crashes ever, and the only crash in 39. I'll note we don't have many crashes in betas in any build, but 37 and 38 certainly had a bunch more than 1 in betas.  (We have no aurora/nightly crashes)

This is mostly good, but still not knowing what causes this is bad.

Can we re-analyze, and look to see if there are any asserts/traps/etc we could add?
Flags: needinfo?(drno)
Flags: needinfo?(docfaraday)
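One concrete option along those lines: a magic-number canary in the component struct that turns a silent UAF into a deterministic trap. This is only a sketch; nr_ice_component has no such field today, and all names here are hypothetical:

#include <assert.h>
#include <stdlib.h>

#define MAGIC_LIVE 0x49434543u  /* "ICEC" */
#define MAGIC_DEAD 0xdeadbeefu

typedef struct sketch_component_ {
    unsigned magic;
    /* ...real fields would follow... */
} sketch_component;

static sketch_component *sketch_component_create(void)
{
    sketch_component *c = calloc(1, sizeof *c);
    if (c)
        c->magic = MAGIC_LIVE;
    return c;
}

static void sketch_component_destroy(sketch_component *c)
{
    c->magic = MAGIC_DEAD;  /* poison before freeing */
    free(c);
}

/* Call on entry to callbacks like nr_ice_component_process_incoming_check.
   Best-effort only: reading freed memory is itself undefined behavior,
   but with a junk-filling allocator the assert fires reliably enough
   to convert these crashes into actionable traps in debug builds. */
static void sketch_component_assert_live(const sketch_component *c)
{
    assert(c && c->magic == MAGIC_LIVE);
}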

Comment 20

3 years ago
By any chance, had you woken your machine from sleep shortly before this occurred (or perhaps caught your machine just as it was going to sleep)?
Flags: needinfo?(martijn.martijn)
(Reporter)

Comment 21

3 years ago
(In reply to Byron Campen [:bwc] from comment #20)
> By any chance, had you woken your machine from sleep shortly before this
> occurred (or perhaps caught your machine just as it was going to sleep)?

No, I'm pretty sure that was not the case, because we were constantly trying to get Loop to work.
Flags: needinfo?(martijn.martijn)

Comment 22

3 years ago
(In reply to Randell Jesup [:jesup] from comment #19)
> > I do not see it happening on either 40 or 41. It was also much more rare on
> > 38 than it was on 37. So it seems that the frequency of this crash has been
> > going down significantly, but we haven't yet managed to kill it.
> 
> Since June 1, we have 1 crash in 39.0b3.  That's the only 39 build with
> crashes ever, and the only crash in 39. I'll note we don't have many crashes
> in betas in any build, but 37 and 38 certainly had a bunch more than 1 in
> betas.  (We have no aurora/nightly crashes)
> 
> This is mostly good, but still not knowing what causes this is bad.
> 
> Can we re-analyze, and look to see if there are any asserts/traps/etc we
> could add?

So, I've analyzed this as deeply as I can and come up with nothing other than memory corruption caused by code elsewhere. My suspicion is that all of these weird, intractable crashes down in nICEr have a common cause with bug 1151046, given that all of them seem to have vanished at the same time we put in a speculative fix.
Flags: needinfo?(docfaraday)
Comment 23

2 years ago
Looking at crash-stats today, I don't see any post-38 crashes.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME

Updated

2 years ago
Group: core-security → core-security-release
Group: media-core-security, core-security-release
(Assignee)

Comment 24

2 years ago
Clearing old NI.
Flags: needinfo?(drno)