Closed Bug 1319566 Opened 3 years ago Closed 3 years ago

Crash in nsTArray_Impl<T>::DestructRange | nsTArray_Impl<T>::RemoveElementsAt | mozilla::MediaEngineSource::Deallocate

Categories

(Core :: WebRTC: Audio/Video, defect, P1, critical)

50 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox50 + fixed
firefox51 + fixed
firefox52 + verified
firefox53 + verified

People

(Reporter: philipp, Assigned: jesup, NeedInfo)

Details

(Keywords: crash, csectype-nullptr, regression, Whiteboard: [sg:dos])

Crash Data

Attachments

(4 files)

This bug was filed from the Socorro interface and is 
report bp-51f223b0-48eb-4bae-b05f-561bc2161122.
=============================================================
Crashing Thread (35)
Frame 	Module 	Signature 	Source
0 	xul.dll 	nsTArray_Impl<RefPtr<mozilla::MediaEngineSource::AllocationHandle>, nsTArrayInfallibleAllocator>::DestructRange(unsigned int, unsigned int) 	obj-firefox/dist/include/nsTArray.h:1861
1 	xul.dll 	nsTArray_Impl<RefPtr<mozilla::MediaEngineSource::AllocationHandle>, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned int, unsigned int) 	obj-firefox/dist/include/nsTArray.h:1905
2 	xul.dll 	mozilla::MediaEngineSource::Deallocate(mozilla::MediaEngineSource::AllocationHandle*) 	dom/media/webrtc/MediaEngine.h:248
3 	xul.dll 	mozilla::MediaEngineWebRTCMicrophoneSource::Deallocate(mozilla::MediaEngineSource::AllocationHandle*) 	dom/media/webrtc/MediaEngineWebRTCAudio.cpp:400
4 	xul.dll 	mozilla::MediaEngineWebRTCMicrophoneSource::Shutdown() 	dom/media/webrtc/MediaEngineWebRTCAudio.cpp:823
5 	xul.dll 	mozilla::MediaEngineWebRTC::Shutdown() 	dom/media/webrtc/MediaEngineWebRTC.cpp:438
6 	xul.dll 	`mozilla::MediaManager::Shutdown'::`2'::ShutdownTask::Run 	dom/media/MediaManager.cpp:2914
7 	xul.dll 	MessageLoop::RunTask(already_AddRefed<mozilla::Runnable>) 	ipc/chromium/src/base/message_loop.cc:346
8 	xul.dll 	MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask&&) 	ipc/chromium/src/base/message_loop.cc:354
9 	xul.dll 	MessageLoop::DoWork() 	ipc/chromium/src/base/message_loop.cc:429
10 	xul.dll 	mozilla::ipc::MessagePumpForNonMainUIThreads::DoRunLoop() 	ipc/glue/MessagePump.cpp:423
11 	xul.dll 	base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate*, base::MessagePumpWin::Dispatcher*) 	ipc/chromium/src/base/message_pump_win.cc:56

crashes with this signature are regressing in 32bit firefox 50 builds and later on windows - could this be related to bug 1286096?

some of the user comments are saing they had their camera/microphone in use or were trying to give access:
* Attempted to share devices on facebook (camera and microphone), kept not remembering the device settings, reprompting permissions constantly. It then crashed.
https://crash-stats.mozilla.com/report/index/458384ab-31ba-4bea-b737-4bf242160926
* i opened facebook call,so had given ff all permissions,but the next time i tried a call, they ask me permission again and know what,no matter how many times i click on share devices ,it wont work
https://crash-stats.mozilla.com/report/index/22bc49fe-25aa-4c22-9b62-fd30e2161113
* La camara permaneció activa 2 horas después de acabar una video llamada, al cerrar el navegador desplego esta pantalla
https://crash-stats.mozilla.com/report/index/c21c7d62-3b21-40f2-9567-f89432161120
* it doesnt work the camera and the micraphne in mozila firefox
https://crash-stats.mozilla.com/report/index/f58067f3-b81c-4d5a-86e7-046712161122#tab-details


Correlations for Firefox Beta:
(100.0% in signature vs 02.84% overall) shutdown_progress = profile-before-change
(100.0% in signature vs 03.94% overall) reason = EXCEPTION_ACCESS_VIOLATION_WRITE
(55.10% in signature vs 00.17% overall) address = 0x2
(28.57% in signature vs 00.26% overall) address = 0x1
(16.33% in signature vs 00.05% overall) address = 0x6
Flags: needinfo?(jib)
Thanks, I'll take a look.
Assignee: nobody → jib
Rank: 15
Flags: needinfo?(jib)
Priority: -- → P1
Reproduced it: bp-4c840c49-c091-4b34-b442-cac3e2161123

STR (in FF50 on Windows 10):

1. Open https://jsfiddle.net/wemt0fvs/
2. Click "Share Selected Devices" once (don't choose "Always Share").
3. When video appears, click the yellow (not gray!) camera icon near URL and choose "Stop sharing".
4. Now click the gray camera icon and choose "Share Selected Devices".

Expected result: One frozen video, and a second live one.

Actual result: One frozen video, and no second video. When I quit Firefox, it crashes.
Comment 2 also happens on OSX, the signature is just different: bp-84fd5fd0-61d7-4834-8231-1f8122161123

[@ RefPtr<T>::~RefPtr | mozilla::MediaEngineSource::Deallocate ]
OS: Windows → All
Hardware: x86 → All
Oh, and it's a null-deref, so security bug.
Group: core-security
Fixes crashing cleanup code on shutdown that apparently only runs when microphone sources "leak" in the sense that they somehow escape normal page cleanup.

This takes care of the crash, but obviously there's something else wrong with the source going missing in the first place with the STRs in comment 2, but that can probably be addressed in a separate bug.
Crash Signature: [@ nsTArray_Impl<T>::DestructRange | nsTArray_Impl<T>::RemoveElementsAt | mozilla::MediaEngineSource::Deallocate] → [@ nsTArray_Impl<T>::DestructRange | nsTArray_Impl<T>::RemoveElementsAt | mozilla::MediaEngineSource::Deallocate] [@ RefPtr<T>::~RefPtr | mozilla::MediaEngineSource::Deallocate]
Group: core-security
Whiteboard: [sg:dos]
Jib: The removing of FreeChannel (and DeInit) - what's the justification for that?  Would that not cause a leak?  (and leaking those might leave threads running until shutdown, which can cause shutdown crashes, and also chew CPU until then)
Flags: needinfo?(jib)
Tracking 53+ for this reproducible crash.
Comment on attachment 8813543 [details] [diff] [review]
fixnull50.patch - Immediate fix for 50

Review of attachment 8813543 [details] [diff] [review]:
-----------------------------------------------------------------

I believe I've convinced myself that this safely avoids the nullptr crash or a leak.  I don't know why this was ever passing nullptr to Deallocated() - probably didn't get caught when we started supporting audio constraints.
Attachment #8813543 - Flags: review?(rjesup) → review+
Comment on attachment 8813544 [details] [diff] [review]
fixnull.patch - Immediate fix for 51-53

Review of attachment 8813544 [details] [diff] [review]:
-----------------------------------------------------------------

Actually, r- because we RemoveElementAt() the element inside MediaEngine::Deallocate(), but are iterating the same array in this patch.
We need to stick with a while() loop, and just Deallocate the first entry until the array is empty.
Attachment #8813544 - Flags: review?(rjesup) → review-
Attachment #8813543 - Flags: review+ → review-
Assignee: jib → rjesup
Status: NEW → ASSIGNED
Comment on attachment 8815784 [details] [diff] [review]
ensure all registered handles are cleared properly

Review of attachment 8815784 [details] [diff] [review]:
-----------------------------------------------------------------

We need to make this more understandable soon.
Attachment #8815784 - Flags: review?(padenot) → review+
Attachment #8815786 - Flags: review?(padenot) → review+
Pushed by rjesup@wgate.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/69bdb4f40c1e
ensure all registered handles are cleared properly r=padenot
Comment on attachment 8815784 [details] [diff] [review]
ensure all registered handles are cleared properly

Approval Request Comment
[Feature/Bug causing the regression]: Audio Constraints landing

[User impact if declined]: Shutdown crashes in some cases where media is still flowing

[Is this code covered by automated tests?]: largely yes; the specific shutdown case now (requires manual actions)

[Has the fix been verified in Nightly?]: yes in local builds; waiting for it to merge to central

[Needs manual test from QE? If yes, steps to reproduce]: See comment 2

[List of other uplifts needed for the feature/fix]: none

[Is the change risky?]: No

[Why is the change risky/not risky?]: Simple change that ensures the callback list MUST be empty before finishing shutdown, and avoids trying to shutdown nullptr.

[String changes made/needed]: none
Attachment #8815784 - Flags: approval-mozilla-beta?
Attachment #8815784 - Flags: approval-mozilla-aurora?
Comment on attachment 8815786 [details] [diff] [review]
ensure all registered handles are cleared properly (release patch)

Approval Request Comment
[Feature/Bug causing the regression]: Audio Constraints landing

[User impact if declined]: Shutdown crashes in some cases where media is still flowing

[Is this code covered by automated tests?]: largely yes; the specific shutdown case now (requires manual actions)

[Has the fix been verified in Nightly?]: yes in local builds; waiting for it to merge to central

[Needs manual test from QE? If yes, steps to reproduce]: See comment 2

[List of other uplifts needed for the feature/fix]: none

[Is the change risky?]: No

[Why is the change risky/not risky?]: Simple change that ensures the callback list MUST be empty before finishing shutdown, and avoids trying to shutdown nullptr.

[String changes made/needed]: none

I know this was suggested for 50, so I'm putting it up for consideration.  I would only really take it if we feel the crash volume warrants, as it's a shutdown crash.  Right now due to it being shutdown I'd probably suggest passing.
Attachment #8815786 - Flags: approval-mozilla-release?
Track 51+ as crashes in 51 beta.
https://hg.mozilla.org/mozilla-central/rev/69bdb4f40c1e
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment on attachment 8815784 [details] [diff] [review]
ensure all registered handles are cleared properly

Fix a crash. Beta51+ and Aurora52+. Should be in 51 beta 6.
Attachment #8815784 - Flags: approval-mozilla-beta?
Attachment #8815784 - Flags: approval-mozilla-beta+
Attachment #8815784 - Flags: approval-mozilla-aurora?
Attachment #8815784 - Flags: approval-mozilla-aurora+
Flags: qe-verify+
Comment on attachment 8815786 [details] [diff] [review]
ensure all registered handles are cleared properly (release patch)

Low risk fix, sec issue, let's include it in 50.1.0
Attachment #8815786 - Flags: approval-mozilla-release? → approval-mozilla-release+
I was not able to reproduce the initial crash, I tried on Mac OS X 10.12.1, Windows 10 64-bit and Ubuntu 16.04 64-bit using old build (50.0 RC). 
Though on Ubuntu I got another crash which is fixed in latest builds:
bp-99bb2648-a972-4116-8ee2-7bec12161227

Since on the new builds the UI for device permissions has changed, I am not able to retest using the same steps to see if it's fixed now. If at step 3 from STR posted on comment 2 I hit the 'red' camera Icon and click on 'X' attributed to 'Use the camera' the doorhanger will contain the message: 'You may need to reload the page for changes to apply'. If I don't restart the page and start the second video, it will not work as before, and if I exit Firefox no crashes will be recorded.

Looking at the crash-stats I see that there are still crashes recorded in the last 7 days, 35.7% of them using latest Release Fx 50.1.0 (15 crashes):
https://crash-stats.mozilla.com/signature/?product=Firefox&signature=nsTArray_Impl%3CT%3E%3A%3ADestructRange%20%7C%20nsTArray_Impl%3CT%3E%3A%3ARemoveElementsAt%20%7C%20mozilla%3A%3AMediaEngineSource%3A%3ADeallocate&date=%3E%3D2016-12-20T13%3A04%3A52.000Z&date=%3C2016-12-27T13%3A04%3A52.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#summary

Should we be concerned about this?
Flags: needinfo?(rjesup)
There are still a few crashes recorded but none on Fx 52 beta or 53.0a2. So I'll go ahead and mark the tracking flag to verified for 52 and 53.
You need to log in before you can comment on or make changes to this bug.