Closed Bug 1339116 Opened 3 years ago Closed 3 years ago

Crash in RefPtr<T>::~RefPtr<T> | mozilla::dom::workers::ServiceWorkerPrivate::NoteIdleWorkerCallback

Categories

(Core :: DOM: Service Workers, defect, critical)

x86
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox-esr45 --- disabled
firefox51 --- wontfix
firefox52 + fixed
firefox-esr52 --- fixed
firefox53 + fixed
firefox54 + fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

Details

(Keywords: crash, sec-high, Whiteboard: [post-critsmash-triage][adv-main52+])

Crash Data

Attachments

(1 file, 3 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-50829121-ea96-436a-a965-b82b32170213.
=============================================================

ServiceWorkerPrivate::ResetIdleTimeout() needs to hold a strong ref to the ServiceWorkerPrivate while the timer is registered.

This is a UAF, so marking secure.  Its probably hard to force in practice, though.
Attached patch patch (obsolete) — Splinter Review
This patch makes ServiceWorkerPrivate use nsITimerCallback objects instead of a function with a closure.  This allows the nsITimer to hold a strong ref to the SWP and keep it alive.  Previously we were racing the timer against the destruction of the SWP.

In theory the SWP cancels the timer before destruction, but there is a race where the SWP is destroyed while the timer callback runnable is stuck in the event queue.

An alternative would be to call AddRef()/Release() manually on the closure, but I think the callback object is more explicit.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ba32c0bb094eb6f3ac9350bbeaa61ea117ea1ac8
Attachment #8836855 - Flags: review?(bugmail)
Attached patch patch (obsolete) — Splinter Review
Sorry.  Slight change to clear the SWP ref after we fire the method in the timer callback.  It should help avoid the possibility of releasing the SWP on the wrong thread.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1bb5eb9d75501beba3c217621bf960c1807c2d9e
Attachment #8836855 - Attachment is obsolete: true
Attachment #8836855 - Flags: review?(bugmail)
Attachment #8836858 - Flags: review?(bugmail)
Comment on attachment 8836858 [details] [diff] [review]
patch

Review of attachment 8836858 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +1942,5 @@
> +
> +  NS_IMETHOD
> +  Notify(nsITimer* aTimer)
> +  {
> +    (mServiceWorkerPrivate->*mMethod)(aTimer);

My local patch and the try build have:

  mServiceWorkerPrivate = nullptr;

here.  For some reason my SMB share to my build server is not working, so I can't update the patch easily here.
Comment on attachment 8836855 [details] [diff] [review]
patch

Review of attachment 8836855 [details] [diff] [review]:
-----------------------------------------------------------------

Ah, when reading the code I'd glossed over this either assuming the timer code's support or late cancellation and its use of a mutex was sufficient or that the SWP life-cycle invariants had some other protection in place.  But timer-wise, of course it's not, it just narrows the danger zone.  nsTimerImpl::Fire leaves the mutex after stealing the callback out.  Once that happens, the race is on and the callback really needed to be holding a strong reference.
Attachment #8836855 - Attachment is obsolete: false
Comment on attachment 8836855 [details] [diff] [review]
patch

Interesting; I hadn't realized what mid-airs on attachments looked like.  I got a "suspicious activity" warning where I guess they must use a nonce rather than an epoch so they can't distinguish attacks from mid-airs.  Re-obsoleting.
Attachment #8836855 - Attachment is obsolete: true
Attachment #8836858 - Flags: review?(bugmail) → review+
Looking further, normally this situation would be prevented by nsTimerImpl code.  If you call Cancel() on the target thread, then its guaranteed not to call the callback.

In our case, the target thread is main thread.  The ServiceWorkerPrivate, however, can be destroyed and call nsTimerImpl::Cancel() on the worker thread.  This permits a window where the ServiceWorkerPRivate destruction can occur during nsTimerImpl::Fire() when it has extracted the callbacks and dropped the lock.  The callback object becomes garbage.

Holding the strong ref from the callback in nsTimerImpl should prevent this.
Attached patch patch (obsolete) — Splinter Review
Fix a build problem try found.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=363705bf65d022103475fb724f8a4da0a5b473e8
Attachment #8836858 - Attachment is obsolete: true
Attachment #8836885 - Flags: review+
Taking a stab at security rating as sec-high.  This is a UAF, but I think its really hard to trigger.  You have to race the teardown of a service worker with the idle timer.
Keywords: sec-high
Add a vague commit message.
Attachment #8836885 - Attachment is obsolete: true
Attachment #8836894 - Flags: review+
Comment on attachment 8836894 [details] [diff] [review]
Improve the ServiceWorkerPrivate timer code. r=asuth

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

I think it would be difficult.  Its not obvious that the purpose here is to add a strong ref.  In addition, you would need to figure out how to race the timer thread against the ServceWorkerPrivate destruction on separate threads.  Its perhaps possible, but I don't think its trivial.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

Its adding a strong ref, but I don't think its obviously the purpose here.

Which older supported branches are affected by this flaw?

All service worker related branches.  Basically all of them, except for ESR where its disabled.

If not all supported branches, which bug introduced the flaw?

All of them except ESR.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

This should backport relatively cleanly.

How likely is this patch to cause regressions; how much testing does it need?

Minimal chance of regression.  We're basically switching from the "func + void*" variant of nsITimer to the "ref-counted callback object" variant.  The overall code structure has not changed.
Attachment #8836894 - Flags: sec-approval?
Attachment #8836894 - Flags: approval-mozilla-beta?
Attachment #8836894 - Flags: approval-mozilla-aurora?
sec-approval+ and I'll approve branches as well.
Attachment #8836894 - Flags: sec-approval?
Attachment #8836894 - Flags: sec-approval+
Attachment #8836894 - Flags: approval-mozilla-beta?
Attachment #8836894 - Flags: approval-mozilla-beta+
Attachment #8836894 - Flags: approval-mozilla-aurora?
Attachment #8836894 - Flags: approval-mozilla-aurora+
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
> https://hg.mozilla.org/releases/mozilla-esr52/rev/1e3e791a5d2e

Thanks, although I think we are going to disable there.  It just hasn't happened yet.
https://hg.mozilla.org/mozilla-central/rev/8160ed6f7ef5
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Group: core-security → core-security-release
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main52+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.