Closed Bug 1329989 Opened 3 years ago Closed 3 years ago

Crash [@ mozilla::dom::workers::ServiceWorkerManager::NotifyServiceWorkerRegistrationRemoved ]

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set

Tracking

()

VERIFIED FIXED
mozilla53
Tracking Status
firefox-esr45 --- disabled
firefox50 --- wontfix
firefox51 + verified
firefox52 + verified
firefox53 + verified

People

(Reporter: cbook, Assigned: bkelly)

References

(Blocks 1 open bug, )

Details

(Keywords: crash, csectype-uaf, sec-critical, Whiteboard: [post-critsmash-triage][adv-main51+])

Attachments

(3 files)

Attached file bughunter stack
Bughunter found crashes with Signature on windows trunk builds 

mozilla::dom::workers::ServiceWorkerManager::NotifyServiceWorkerRegistrationRemoved mozilla::dom::workers::ServiceWorkerManager::RemoveScopeAndRegistration mozilla::dom::workers::ServiceWorkerManager::RemoveRegistration mozilla::dom::workers::ServiceWorkerManager::MaybeRemoveRegistration mozilla::dom::workers::ServiceWorkerUpdateJob::FailUpdateJob

on a url like https://weather.com/weather/5day/l/10535:4:US and https://m.aliexpress.com/item/32686488037.html?aff_click_id=06e9b06df94d46969abe550788d8150f-1452542766604-05542-UneMJZVf&aff_platform=y

i was not able to reproduce (maybe because i'm a different region and not US based) but seems bughunter is able to.

Also this crashes especially on weather.com are marked as exploitable due to the crash adress so worth filing
Ben, could you take a look at this ?
Flags: needinfo?(bkelly)
I'll take a look.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(bkelly)
Carsten, what version of FF was this?
Flags: needinfo?(cbook)
My guess is this is related to the AsyncStopListeningRunnable used for the ServiceWorkerRegistrationWorkerThread:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerRegistration.cpp#1268

It seems wholely unsafe because the worker can shutdown right after this and it won't try to perform the sync unregister.
Actually, comment 4 makes no sense.  Please ignore it.

Note, I don't see any instances of this crash in crash-stats.  Still investigating.
Are there any instructions on how to run bughunter locally?  I'm not familiar with it at all.
This appears to be a shutdown crash:

10  xul.dll!NS_ProcessPendingEvents(nsIThread *,unsigned int) [nsThreadUtils.cpp:d192a99be4b4 : 332 + 0xc]
    eip = 0x613952b2   esp = 0x0022f978   ebp = 0x0022f990
    Found by: call frame info
11  xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *) [XPCOMInit.cpp:d192a99be4b4 : 916 + 0x8]
    eip = 0x61390d17   esp = 0x0022f998   ebp = 0x0022f9c8
    Found by: call frame info
12  xul.dll!ScopedXPCOMStartup::~ScopedXPCOMStartup() [nsAppRunner.cpp:d192a99be4b4 : 1414 + 0x7]
    eip = 0x6283391a   esp = 0x0022f9d0   ebp = 0x0022f9e8
    Found by: call frame info

Running this step:

https://dxr.mozilla.org/mozilla-central/source/xpcom/build/XPCOMInit.cpp#916

This is after the xpcom-shutdown observer topic which means we have canceled jobs here:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp#3557

This means the crash stop entered FailUpdateJob() because Canceled() returns true here:

https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerUpdateJob.cpp#539

From there I'm not sure why it crashes.

I thought perhaps something was not removing itself from the mServiceWorkerRegistrationListeners.  This would happen if ServiceWorkerManager::GetInstance() was returning nullptr at this point, but that should not be cleared until xpcom-shutdown-final.
I think I will need better steps to reproduce or at least some more crash reports to investigate this further.
(In reply to Ben Kelly [:bkelly] from comment #6)
> Are there any instructions on how to run bughunter locally?  I'm not
> familiar with it at all.

yeah to run it like the test instance you need to run :

firefox -spider -start -quit -hook http://sisyphus.bughunter.ateam.scl3.mozilla.com/bughunter/media/userhooks/test-crash-on-load.js -url https://weather.com/weather/5day/l/10535:4:US

You may need to repeat this several times to get it to reproduce.

Get Spider from http://bclary.com/projects/spider/spider-0.1.0.3-sm+tb+fx+an+fn.xpi and the user hook is also in https://bugzilla.mozilla.org/attachment.cgi?id=8763001 (in case you want to run this locally or no vpn access
Flags: needinfo?(cbook)
(In reply to Ben Kelly [:bkelly] from comment #8)
> I think I will need better steps to reproduce 

indeed i was now able to reproduce this with the steps from comment #9

[Parent 4364] ###!!! ASSERTION: Failed to dispatch stop listening runnable!: 'Error', file c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/dom/workers/ServiceWorkerRegistration.cpp, line 1276
Assertion failure: !mListeningForEvents, at c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/dom/workers/ServiceWorkerRegistration.cpp:1054
(In reply to Carsten Book [:Tomcat] from comment #10)
> (In reply to Ben Kelly [:bkelly] from comment #8)
> > I think I will need better steps to reproduce 
> 
> indeed i was now able to reproduce this with the steps from comment #9
> 
> [Parent 4364] ###!!! ASSERTION: Failed to dispatch stop listening runnable!:
> 'Error', file
> c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/dom/workers/
> ServiceWorkerRegistration.cpp, line 1276
> Assertion failure: !mListeningForEvents, at
> c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/dom/workers/
> ServiceWorkerRegistration.cpp:1054

note that was the recent tinderbox m-c windows debug build (on a windows 7 vm). Seems also to be a windows crash only
Ok, that makes sense.  My guess from comment 4 was close, but its actually due to the SyncStopListeningRunnable failing to dispatch.
So I don't think we need the sync runnable at all.  We can safely use a normal runnable here because the runnable holds the mListener alive.  This avoids the possibility of the sync runnable failing during worker shutdown.

While I was here I simplified the overall runnable scheme using NewRunnableMethod.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=87068eb85c55086b709d034b3fbd526cd32f59a7
Attachment #8825924 - Flags: review?(amarchesini)
Attachment #8825924 - Attachment is patch: true
Carsten, can you retest with the try builds from comment 13?  My nightly build would not let me install the spider xpi for some reason.
Flags: needinfo?(cbook)
See Also: → 1328642
I'm marking this sec-critical instead of -high because I'm assuming it can happen during regular worker shut down and not just at browser shut down time as in the stack from comment 0.
(In reply to Ben Kelly [:bkelly] from comment #14)
> Carsten, can you retest with the try builds from comment 13?  My nightly
> build would not let me install the spider xpi for some reason.

Yeah i noticed that too (i guess the xpi also here https://hg.mozilla.org/automation/sisyphus/file/tip/xpi/all) but in any case tested the try build and does not crash anymore \o/
Flags: needinfo?(cbook)
Attached file leak
andrew: i have a question regarding the testing i did. Using the steps to reproduce of this bug here was firing at the end when firefox was shutdown a leaking the world report. I guess this is unreleated to this fix but should i report this too in a new bug ?
Flags: needinfo?(continuation)
Group: core-security → dom-core-security
Attachment #8825924 - Flags: review?(amarchesini) → review+
[Tracking Requested - why for this release]:

sec-high and bughunter opt builds affected
Track 51+/52+/53+ as sec-critical.
(In reply to Carsten Book [:Tomcat] from comment #17)> andrew: i have a question regarding the testing i did. Using the steps to
> reproduce of this bug here was firing at the end when firefox was shutdown a
> leaking the world report. I guess this is unreleated to this fix but should
> i report this too in a new bug ?

We shutdown workers at browser shutdown, so that is likely a nice way to force the issue.  We stop workers all the time, though, so it can happen when we are not shutting down as well.
Note, service workers are disabled in ESR54.
Comment on attachment 8825924 [details] [diff] [review]
Simplify the ServiceWorkerRegistrationWorkerThread runnables. r=baku

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

I believe you can force the issue by constructing a ServiceWorker that holds a ServiceWorkerRegistration object alive until the worker thread is killed.  Killing the worker thread can be arranged via forcing service worker updates.  This would then leave garbage memory pretty reliably in the list tracking worker registration objects.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

No.  The patch is written as more of a simplifying refactor.  It includes some other changes.  Also you must make the leap that worker shutdown will cause the SyncStopListeningRunnable to fail to dispatch which will leave garbage memory.  I don't think that is obvious at first glance.

Which older supported branches are affected by this flaw?

I believe this code was introduced in bug 1131327 which landed in FF40.  I don't think SW were enabled in that release, though.  It likely affects all of our current branches, but not ESR since we disable SW there.

If not all supported branches, which bug introduced the flaw?

Bug 1131327

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

I believe the patches will uplift easily.  I'll verify.  Perhaps changes to mozilla::NewRunnableMethod() will require some rebasing.

How likely is this patch to cause regressions; how much testing does it need?

I think there is minimal chance of regressions.  The object lifetimes are clear here and we're really just removing the chance of failing the cleanup.
Attachment #8825924 - Flags: sec-approval?
Attachment #8825924 - Flags: approval-mozilla-beta?
Attachment #8825924 - Flags: approval-mozilla-aurora?
12 days until FF51 hits release.  Is it ok to leave it in release channel that long after landing the patch in beta/aurora?
(In reply to Carsten Book [:Tomcat] from comment #17)
> andrew: i have a question regarding the testing i did. Using the steps to
> reproduce of this bug here was firing at the end when firefox was shutdown a
> leaking the world report. I guess this is unreleated to this fix but should
> i report this too in a new bug ?

That's probably a good idea. Just file it as a security bug and add sec-other as a keyword.
Flags: needinfo?(continuation)
Comment on attachment 8825924 [details] [diff] [review]
Simplify the ServiceWorkerRegistrationWorkerThread runnables. r=baku

sec-approval+.

Liz Henry says we can take this on branches to so I'm approving there.
Attachment #8825924 - Flags: sec-approval?
Attachment #8825924 - Flags: sec-approval+
Attachment #8825924 - Flags: approval-mozilla-beta?
Attachment #8825924 - Flags: approval-mozilla-beta+
Attachment #8825924 - Flags: approval-mozilla-aurora?
Attachment #8825924 - Flags: approval-mozilla-aurora+
(In reply to Andrew McCreight [:mccr8] from comment #24)
> (In reply to Carsten Book [:Tomcat] from comment #17)
> That's probably a good idea. Just file it as a security bug and add
> sec-other as a keyword.

filed as bug 1330903
https://hg.mozilla.org/mozilla-central/rev/17a962c65f48
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
The uplifts look good to me.  Thanks!
Flags: needinfo?(bkelly)
Also, just want to say a huge thanks to Carsten and bughunter for finding this!
Whiteboard: [post-critsmash-triage]
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main51+]
Group: dom-core-security → core-security-release
(In reply to Carsten Book [:Tomcat] from comment #9)
> yeah to run it like the test instance you need to run :
> firefox -spider -start -quit -hook
> http://sisyphus.bughunter.ateam.scl3.mozilla.com/bughunter/media/userhooks/
> test-crash-on-load.js -url https://weather.com/weather/5day/l/10535:4:US
Reproduced the assertion on m-c tinderbox debug build 53.0a1 (2017-01-10), Win 7 x64.
Verified fixed 51.0.1, 52.0a2 (2017-01-22), 53.0a1 (2017-01-23) tinderbox debug builds.
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.