Closed Bug 1346507 Opened 4 years ago Closed 4 years ago

Iterator invalidation in FlyWebService::Observe()

Categories

(Core Graveyard :: DOM: Flyweb, defect)

x86_64
Linux
defect
Not set
critical

Tracking

(firefox52 disabled, firefox-esr52 disabled, firefox53 disabled, firefox54 disabled, firefox55 fixed)

RESOLVED FIXED
mozilla55
Tracking Status
firefox52 --- disabled
firefox-esr52 --- disabled
firefox53 --- disabled
firefox54 --- disabled
firefox55 --- fixed

People

(Reporter: geeknik, Assigned: djvj)

Details

(4 keywords, Whiteboard: [post-critsmash-triage])

Attachments

(1 file, 2 obsolete files)

Triggered while fuzzing ASAN Nightly (Build ID 20170310165946) with http://lcamtuf.coredump.cx/ref_fuzz5.html.

MulticastDNS: _advertiseService(): key=12/_flyweb._tcp|16/[object DOMQuad]|43953
MulticastDNS: _onServiceFound()[object DOMQuad]
MulticastDNS: _advertiseService(): key=12/_flyweb._tcp|16/[object DOMQuad]|43953
MulticastDNS: _checkStartBroadcastTimer()
MulticastDNS: _checkStartBroadcastTimer(): Scheduling next check in 92894ms
MulticastDNS: unregisterService(): Hello world
MulticastDNS: unregisterService(): [object HTMLCollection]
ASAN:DEADLYSIGNAL
=================================================================
==99940==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000050dcb6 bp 0x7ffeeb1b4370 sp 0x7ffeeb1b4200 T0)
    #0 0x50dcb5 in MOZ_CrashPrintf /home/worker/workspace/build/src/mfbt/Assertions.cpp:63 (discriminator 1)
    #1 0x50dcb5 in ?? ??:0
    #2 0x7f8c99a9ce0f in _Z23InvalidArrayIndex_CRASHmm /home/worker/workspace/build/src/xpcom/ds/nsTArray.cpp:26 (discriminator 1)
    #3 0x7f8c99a9ce0f in ?? ??:0
    #4 0x7f8c9e55f1b6 in ElementAt /home/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:1054 (discriminator 6)
    #5 0x7f8c9e55f1b6 in operator* /home/worker/workspace/build/src/obj-firefox/dist/include/mozilla/ArrayIterator.h:86 (discriminator 6)
    #6 0x7f8c9e55f1b6 in Observe /home/worker/workspace/build/src/dom/flyweb/FlyWebService.cpp:1109 (discriminator 6)
    #7 0x7f8c9e55f1b6 in ?? ??:0
    #8 0x7f8c99a87762 in NotifyObservers /home/worker/workspace/build/src/xpcom/ds/nsObserverList.cpp:112 (discriminator 1)
    #9 0x7f8c99a87762 in ?? ??:0
    #10 0x7f8c99a8b2d4 in NotifyObservers /home/worker/workspace/build/src/xpcom/ds/nsObserverService.cpp:281
    #11 0x7f8c99a8b2d4 in ?? ??:0
    #12 0x7f8c9c37addc in Run /home/worker/workspace/build/src/dom/base/nsGlobalWindow.cpp:9391 (discriminator 3)
    #13 0x7f8c9c37addc in ?? ??:0
    #14 0x7f8c99b73f62 in ProcessNextEvent /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1264 (discriminator 1)
    #15 0x7f8c99b73f62 in ?? ??:0
    #16 0x7f8c99b70810 in _Z19NS_ProcessNextEventP9nsIThreadb /home/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:389
    #17 0x7f8c99b70810 in ?? ??:0
    #18 0x7f8c9a982a6f in Run /home/worker/workspace/build/src/ipc/glue/MessagePump.cpp:96 (discriminator 1)
    #19 0x7f8c9a982a6f in ?? ??:0
    #20 0x7f8c9a8f4678 in RunInternal /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:238 (discriminator 1)
    #21 0x7f8c9a8f4678 in RunHandler /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:231 (discriminator 1)
    #22 0x7f8c9a8f4678 in Run /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:211 (discriminator 1)
    #23 0x7f8c9a8f4678 in ?? ??:0
    #24 0x7f8c9fde000f in _ZN14nsBaseAppShell3RunEv /home/worker/workspace/build/src/widget/nsBaseAppShell.cpp:156 (discriminator 1)
    #25 0x7f8c9fde000f in ?? ??:0
    #26 0x7f8ca34470f1 in Run /home/worker/workspace/build/src/toolkit/components/startup/nsAppStartup.cpp:283 (discriminator 2)
    #27 0x7f8ca34470f1 in ?? ??:0
    #28 0x7f8ca361730c in XRE_mainRun /home/worker/workspace/build/src/toolkit/xre/nsAppRunner.cpp:4476 (discriminator 1)
    #29 0x7f8ca361730c in ?? ??:0
    #30 0x7f8ca3618e0c in XRE_main /home/worker/workspace/build/src/toolkit/xre/nsAppRunner.cpp:4654
    #31 0x7f8ca3618e0c in ?? ??:0
    #32 0x7f8ca361a20c in XRE_main /home/worker/workspace/build/src/toolkit/xre/nsAppRunner.cpp:4745 (discriminator 1)
    #33 0x7f8ca361a20c in ?? ??:0
    #34 0x4dffaf in do_main /home/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:236 (discriminator 1)
    #35 0x4dffaf in main /home/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:307 (discriminator 1)
    #36 0x4dffaf in ?? ??:0
    #37 0x7f8cb3822b44 in __libc_start_main /build/glibc-qK83Be/glibc-2.19/csu/libc-start.c:287
    #38 0x7f8cb3822b44 in ?? ??:0
    #39 0x41c3d8 in _start ??:?
    #40 0x41c3d8 in ?? ??:0

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/home/geeknik/firefox/firefox+0x50dcb5)
==99940==ABORTING
Flags: sec-bounty?
This looks like it is hitting an intentional crash due to an out of bounds array access in FlyWeb code.

That is happening in this loop:
  for (FlyWebPublishedServer* server : mServers) {
    if (server->OwnerWindowID() == innerID) {
      server->Close();
    }
  }

I'd guess that closing a server can somehow cause mServers to shrink. While this actual crash is safe, destroying an array while iterating over it might possibly cause other unsafe issues that aren't caught.
Component: MFBT → DOM: Flyweb
Summary: null pointer dereference and segfault in MOZ_CrashPrintf() → Invalid array access in FlyWebService::Observe()
Summary: Invalid array access in FlyWebService::Observe() → Iterator invalidation in FlyWebService::Observe()
Kannan: can you take a look at this?
Flags: needinfo?(kvijayan)
Here's a different stack trace from ASAN build 20170317180410 with the same fuzzing script.

*** registerProtocolHandler(12,false,1000000)
*** registerProtocolHandler(Infinity,[object DOMRect],6)
*** registerProtocolHandler([object Navigator],Infinity,[object Geolocation])
ASAN:DEADLYSIGNAL
=================================================================
==12949==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000051a986 bp 0x7fffb0643950 sp 0x7fffb06437e0 T0)
==12949==The signal is caused by a WRITE memory access.
==12949==Hint: address points to the zero page.
    #0 0x51a985 in MOZ_CrashPrintf /home/worker/workspace/build/src/mfbt/Assertions.cpp:63 (discriminator 1)
    #1 0x51a985 in ?? ??:0
    #2 0x7fbbe808bd9f in _Z23InvalidArrayIndex_CRASHmm /home/worker/workspace/build/src/xpcom/ds/nsTArray.cpp:26 (discriminator 1)
    #3 0x7fbbe808bd9f in ?? ??:0
    #4 0x7fbbec92147f in ElementAt /home/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:1054 (discriminator 1)
    #5 0x7fbbec92147f in operator* /home/worker/workspace/build/src/obj-firefox/dist/include/mozilla/ArrayIterator.h:86 (discriminator 1)
    #6 0x7fbbec92147f in Observe /home/worker/workspace/build/src/dom/flyweb/FlyWebService.cpp:1109 (discriminator 1)
    #7 0x7fbbec92147f in ?? ??:0
    #8 0x7fbbe80761ac in NotifyObservers /home/worker/workspace/build/src/xpcom/ds/nsObserverList.cpp:112 (discriminator 1)
    #9 0x7fbbe80761ac in ?? ??:0
    #10 0x7fbbe8079ba4 in NotifyObservers /home/worker/workspace/build/src/xpcom/ds/nsObserverService.cpp:281
    #11 0x7fbbe8079ba4 in ?? ??:0
    #12 0x7fbbea85a631 in Run /home/worker/workspace/build/src/dom/base/nsGlobalWindow.cpp:9437 (discriminator 3)
    #13 0x7fbbea85a631 in ?? ??:0
    #14 0x7fbbe815cd2c in ProcessNextEvent /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp:1269 (discriminator 1)
    #15 0x7fbbe815cd2c in ?? ??:0
    #16 0x7fbbe8177ae1 in NS_InvokeByIndex /home/worker/workspace/build/src/xpcom/reflect/xptcall/md/unix/xptcinvoke_asm_x86_64_unix.S:115
    #17 0x7fbbe8177ae1 in ?? ??:0

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/home/geeknik/firefox/firefox+0x51a985)
==12949==ABORTING
Group: core-security → dom-core-security
This is disabled everywhere, but it could be a bad issue if it was enabled, so I'll mark this sec-high.
Keywords: sec-high
FlyWeb is relatively inactive, but we can't have these issues hanging around.
If Kannan isn't looking, can you Justin?
Flags: needinfo?(jdarcangelo)
Thanks for bringing this up.  I'll take a look at it.
Flags: needinfo?(kvijayan)
Assignee: nobody → kvijayan
Kannan, can you please bump this in your queue?
By now, this is security bug is over a month old.
It was found by using a publicly available fuzzer and reported by an outside contributor.
Flags: needinfo?(kvijayan)
Kannan said he'll only be able to fix this within a month, but he convinced me this is OK:
FlyWeb is disabled (preff edoff) in NIghtly and not even built (compile time flags) in Release/Beta.

I'm still fine with anyone else taking it before though ;)
Flags: needinfo?(kvijayan)
Attached patch bug-1346507.patch (obsolete) — Splinter Review
Quick fix.  The array was being modified when it was iterated over.  This just make as a copy of the array and iterates over that, avoids the issue.
Flags: needinfo?(jdarcangelo)
Attachment #8859628 - Flags: review?(continuation)
Comment on attachment 8859628 [details] [diff] [review]
bug-1346507.patch

Review of attachment 8859628 [details] [diff] [review]:
-----------------------------------------------------------------

I guess this isn't changing anything, but just as somebody who isn't familiar with the code it is a little concerning to me that this array does not keep a strong reference to the contents.
Attachment #8859628 - Flags: review?(continuation) → review+
Yeah, the lifetimes are managed manually for these.  I forget the exact reason these refs are "weak", but I remember sicking changing this around quite a bit over time.
Attachment #8859628 - Flags: checkin?
Comment on attachment 8859628 [details] [diff] [review]
bug-1346507.patch

Please attach a patch with proper commit information.
Attachment #8859628 - Flags: checkin?
It seems this bug is actionable without a proper test case.
How strongly do we actually need a it?
Keywords: testcase-wanted
Kannan, can you modify the patch to have proper commit information as requested Ryan?
Flags: needinfo?(kvijayan)
We really don't need a test case for this.  It's hard-disabled everywhere except nightly, and requires a default-off pref to be turned on before it's enabled on even nightly.

I'll attach a rebased patch with proper security-approval comment.
Flags: needinfo?(kvijayan)
Attached patch bug-1346507.patch (obsolete) — Splinter Review
[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Very difficult.  This code is only enabled on nightly.  Even on nightly, it requires a pref to be turned on before it's enabled.  Even if that's the case, the attacker would need to force the user to manually modify the browser UI to add the FlyWeb UI element.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

Yes, but not that big of a deal because of comments above.

Which older supported branches are affected by this flaw?

None, since this code is disabled on all release builds.

If not all supported branches, which bug introduced the flaw?

Bug 1272099, Bug 1272101

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

N/A

How likely is this patch to cause regressions; how much testing does it need?

Very unlikely, it's a simple and straightforward change.
Attachment #8859628 - Attachment is obsolete: true
Attachment #8866416 - Flags: sec-approval?
Comment on attachment 8866416 [details] [diff] [review]
bug-1346507.patch

sec-approval+
Attachment #8866416 - Flags: sec-approval? → sec-approval+
Attachment #8866416 - Flags: review?(bugs)
Comment on attachment 8866416 [details] [diff] [review]
bug-1346507.patch

What keeps the entries in serversCopy alive when iterating it?
Usually this kinds of array keep strong references.
Attachment #8866416 - Flags: review?(bugs) → review-
Comment on attachment 8866416 [details] [diff] [review]
bug-1346507.patch

The lifetime of the servers is managed manually.  Jonas wrote this code, and it was r+-ed initially - so I can't speak to the direct reasons for why it's done this way.  I think it has to do with having this be a weak pointer due to how the server structures interact with content code.

Jonas did this thing where depending on whether e10s is enabled or not, the server object types refer to either a wrapped type talking over IPC to chrome process, or directly the server object itself.  It makes the lifetime management more finnicky, and I think he takes a manual approach.

I really don't want to go into fully explaining the reasoning for an approach I didn't write.  I plan on ripping this code out entirely soon enough, and it's disabled entirely on non-nightly builds.

Resubmitting same patch for r? with above explanation.
Attachment #8866416 - Flags: review- → review?(bugs)
Comment on attachment 8866416 [details] [diff] [review]
bug-1346507.patch

Based on IRC, new patch coming.
Attachment #8866416 - Flags: review?(bugs)
Attachment #8866416 - Attachment is obsolete: true
Attachment #8867237 - Flags: review?(continuation)
Updated patch which holds strong pointers to the server objects.
Comment on attachment 8867237 [details] [diff] [review]
bug-1346507.patch

Review of attachment 8867237 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/flyweb/FlyWebService.cpp
@@ +1107,5 @@
>    NS_ENSURE_SUCCESS(rv, rv);
>  
> +  // Make a copy of mServers to iterate over, because closing a server
> +  // can remove entries from mServers.
> +  nsCOMArray<FlyWebPublishedServer> serversCopy;

If you made the type nsTArray<RefPtr<FlyWebPublishedServer>> then you could initialize it by doing serversCopy = servers. Either way is fine.
Attachment #8867237 - Flags: review?(continuation) → review+
Thanks :mccr8.  I'm going to push as is because it builds as is, and I don't want to change the code and try a re-build.  Mostly interested in getting this off my plate ASAP and back to pressing other work.  So if it's correct it's good enough for now.
Yeah, that totally makes sense.
https://hg.mozilla.org/mozilla-central/rev/da65be0462e0
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Lowering severity to sec-moderate: we don't have any evidence this could be turned into something worse than a null deref just worries. 

(In reply to Frederik Braun [:freddyb] from comment #13)
> It seems this bug is actionable without a proper test case.
> How strongly do we actually need a it?

"testcase-wanted" doesn't mean "non-actionable", it means there isn't a testcase. It hints at "non-actionable" wrt security triage, but it also means QA would need to create a testcase in order to verify the bug.
Flags: sec-bounty? → sec-bounty+
Group: dom-core-security → core-security-release
Flags: qe-verify+
Whiteboard: [post-critsmash-triage]
Any chance we could get that test case in time to manually verify the fix on 55.0 and 52.3esr? Or is there another way manual QA could help here?
Flags: needinfo?(kvijayan)
Group: core-security-release
Product: Core → Core Graveyard
Flags: needinfo?(kvijayan)
You need to log in before you can comment on or make changes to this bug.