Last Comment Bug 1321066 - (CVE-2016-9079) Reported Firefox SVG 0-day (Iterator invalidation in nsSMILTimeContainer::NotifyTimeChange())
(CVE-2016-9079)
: Reported Firefox SVG 0-day (Iterator invalidation in nsSMILTimeContainer::Not...
Status: VERIFIED FIXED
[adv-main50+][ESR45.5.1+] Appears to ...
: crash, csectype-uaf, sec-critical, testcase
Product: Core
Classification: Components
Component: SVG (show other bugs)
: Trunk
: All All
-- normal (vote)
: mozilla53
Assigned To: Andrew McCreight [:mccr8]
:
: Jet Villegas (:jet)
Mentors:
: 1321100 (view as bug list)
Depends on:
Blocks: 1321357
  Show dependency treegraph
 
Reported: 2016-11-29 12:12 PST by Daniel Veditz [:dveditz]
Modified: 2016-12-10 08:20 PST (History)
55 users (show)
twsmith: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
verified
+
verified
+
fixed
+
fixed
50+
verified


Attachments
0-day sample -- BE CAREFUL, MAY NOT BE FULLY DE-FANGED (5.12 KB, application/zip)
2016-11-29 12:12 PST, Daniel Veditz [:dveditz]
no flags Details
50.1_log.txt (12.57 KB, text/plain)
2016-11-29 13:04 PST, Tyson Smith [:tsmith]
no flags Details
nightly_log.txt (14.12 KB, text/plain)
2016-11-29 13:05 PST, Tyson Smith [:tsmith]
no flags Details
test_case_fixed.zip (4.89 KB, application/x-zip-compressed)
2016-11-29 13:06 PST, Tyson Smith [:tsmith]
no flags Details
single file test case (2.59 KB, text/html)
2016-11-29 14:06 PST, Andrew McCreight [:mccr8]
no flags Details
Explicitly guard against reentrance in nsSMILTimeContainer (4.90 KB, patch)
2016-11-29 14:07 PST, Andrew McCreight [:mccr8]
no flags Details | Diff | Splinter Review
Explicitly guard against reentrance in nsSMILTimeContainer (3.03 KB, patch)
2016-11-29 14:40 PST, Andrew McCreight [:mccr8]
no flags Details | Diff | Splinter Review
Explicitly guard against reentrance in nsSMILTimeContainer. (4.63 KB, patch)
2016-11-29 14:42 PST, Andrew McCreight [:mccr8]
dholbert: review+
Details | Diff | Splinter Review
Explicitly guard against reentrance in nsSMILTimeContainer. (5.04 KB, patch)
2016-11-29 15:10 PST, Andrew McCreight [:mccr8]
continuation: review+
dveditz: approval‑mozilla‑aurora+
dveditz: approval‑mozilla‑beta+
lhenry: approval‑mozilla‑release+
lhenry: approval‑mozilla‑esr45+
dveditz: sec‑approval+
Details | Diff | Splinter Review
smil_crash.html (1.14 KB, text/html)
2016-11-29 15:26 PST, Tyson Smith [:tsmith]
no flags Details
reduced interactive testcase (325 bytes, text/html)
2016-11-29 15:37 PST, Daniel Holbert [:dholbert] (vacation, returning 2/27)
no flags Details
Video for Firefox 51.0b5 (28 bytes, text/plain)
2016-11-29 23:23 PST, Cynthia Tang [:cynthiatang]
no flags Details
Another more real world like test case (680 bytes, text/html)
2016-11-30 18:06 PST, Yuhong Bao
no flags Details

Description User image Daniel Veditz [:dveditz] 2016-11-29 12:12:26 PST
Created attachment 8815435 [details]
0-day sample -- BE CAREFUL, MAY NOT BE FULLY DE-FANGED

An admin at obscuredfiles.com received this sample as an email attachment. The vulnerability uses SVG animation and workers, and includes what looks like ASLR ROP programming to attack windows. The example is Firefox specific, and the letter writer speculates that some of the Obscured folks are known to use the Tor browser and maybe that's why they were sent this Firefox-specific file.

The cssbanner.js text has been pretty-printed by the obscured folks. The original was minified and they did not send us the original version.

I've commented out the original payload so hopefully this PoC will "work" but be harmless. Would appreciate people being extra careful with it in case there's a landmine hidden in the rop code that I left.
Comment 1 User image Tyson Smith [:tsmith] 2016-11-29 13:04:13 PST
Created attachment 8815445 [details]
50.1_log.txt
Comment 2 User image Tyson Smith [:tsmith] 2016-11-29 13:05:22 PST
Created attachment 8815446 [details]
nightly_log.txt
Comment 3 User image Tyson Smith [:tsmith] 2016-11-29 13:06:35 PST
Created attachment 8815447 [details]
test_case_fixed.zip
Comment 4 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 13:11:05 PST
The nightly_log.txt ASAN stacks seem to be mostly in SMIL timing-model stuff (e.g. nsSMILTimeContainer, nsSMILTimedElement, MilestoneEntry), which birtles knows the most about.

CC'ing him [he may be the best able to get to the bottom of this], but I'll take a closer look as well.
Comment 5 User image Andrew McCreight [:mccr8] 2016-11-29 13:17:31 PST
Just from the stacks, it looks like a case of iterator invalidation in nsSMILTimeContainer::NotifyTimeChange(). We iterate over an array, using a raw pointer to the elements, calling some method, and in some particular case the method we call causes the array to be reallocated, destroying the thing we're holding a pointer to.
Comment 6 User image Andrew McCreight [:mccr8] 2016-11-29 13:18:33 PST
nsTObserverArray is designed to handle this kind of reentrance issue, if it can't be prevented.
Comment 7 User image Andrew McCreight [:mccr8] 2016-11-29 13:21:18 PST
I'm not sure if this is more of a bounds failure or a UAF.
Comment 8 User image Nathan Froyd [:froydnj] 2016-11-29 13:23:19 PST
(In reply to Andrew McCreight [:mccr8] from comment #5)
> Just from the stacks, it looks like a case of iterator invalidation in
> nsSMILTimeContainer::NotifyTimeChange(). We iterate over an array, using a
> raw pointer to the elements, calling some method, and in some particular
> case the method we call causes the array to be reallocated, destroying the
> thing we're holding a pointer to.

We ought to be able to detect that in a debug build, then, or changing the assert in NotifyTimeChange into a release assert.
Comment 9 User image Nathan Froyd [:froydnj] 2016-11-29 13:41:52 PST
Though I guess if it were simply reallocated, but was the same length, the assert wouldn't fire...
Comment 10 User image Andrew McCreight [:mccr8] 2016-11-29 14:06:31 PST
Created attachment 8815468 [details]
single file test case

Here's a stripped down version of the original test case that is a single file and does not have any of the ROP/exploit/shell code in it. It still triggers the issue for me, using a patch I will attach.
Comment 11 User image Andrew McCreight [:mccr8] 2016-11-29 14:07:50 PST
Created attachment 8815469 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer

MozReview-Commit-ID: 9#Feedback: feedbacker@example.com
Comment 12 User image Andrew McCreight [:mccr8] 2016-11-29 14:09:11 PST
This is basically the hackiest fix imaginable: every time we're holding one of the scary raw pointers to mMilestoneEntries, we set a field to true. Every time before we write to mMileStoneEntries, we release assert that the field is not true.
Comment 13 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 14:16:20 PST
Comment on attachment 8815469 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer

Review of attachment 8815469 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for posting this!

IMO it'd make sense to take this patch as a way to neuter this attack for the moment, and then we can separately examine the (complex) logic around the reentrance here in a more measured way.

Several nits:
First, the extended commit message has "MozReview-Commit-ID: 9#Feedback: feedbacker@example.com" on its last line. That looks like a mistake? MozReview-Commit-ID is supposed to contain a longer UUID (longer than "9" :)) -- though the UUID doesn't matter here since this isn't using MozReview.  And I've never seen this "Feedback: feedbacker@example.com" in patch headers before, but it doesn't seem particularly useful.

::: dom/smil/nsSMILTimeContainer.cpp
@@ +277,5 @@
>  void
>  nsSMILTimeContainer::Traverse(nsCycleCollectionTraversalCallback* aCallback)
>  {
> +  MOZ_RELEASE_ASSERT(!mHoldingEntries); // XXX technically this might be okay, but then we'd have to track a depth.
> +  mHoldingEntries = true;

We should use a RAII AutoRestore here (and in the other place where we set this to true for a little while).  Then we won't have to worry about unsetting it in possible-future-early-return cases.

::: dom/smil/nsSMILTimeContainer.h
@@ +291,5 @@
>    // taken care of the milestones before the current sample time but before we
>    // actually do the full sample.
>    nsTPriorityQueue<MilestoneEntry> mMilestoneEntries;
> +
> +  bool mHoldingEntries;

Please shift this up higher to be alongside the other bools, so it has a better chance of packing nicely with another bool.  (You'll need to update the constructor init list accordingly, too, since it needs to be in the same order.)
Comment 14 User image Andrew McCreight [:mccr8] 2016-11-29 14:24:34 PST
Yeah, the weird commit message is because Emacs dies while I was editing it while uploading, so there was some extra meta data in there. Sorry about that. I'll fix up the AutoRestore stuff, I was just being lazy.
Comment 16 User image Daniel Veditz [:dveditz] 2016-11-29 14:27:17 PST
Appears to be public. mccr8 googled _resolve_pe_structures and found this pastebin from yesterday (900+ views): http://pastebin.com/7xZ27haj

Someone received a copy and posted the exploit to the tor-talk mailing list about 30 minutes ago:
https://lists.torproject.org/pipermail/tor-talk/2016-November/042639.html
Comment 17 User image Andrew McCreight [:mccr8] 2016-11-29 14:38:14 PST
This file hasn't changed substantially in 5 years, so I imagine all branches are affected.
Comment 18 User image Ryan VanderMeulen [:RyanVM] 2016-11-29 14:40:49 PST
[Tracking Requested - why for this release]:
Comment 19 User image Andrew McCreight [:mccr8] 2016-11-29 14:40:57 PST
Created attachment 8815483 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer

I need to do more testing (to check if this actually helps in release builds, and to make sure it doesn't break any SVG tests), but I guess it can be reviewed now.
Comment 20 User image Andrew McCreight [:mccr8] 2016-11-29 14:42:34 PST
Created attachment 8815484 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer.
Comment 21 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 14:45:41 PST
*** Bug 1321100 has been marked as a duplicate of this bug. ***
Comment 22 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 14:54:12 PST
Comment on attachment 8815484 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer.

Review of attachment 8815484 [details] [diff] [review]:
-----------------------------------------------------------------

r=me; two nits:

::: dom/smil/nsSMILTimeContainer.cpp
@@ +265,5 @@
>    bool gotOne = false;
>    while (!mMilestoneEntries.IsEmpty() &&
>        mMilestoneEntries.Top().mMilestone == containerMilestone)
>    {
> +    MOZ_RELEASE_ASSERT(!mHoldingEntries);

Let's pull this assert up a few lines, to happen just before the "while" loop.

(It's wasteful & unnecessary to re-check the bool on every loop iteration -- the variable isn't going to change value during the loop, except possibly during a function-call, but then that function call will restore the old value.)

@@ +276,5 @@
>  
>  void
>  nsSMILTimeContainer::Traverse(nsCycleCollectionTraversalCallback* aCallback)
>  {
> +  AutoRestore<bool> saveHolding(mHoldingEntries);

Please add the include for this at the top of this file:
#include "mozilla/AutoRestore.h"
Comment 23 User image Andrew McCreight [:mccr8] 2016-11-29 15:10:03 PST
Created attachment 8815492 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer.
Comment 24 User image Andrew McCreight [:mccr8] 2016-11-29 15:12:50 PST
(In reply to Daniel Holbert [:dholbert] from comment #22)
> Let's pull this assert up a few lines, to happen just before the "while"
> loop.
Fixed.
 
> during a function-call, but then that function call will restore the old
> value.)

Ah, good point, I hadn't thought about that.

> Please add the include for this at the top of this file:
> #include "mozilla/AutoRestore.h"

Fixed.

I didn't see any other uses of Elements() in the SMIL directory so hopefully this pattern is not repeated elsewhere.

try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d58b3d48afc1af2f772ddb92f135563d34c8c3c
Comment 25 User image Tyson Smith [:tsmith] 2016-11-29 15:26:16 PST
Created attachment 8815496 [details]
smil_crash.html

Further reduce test case
Comment 26 User image Andrew McCreight [:mccr8] 2016-11-29 15:35:19 PST
Per dholbert's suggestion, I locally ran Mochitests in content/smil/test/ and dom/svg/test/, reftests in layout/reftests/svg/smil and web platform tests in testing/web-platform/tests/svg/ and they all passed, so this at least doesn't totally break SMIL.
Comment 27 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 15:37:47 PST
Created attachment 8815501 [details]
reduced interactive testcase

(Here's a further-reduced testcase, based on the one Tyson just posted. It won't crash until you click a button.)
Comment 28 User image Ritu Kothari (:ritu) 2016-11-29 15:40:38 PST
Tracked for 50+
Comment 29 User image Daniel Veditz [:dveditz] 2016-11-29 15:43:54 PST
Comment on attachment 8815492 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer.

Let's get this landed on the usual branches at least. Is this ready to land on -release and -esr or do we need more testing?
Comment 30 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 15:45:50 PST
Comment on attachment 8815501 [details]
reduced interactive testcase

>    <animate id="ia" end="50s"></animate>
>    <animate begin="60s" end="ic.end"></animate>

Note: the hardcoded times in the testcase here aren't magic numbers -- they only matter a bit.  Specifically -- for the crash to happen...
 (1) The first animation's end-time must be before the second animation's begin-time.
 (2) You must to call pauseAnimations() [click the button] before the first animation has begun.
Comment 31 User image Andrew McCreight [:mccr8] 2016-11-29 15:48:38 PST
[Security approval request comment]
How easily could an exploit be constructed based on the patch? It is fairly obvious where the problem is. Also an exploit for this is already public.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not really.

Which older supported branches are affected by this flaw? All.

If not all supported branches, which bug introduced the flaw? N/A

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The two patched files haven't changed much in the last year, so backports should be trivial.

How likely is this patch to cause regressions; how much testing does it need? It looks fairly safe to me. My patch causes an intentional crash, but only when we're in danger of hitting memory corruption anyways. Plus I think SMIL isn't a very widely used feature, which reduces the risk.
Comment 32 User image Andrew McCreight [:mccr8] 2016-11-29 15:49:21 PST
(In reply to Daniel Veditz [:dveditz] from comment #29)
> Let's get this landed on the usual branches at least. Is this ready to land
> on -release and -esr or do we need more testing?

Like I said, this code hasn't changed in any substantial way in around 5 years so it should be okay everywhere.
Comment 34 User image Liz Henry (:lizzard) (needinfo? me) 2016-11-29 16:13:05 PST
Comment on attachment 8815492 [details] [diff] [review]
Explicitly guard against reentrance in nsSMILTimeContainer.



[Triage Comment]
Comment 35 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 16:13:59 PST
(In reply to Daniel Holbert [:dholbert] from comment #30)
> Note: the hardcoded times in the testcase here aren't magic numbers -- they
> only matter a bit.  Specifically -- for the crash to happen...
>  (1) The first animation's end-time must be before the second animation's
> begin-time.
>  (2) You must to call pauseAnimations() [click the button] before the first
> animation has begun.

(Sorry -- in (2), I meant "before the first animation's end-time". [It has no explicit begin time.])
Comment 36 User image Andrew McCreight [:mccr8] 2016-11-29 16:16:25 PST
For what it is worth, I think this may have been introduced in bug 474743, which landed in 2010.
Comment 37 User image Liz Henry (:lizzard) (needinfo? me) 2016-11-29 16:24:16 PST
mccr8 and dholbert will land this directly on affected branches. Then I think releng will take care of getting it into the particular release relbranches.
Comment 38 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 16:26:29 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/bbdf2771fc05481d45509d631c1ebf8b589dca49
Comment 39 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 16:34:08 PST
https://hg.mozilla.org/releases/mozilla-beta/rev/c736f46410bfef32acf9f38adaa930a9684b28c1
Comment 40 User image Daniel Veditz [:dveditz] 2016-11-29 16:34:41 PST
Jim, Bob: This 0-day exploit might be a good chance to test the effectiveness of the Windows sandbox, and maybe learn more what we need to do.
Comment 41 User image Daniel Veditz [:dveditz] 2016-11-29 16:37:41 PST
Looks like Thunderbird can hit this code, too: bp-11fc7f68-e391-4b23-9d01-d23702161128 (that's not me, found this UAF in crash-stats). At least no scripting in the mail view so it's not easily wormable.
Comment 43 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 16:50:29 PST
From a "is this going to break any websites" perspective:
 - This is a feature (SMIL) that was never implemented by IE/Edge.
 - Chrome implemented it but is intending to deprecate it.
 - The flaw (and the change) are in syncbase timing, which is a particularly niche sub-feature.
 - SMIL itself is very niche and doesn't get a lot of use on the web.

SO.  I think potentially-breakable-web-content would be hard to find, and our automated test suite is as good or better than looking for real-world stuff on the web, I think (in terms of "will this break anything").
Comment 44 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 16:53:10 PST
(We added comprehensive tests for this sub-feature back in bug 537852. Apparently not quite comprehensive enough to have detected this bug here, but hopefully comprehensive enough to catch any breakage that we might be causing.

Plus, the nature of the patch is that *any* behavior-change will be changing us *away* from doing something unsafe which would probably already crash [and now we'll crash safely instead]. So, not particularly worrisome.)
Comment 45 User image Wayne Mery (:wsmwk, NI for questions) 2016-11-29 17:08:26 PST
Noted for Thunderbird. Thanks
Comment 46 User image Liz Henry (:lizzard) (needinfo? me) 2016-11-29 17:13:40 PST
We will need help smoketesting all affected versions once we have them built, since it is a national holiday in Romania right now (for the next few days). Sounds like we expect to still crash. The crash signature will be different.
Comment 47 User image Wes Kocher (:KWierso) 2016-11-29 17:37:10 PST
FIREFOX_50_0_2_RELBRANCH: https://hg.mozilla.org/releases/mozilla-release/rev/cc272f7d48d3544ffaf242b51430ffcf3a932a29
Comment 48 User image Al Tsai [:atsai] 2016-11-29 23:11:37 PST
After a quick test on 51.0b5 on Mac OS X, Windows 7, and Ubuntu x86_64. We got the same result that all Firefox crashed after clicking the button on reduced interactive testcase
Comment 49 User image Cynthia Tang [:cynthiatang] 2016-11-29 23:23:36 PST
Created attachment 8815620 [details]
Video for Firefox 51.0b5

Test result: https://youtu.be/4vMopBX4mTs
Comment 50 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 23:34:42 PST
Yeah -- a crash is still expected, even in a "good" build with the fix.  The difference is whether the crash is safe or not.  (Per beginning of comment 13, we'd like to avoid crashing at all, but it's simpler to just abort safely in the error condition for now.)

To validate the fix, you should load the interactive testcase and trigger the crash, and then go ahead and submit the crash and check the crash signature. A "bad" crash looks like this:
bp-d49b2ac3-2bec-40b1-b633-b81cf2161130

A "good" crash will look a bit different -- it should at least have a "crash address" of 0x0 or close to that, I think.  I'm not sure exactly how it'll look because I don't have a expected-"good" build handy. Do you have a link to the build that you're testing?
Comment 52 User image Al Tsai [:atsai] 2016-11-29 23:37:33 PST
Comment 51 for the crash report, I am not sure if the patch fixes it yet.
Comment 53 User image Wes Kocher (:KWierso) 2016-11-29 23:40:59 PST
I crashed the linux64 50.0.2 build with the interactive testcase, and here's my crash report: https://crash-stats.mozilla.com/report/index/7b0b3ced-b715-45d3-8dd7-50b3d2161130
Comment 55 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-29 23:57:58 PST
Thanks. Both comment 51 and comment 53 are "good" (safe) crashes. They both have
> MOZ_CRASH Reason 	MOZ_RELEASE_ASSERT(!mHoldingEntries)
...which indicates that we intentionally & safely crashed using one of mccr8's added release-build-asserts.

You can consider the presence of MOZ_RELEASE_ASSERT(!mHoldingEntries) in crash reports as confirmation that the patch has effectively neutralized the problem.
Comment 56 User image Cynthia Tang [:cynthiatang] 2016-11-30 00:02:43 PST
Hi Daniel,
I just tested it on Windows 10 pro (x64). The crash address is "0x7ff9fb1c05c0". I am not sure if we expected that. Thanks.

Crash report https://crash-stats.mozilla.com/report/index/83a20ef1-c91a-4c4b-a334-181de2161130#tab-rawdump
Comment 57 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-30 00:08:39 PST
Thanks, Cynthia.  I believe the presence of "MOZ_RELEASE_ASSERT" in that crash report means all is well. (Additionally, the backtrace confirms that we're crashing on that new assertion line in AddMilestone, which is just before we set ourselves up for trouble.)

I guess MOZ_RELEASE_ASSERT produces a non-null crash address on Windows for some reason; not sure why. But I think that crash report is still fine -- MOZ_RELEASE_ASSERT should be all that we need to look for here.
Comment 58 User image Al Tsai [:atsai] 2016-11-30 01:22:29 PST
Currently, all tests look good to me.

We tested:
- 45.5.1esr
- 50.0.2
- 51.0b5

platform:
- Windows 7 64bit
- Mac OS X 10.12.1
- Ubuntu 14.04 LTS 64-bit

Fix works on all 9 combinations. We also have a glance on web-compatibility and graphic smoke test and didn't find significant issues.
Comment 59 User image Bob Owen (:bobowen) (PTO until 25th Feb) 2016-11-30 01:31:02 PST
(In reply to Daniel Veditz [:dveditz] from comment #40)
> Jim, Bob: This 0-day exploit might be a good chance to test the
> effectiveness of the Windows sandbox, and maybe learn more what we need to
> do.

Indeed, do we have a test case that fully demonstrates the exploit ... in a safe way. :)
Comment 60 User image Frederik Braun [:freddyb] (off until March 6th) 2016-11-30 04:32:45 PST
Yes, Bob, See attachments added in comment 25 and comment 27.
Comment 61 User image Jim Mathies [:jimm] 2016-11-30 04:34:45 PST
(In reply to Bob Owen (:bobowen) from comment #59)
> (In reply to Daniel Veditz [:dveditz] from comment #40)
> > Jim, Bob: This 0-day exploit might be a good chance to test the
> > effectiveness of the Windows sandbox, and maybe learn more what we need to
> > do.
> 
> Indeed, do we have a test case that fully demonstrates the exploit ... in a
> safe way. :)

I'm curious what hackers do once they have access to kernel32. From news articles I've read it sounds like this is being used to fingerprint people using system info. The level 1 (low integrity) sandbox on release isn't going to secure against that. I would love to get the details on what they are doing so we can use these to test the sandbox in the future. If anyone has any additional detail please send it my way.
Comment 62 User image Bob Owen (:bobowen) (PTO until 25th Feb) 2016-11-30 04:40:25 PST
(In reply to Frederik Braun [:freddyb] from comment #60)
> Yes, Bob, See attachments added in comment 25 and comment 27.

Right, but as I understand it, those just trigger the crash.
They don't exploit it to get RCE and then do something interesting with it.
Comment 63 User image Frederik Braun [:freddyb] (off until March 6th) 2016-11-30 04:43:17 PST
Gotcha! According to <https://twitter.com/TheWack0lian/status/803736507521474560>, the actual exploit gets IP/MAC address and sends this to a hardcoded IP address. Safe enough to test in a VM that's offline and can be rolled back?!
Comment 64 User image Carsten Book [:Tomcat] 2016-11-30 06:53:38 PST
https://hg.mozilla.org/mozilla-central/rev/9271347b07d2
Comment 65 User image Andrew McCreight [:mccr8] 2016-11-30 07:28:11 PST
Not that it matters, but the initial landing on m-c was this:
  https://hg.mozilla.org/mozilla-central/rev/adcc39e3cad0f32aba0efb478cc4a023a5dfc43f
I forgot to post the link in this bug and set the flag.
Comment 66 User image Wayne Mery (:wsmwk, NI for questions) 2016-11-30 10:02:57 PST
https://crash-stats.mozilla.com/report/index/a99ccf9b-0064-4d38-a076-6bc8b2161130 Thunderbird 45.5.1 build2, has  	MOZ_RELEASE_ASSERT(!mHoldingEntries)  tested via thunderbrowse addon. So confirmed fixed
Comment 67 User image Kamil Jozwiak [:kjozwiak] 2016-11-30 10:39:08 PST
Using the original poc with the live payload that was added in comment#0, I went through the following verification:

Platforms Used:

* Win 7 Pro SP1 x64 -> PASSED
* Win 7 Pro SP1 x86 -> PASSED
* Win 8 Pro x64 -> PASSED
* Win 8.1 x64 -> PASSED
* Win 10 x86 -> PASSED
* Win 10 x64 -> PASSED

Builds Used:

* fx50.0.2, buildId: 20161129173726, changeset: cc272f7d48d3
* fx45.5.1, buildId: 20161129180326, changeset: f2792b829193

Ensured the following was occurring when FX crashed:

* the crash is occurring under the new assertion that was added into the "nsSMILTimeContainer::AddMilestone" member function
* MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(!mHoldingEntries)
Comment 68 User image Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2016-11-30 10:44:43 PST
Looks good for Fennec Release on my Google Pixel XL:
* Fennec 50.0.2: bp-1af50520-ad59-46cc-9647-e192d2161130 [contains MOZ_RELEASE_ASSERT(!mHoldingEntries)]
* Fennec 50.0.1: bp-384dcb09-9c0a-430e-a960-807f82161130 [does not contain a MOZ_CRASH Reason]
Comment 69 User image Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2016-11-30 10:55:49 PST
Looks good for Fennec Beta on my Google Pixel XL:
* Fennec 51.0b5: bp-6c583b9b-fccb-46cc-b96b-b22912161130 [contains MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(!mHoldingEntries)]
* Fennec 51.0b4: bp-e339411b-a5ae-4c1c-810c-359a22161130 [does not contain a MOZ_CRASH Reason]
Comment 70 User image Daniel Holbert [:dholbert] (vacation, returning 2/27) 2016-11-30 10:59:56 PST
(In reply to Daniel Holbert [:dholbert] from comment #13)
> IMO it'd make sense to take this patch as a way to neuter this attack for
> the moment, and then we can separately examine the (complex) logic around
> the reentrance here in a more measured way.

(Following up on this: I've now filed bug 1321357 on looking into fixing this more thoroughly & preventing any crashes (even safe ones) on testcases like the ones in this bug.)
Comment 71 User image Justin [:JW_SoftvisionQA] 2016-11-30 12:00:38 PST
Looks good on Mac 10.12 and Ubuntu 16.04

* fx50.0.2, buildId: 20161129173726, changeset: cc272f7d48d3
* fx45.5.1, buildId: 20161129180326, changeset: f2792b829193
Comment 72 User image Justin [:JW_SoftvisionQA] 2016-11-30 12:01:43 PST
Marking this a Verified now that all OS's are complete.
Comment 73 User image Yuhong Bao 2016-11-30 18:06:14 PST
Created attachment 8815960 [details]
Another more real world like test case

A more real world like test case with an actual animation. Click "Pause animation 1" to trigger bug.
Comment 74 User image forkest 2016-12-01 05:10:22 PST
I have read the bug description and opened obscuredfiles.com just now, before updating to 50.0.2, the site currently has some strange suspicious message and a gif file. Can somebody from the security team  please confirm me that the gif there is safe and I'm not infected? I know that this particular bug was about SVG, but...
Comment 75 User image Frederik Braun [:freddyb] (off until March 6th) 2016-12-01 05:16:19 PST
The exploit we have seen made the browser crash, after the malicious shell code was executed.
It's likely that you're safe if the browser did not crash.
Comment 76 User image forkest 2016-12-01 06:19:37 PST
Thank you, Frederik. I feel safe now.
Comment 77 User image Ryan VanderMeulen [:RyanVM] 2016-12-01 06:26:24 PST
To be clear for those arriving at this bug after the fact, with the patch that landed in this bug and shipped in Fx50.0.2/ESR45.5.1, Firefox will *still* crash when the exploit attempts to run.

However, it will now do so safely before anything exploitable has a chance to happen. You can verify that the crash happened safely by looking for the "MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(!mHoldingEntries)" annotation in the crash report viewable in about:crashes. Comment 51 has a link to a crash report showing what that looks like.

Bug 1321357 tracks fixing Firefox to avoid crashing at all.
Comment 78 User image Frederik Bovendeerd 2016-12-01 09:53:15 PST
Do I understand correctly, that fixes like this (in general and in this specific case) are implemented in both the new release as well as the most recent beta version?
I cannot find an explicit listing of changes from beta 51.0b4 to 51.0b5, but assume/deduct from what I read in this thread that 51.0b5 is safe, too.
I don't want to downgrade to 50.0.2 not to be vulnerable anymore. ;)

> The exploit we have seen made the browser crash, after the malicious shell code was executed.
So, prior to the fix, the exploit would have allowed anyone to execute code, THEN the browser would crash and now it essentially crashes BEFORE any malicious code can be executed, until a cleaner fix can be found?

Thanks in advance for your answers!
Comment 79 User image Ryan VanderMeulen [:RyanVM] 2016-12-01 10:07:23 PST
(In reply to Frederik Bovendeerd from comment #78)
> Do I understand correctly, that fixes like this (in general and in this
> specific case) are implemented in both the new release as well as the most
> recent beta version?

Correct, 51b5 also contains this fix. As do the latest Nightly and DevEdition builds.
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FIREFOX_51_0b4_RELEASE&tochange=FIREFOX_51_0b5_RELEASE

> So, prior to the fix, the exploit would have allowed anyone to execute code,
> THEN the browser would crash and now it essentially crashes BEFORE any
> malicious code can be executed, until a cleaner fix can be found?

My understanding is that previously the browser would either crash or execute the payload depending on the circumstances. But yes, the fix that landed here makes Firefox crash in a controlled way before it ever gets to that point.
Comment 80 User image Frederik Bovendeerd 2016-12-01 12:10:29 PST
Thanks for the quick reply and the link. This is exactly what I was not able to find.

Concerning crashing or no crashing, your understanding seems to differ from Frederik Braun's. ;)
> The exploit we have seen made the browser crash, after the malicious shell code was executed.
Comment 81 User image Frederik Braun [:freddyb] (off until March 6th) 2016-12-01 23:44:42 PST
Given the attention this bug is getting, I feel things are getting way too complicated for all of our users to understand.
If you are reading this because you want to dig deeper, I'm happy to explain things in #security on irc.mozilla.org.

Here's my response, none the less:

An attacker exploiting this vulnerability can do anything, once they gain control over Firefox. That's how exploits work.
The attack we've seen tried to exfiltrate data and then crashed. A more skilled attacker could just resume normal program execution. But that's not what we have seen and there's no sign someone significantly improved the attack within the 24 hours we took to get the fix out (which is indeed in all branches of Firefox, as Ryan mentioned)

Due to time constraints, the fix we have landed is not the most elegant. In the patched version, Firefox notices that something is wrong and exits abruptly (crashes), which is slightly different from a crash due to undefined behavior. See more about understanding crash reports at <https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Crash_reporting/Understanding_crash_reports>

(My comment 75 was a simplification. Given that the user in comment 74 saw the same GIF as I did (we visited the web page at roughly the same time) and that his browser did not crash, my statement still holds true.)
Comment 82 User image Andrei Vaida, QA [:avaida] – please ni? me 2016-12-02 06:55:19 PST
We've also managed to confirm that this is fixed on 51.0b6-build1 (20161201172143), using the following platforms: Windows 10 x64, Windows 8.1 x86, Mac OS X 10.11.6 and Ubuntu 14.04 x86.

Updating flags accordingly.
Comment 83 User image Frederik Bovendeerd 2016-12-02 11:28:41 PST
(In reply to Frederik Braun [:freddyb] from comment #81)

> An attacker exploiting this vulnerability can do anything, once they gain
> control over Firefox. That's how exploits work.
> The attack we've seen tried to exfiltrate data and then crashed. A more
> skilled attacker could just resume normal program execution. 

Thanks for clarifying that.

I guess it just bothered me that this particular exploit was "featured" on all sorts of IT news sites and blogs and although dozens of people on every site I have found were terribly worried of having been affected by 'it' (as in the exploit with the "original" shell code, that sent IP and MAC address to France), it was never once mentioned that this combination of exploit and executed code would have crashed the browser and if you experienced no such crash, you'd most likely be fine.

In addition it was rarely mentioned that exploit and code are in essence two separate things, as in the exploit being the "backdoor" to FF and by that your system and the code being executed as the "what happens then" part.

It was always along the lines of "careful there is an exploit that sends potentially private data off to God-knows-who and there is no way to figure out if it happened to you, sorry, my bad".
Comment 84 User image NoOp 2016-12-03 17:56:49 PST Comment hidden (offtopic)
Comment 85 User image NoOp 2016-12-03 18:08:49 PST Comment hidden (offtopic)
Comment 86 User image NoOp 2016-12-03 18:29:06 PST Comment hidden (offtopic)
Comment 87 User image NoOp 2016-12-04 11:12:46 PST Comment hidden (off-topic)
Comment 88 User image Al Billings [:abillings] 2016-12-04 16:34:31 PST Comment hidden (off-topic)
Comment 89 User image Frederik Braun [:freddyb] (off until March 6th) 2016-12-05 02:06:48 PST
"MOZ_CRASH Reason 	MOZ_RELEASE_ASSERT(!mHoldingEntries)" means it *is* patched.
Comment 90 User image Richard Barnes [:rbarnes] 2016-12-05 11:52:47 PST
Exodus Intel provided the following feedback on the patch:

=====

Regarding your patch: We have a suggestion (please bear with us because we are not developers).

This boundary condition is flawed and should be addressed.

> 310 	const MilestoneEntry* p = mMilestoneEntries.Elements();
> 311 	#if DEBUG
> 312 	uint32_t queueLength = mMilestoneEntries.Length();
> 313 	#endif
> 314 	while (p < mMilestoneEntries.Elements() + mMilestoneEntries.Length()) {
> 315 	  mozilla::dom::SVGAnimationElement* elem = p->mTimebase.get();
> 316 	  elem->TimedElement().HandleContainerTimeChange();
> 317 	  MOZ_ASSERT(queueLength == mMilestoneEntries.Length(),
> 
> 322 	const MilestoneEntry* p = mMilestoneEntries.Elements();
> 323 	#if DEBUG
> 324 	uint32_t queueLength = mMilestoneEntries.Length();
> 325 	#endif
> 326 	while (p < mMilestoneEntries.Elements() + mMilestoneEntries.Length()) {
> 327 	  mozilla::dom::SVGAnimationElement* elem = p->mTimebase.get();
> 328 	  elem->TimedElement().HandleContainerTimeChange();
> 329 	  MOZ_ASSERT(queueLength == mMilestoneEntries.Length(),

The reason it is flawed is that "p" is set on initial entry, but the boundary check against is invalid once the object changes.

The flaw should be mitigated if you change the logic of the loop and it's related boundary check to account for possible changes to the object during each iteration.
Comment 91 User image Daniel Veditz [:dveditz] 2016-12-05 12:25:30 PST
It's not clear from the formatting above, but the two chunks were supposed to be parallel columns showing the old and new versions -- in other words we did not change the loop that was the source of the problem. Instead this bug provided a stop-gap patch in the lines above that, to crash if there's an attempt to modify the object while it is in that loop.

The "real" fix that rewrites that loop is in bug 1321357 (see comment 70)

Note You need to log in before you can comment on or make changes to this bug.