Last Comment Bug 653238 - nsSMILAnimationController::mDocument is suspicious
: nsSMILAnimationController::mDocument is suspicious
Status: RESOLVED FIXED
[sg:critical?]
:
Product: Core
Classification: Components
Component: SVG (show other bugs)
: Trunk
: All All
: -- normal (vote)
: ---
Assigned To: Daniel Holbert [:dholbert]
:
Mentors:
Depends on: 653270 654015
Blocks:
  Show dependency treegraph
 
Reported: 2011-04-27 14:03 PDT by Olli Pettay [:smaug]
Modified: 2013-12-27 14:19 PST (History)
5 users (show)
dholbert: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
fixed
fixed
-
wanted
unaffected
unaffected


Attachments
patch 1 v1 (3.92 KB, patch)
2011-04-27 16:24 PDT, Daniel Holbert [:dholbert]
bugs: review+
asa: approval‑mozilla‑aurora+
dveditz: approval‑mozilla‑beta+
Details | Diff | Review
(ignore, empty patch) (120 bytes, patch)
2011-04-27 17:08 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Review
patch 2: promote GetRefreshDriverForDoc to an instance method (7.85 KB, patch)
2011-04-27 17:10 PDT, Daniel Holbert [:dholbert]
bugs: review+
asa: approval‑mozilla‑aurora+
dveditz: approval‑mozilla‑beta+
Details | Diff | Review
fix compilation with --disable-smil (688 bytes, patch)
2011-04-29 17:57 PDT, Bartłomiej Brzozowiec (BartZilla)
dholbert: review+
asa: approval‑mozilla‑aurora+
dveditz: approval‑mozilla‑beta+
Details | Diff | Review

Description Olli Pettay [:smaug] 2011-04-27 14:03:22 PDT
mDocument is type nsIDocument* and once it is set to some value
it is never cleared, if I read the code correctly.
That might lead to sg:critical crashes if someone managed to
access mDocument once the object is deleted.
Comment 1 Daniel Holbert [:dholbert] 2011-04-27 14:16:37 PDT
(In reply to comment #0)
> mDocument is type nsIDocument* and once it is set to some value
> it is never cleared

Right -- once a nsSMILAnimationController is created, it's tied to a specific document (it's owned by that document) - the idea is it should be torn down when the document is torn down.

There shouldn't be any way for a nsSMILAnimationController to outlive its document -- I don't think anyone else retains an (owning) pointer to it.  A quick search for nsSMILAnimationController confirms that:
http://mxr.mozilla.org/mozilla-central/search?string=nsSMILAnimationController

Given that, perhaps nsIDocument should have a nsAutoPtr<nsSMILAnimationController> rather than a nsRefPtr, to make the ownership/lifetime explicit.
Comment 2 Daniel Holbert [:dholbert] 2011-04-27 14:19:20 PDT
(In reply to comment #1)
> I don't think anyone else retains an (owning) pointer to it.  A
> quick search for nsSMILAnimationController confirms that:

(Sorry, I meant to soften that "confirms" to a less-strong word, as of course someone could own a pointer via one of nsSMILAnimationController's superclasses.)

And that reminds me that we're using nsRefPtr (not nsAutoPtr) explicitly because that's required in order to be an effective subclass of nsARefreshObserver. (since registering to observe refreshes involves incrementing the refcount)
Comment 3 Olli Pettay [:smaug] 2011-04-27 14:23:05 PDT
There could be a Disconnect() method which sets mDocument to null.
And nsDocument could call that.
Comment 4 Daniel Holbert [:dholbert] 2011-04-27 14:43:19 PDT
...and the only nsSMILAnimationController superclass that supports AddRef/Release is nsARefreshObserver, so the only other thing (besides the document) to retain a reference to an animation controller is a nsRefreshDriver.

dbaron tells me that a nsRefreshDriver *can* outlast its document if that document is an iframe, since we essentially only create *one* nsRefreshDriver for the topmost document in a given tab, and then use that same nsRefreshDriver for subdocuments.

So if we were to stay attached to a nsRefreshDriver beyond when our document died, that would be bad.

However, during document teardown, we'll always(?) hit nsDocument::OnPageHide, which calls mAnimationController->OnPageHide(), which calls Pause(), which calls StopSampling(), which unregisters us from the refresh driver.

(Note that script shouldn't be able to unpause us at this point, because this is a "PAUSE_PAGEHIDE"-type Pause, whereas scripted pausing has a different type, and we only resume animation when all types of pauses' flags are cleared)

So -- this means that by the time our document dies, its AnimationController will no longer be observing refreshes, so the dying document will be the last owner of the nsSMILAnimationController (and will kill it off when its nsRefPtr goes away).
Comment 5 Daniel Holbert [:dholbert] 2011-04-27 14:43:58 PDT
(In reply to comment #3)
> There could be a Disconnect() method which sets mDocument to null.
> And nsDocument could call that.

Per comment 4, this shouldn't be necessary right now, but I'd still be happy with making that change for sanity/safety.
Comment 6 Olli Pettay [:smaug] 2011-04-27 14:50:36 PDT
Yes, I'd like to see something like Disconnect().
It should make using mDocument future-proof.
Raw pointers have caused way too many sg:critical problems.
Comment 7 Daniel Holbert [:dholbert] 2011-04-27 15:17:34 PDT
Sure, agreed. I'll whip up something here - taking.
Comment 8 Daniel Holbert [:dholbert] 2011-04-27 16:24:08 PDT
Created attachment 528740 [details] [diff] [review]
patch 1 v1
Comment 9 Daniel Holbert [:dholbert] 2011-04-27 16:33:51 PDT
Comment on attachment 528740 [details] [diff] [review]
patch 1 v1

This patch layers on top of bug 653270.

Notes on patch:
 - I shifted a StopSampling() call from the destructor into Disconnect, since StopSampling() needs a handle to the document in order to look up the nsRefreshDriver.)

 - My Disconnect() method asserts that there aren't any outstanding owning references, besides our document's.  (which means the controller should be about to die).  In the unexpected case where that check fails, though, I've added null-checks to all other uses of mDocument. (the first inside of GetRefreshDriverForDoc, and the other inside of nsSMILAnimationController::DoSample)

(I'm going to post a followup patch to simplify GetRefreshDriverForDoc() into a protected instance method rather than a static helper, since every caller passes the same value -- mDocument -- to it.)
Comment 10 Daniel Holbert [:dholbert] 2011-04-27 17:08:33 PDT
Created attachment 528751 [details] [diff] [review]
(ignore, empty patch)

This second patch just promotes the static helper-method GetRefreshDriverForDoc() to an instance method, so that its callers don't all have to pass mDocument to it.

This patch doesn't make any functional changes - it's just a straight substitution.

After this second patch, the only mentions of mDocument are in Disconnect (which clears it), and in DoSample & GetRefreshDriver (which both have a sanity-null-check, added in previous patch.)

After this cleanup, this hopefully makes it clearer that, while we don't expect mDocument to be null, we'll now be ok if it does end up being that way.
Comment 11 Daniel Holbert [:dholbert] 2011-04-27 17:10:01 PDT
Created attachment 528753 [details] [diff] [review]
patch 2: promote GetRefreshDriverForDoc to an instance method

(sorry, previous version was empty; forgot to qref)
Comment 12 Daniel Holbert [:dholbert] 2011-04-28 01:08:46 PDT
These patches passed TryServer, btw:
http://tbpl.mozilla.org/?tree=Try&rev=0e2ab7795345
Comment 14 Daniel Holbert [:dholbert] 2011-04-28 13:16:26 PDT
With smaug's consent, I'm opening up bug, since there's not anything that we think is actually potentially scary here. (See comment 1, comment 2, comment 4)  This bug's patch was just for sanity / extra-caution / future-proofing.
Comment 15 Bartłomiej Brzozowiec (BartZilla) 2011-04-29 17:57:08 PDT
Created attachment 529251 [details] [diff] [review]
fix compilation with --disable-smil

(In reply to comment #13)
> Landed:
>  http://hg.mozilla.org/mozilla-central/rev/46eb35887fd5
>  (...)

     1.1 -- a/content/base/src/nsDocument.cpp
     1.2 +++ b/content/base/src/nsDocument.cpp
     ...
    1.11  
    1.12 +  if (mAnimationController) {
    1.13 +    mAnimationController->Disconnect();
    1.14 +  }
    1.15 +

I guess it should be #ifdef-ed MOZ_SMIL, as in other cases in this file where mAnimationController is used. Otherwise compilation fails with --disable-smil. Patch attached that fixes this.
Comment 16 Daniel Holbert [:dholbert] 2011-04-29 18:04:12 PDT
Comment on attachment 529251 [details] [diff] [review]
fix compilation with --disable-smil

oops, thank you! r=me.
Comment 17 Daniel Holbert [:dholbert] 2011-04-29 18:19:23 PDT
Landed --disable-smil build fix on Bartłomiej's behalf:
  http://hg.mozilla.org/mozilla-central/rev/a008a3fda2b5
Comment 18 Daniel Holbert [:dholbert] 2011-05-02 10:58:32 PDT
(In reply to comment #14)
> With smaug's consent, I'm opening up bug, since there's not anything that we
> think is actually potentially scary here. (See comment 1, comment 2, comment 4)

Re-hiding, as Jesse seems to have found a way of poking a hole in the logic from comment 1/2/4.
Comment 19 Daniel Holbert [:dholbert] 2011-05-02 10:58:51 PDT
(that being bug 654015)
Comment 20 Daniel Holbert [:dholbert] 2011-05-03 23:29:02 PDT
Per bug 654015 comment 24: While we're not sure whether that this bug exposes us to exploitable issues, bug 654015 demonstrates that one-off document-teardown bugs can cause semi-scary situations where this bug's protections would help us sleep better.

So, I'd advocate landing the following on Aurora for Firefox 5:
 - bug 653270 (minor not-really-functional cleanup), which this bug stacks on top of per comment 9
 - this bug here's 2 patches, plus BartZilla's --disable-smil build-fix
 - bug 654015's 2 patches (once they get review)

This technically affects Firefox 4, too, but last I heard we aren't doing any more releases on that branch (and even if we were, I'm not sure this is quite scary enough to justify landing on a release branch yet).

I won't request approval until bug 654015 is reviewed, and then I intend to request approval for all the aforementioned patches at once.  I've already verified that they all apply cleanly to Aurora.
Comment 21 Daniel Holbert [:dholbert] 2011-05-12 14:50:14 PDT
Comment on attachment 528740 [details] [diff] [review]
patch 1 v1

Ok, requesting approval to land this bug's patches & bug 654015's patches on aurora (as described in comment 20)
Comment 22 Asa Dotzler [:asa] 2011-05-18 13:04:18 PDT
dholbert, can you tell us what kind of risk we've got in taking this patch and what testing has been done to give us confidence that this won't hurt us on Beta?
Comment 23 Daniel Holbert [:dholbert] 2011-05-19 14:46:57 PDT
(In reply to comment #22)
> dholbert, can you tell us what kind of risk we've got in taking this patch

The change here is pretty minimal and pro-security -- basically, when nsDocument is destroyed, this patch forcibly clears nsSMILAnimationController's raw pointer to it, to prevent usages-after-free.

(We fully expect that the nsSMILAnimationController shouldn't be using that raw pointer anymore anyway, per my logic in comment 4.  But in case it tries to, now it won't be possible.)

The only anticipated effect here is to convert theoretical use-after-free bugs into no-ops. (since we null-check the pointer and bail out if it's null.)

Bug 653238 initially looked like it was one such use-after-free bug, but it turned out it was just a use-during-destruction-but-before-the-actual-free-occurs.

> and what testing has been done to give us confidence that this won't hurt us
> on Beta?

Weeks of baking on m-c.

There are no specific testcases associated with this bug, since this is a defense against a theoretical threat that we can't think of any ways to actually trigger at this point.
Comment 24 Daniel Veditz [:dveditz] 2011-05-19 14:51:40 PDT
Comment on attachment 529251 [details] [diff] [review]
fix compilation with --disable-smil

Approved for mozilla-beta, a=dveditz for release-drivers
Comment 25 Asa Dotzler [:asa] 2011-05-19 15:15:05 PDT
Comment on attachment 529251 [details] [diff] [review]
fix compilation with --disable-smil

Please land these changes on both Aurora and Beta. (In the future, getting changes in during Aurora will save you this extra step.)
Comment 26 Daniel Holbert [:dholbert] 2011-05-20 01:43:11 PDT
Sorry -- I forgot that this bug's patches stack on top of bug 653270 (as noted in comment 9).  That bug is just code-cleanup with no functional change -- I filed it in order to make the patches here simpler.  /me requests approval over there...

Note You need to log in before you can comment on or make changes to this bug.