Require callers to hold strong references to refcounted parameters

NEW
Unassigned

Status

()

enhancement
4 years ago
2 years ago

People

(Reporter: ayg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

== Problem ==

Functions that accept refcounted types conventionally take a raw pointer, T*.  They assume that the pointer will remain valid across the life of the function, but they might do something that indirectly releases it, like this:

  void
  DoStuff(nsINode* aNode)
  {
    RemoveFromParent(aNode);
    DoStuffWithNode(aNode);
  }

If the only reference to aNode was held by its parent, this is a potentially sec-critical use-after-free.  This kind of bug has occurred at least once in editor code that I know of, where a maliciously-constructed page caused a possibly-exploitable crash by using mutation events to unexpectedly release a node.

To prevent this kind of vulnerability, callees that take actions that might release a parameter should be able to signal that the caller must hold a strong reference, and the compiler should enforce that requirement as much as possible.  This could be done by using new types for the parameters if possible, not static analysis, so bad code will fail to compile even for those who do not run static analysis.  If possible, it should be done in a way that does not cause unnecessary addref/release and does not compromise binary compatibility.

== Solution #1: nsCOMPtr<T> ==

The most obvious solution is to just make function parameters nsCOMPtr<T> (or nsRefPtr<T>, etc.) instead of T*.  Then the callee will take its own reference.  However, I've been told that this many addrefs/releases is totally unacceptable for performance.

== Solution #2: const nsCOMPtr<T>& ==

This has a few disadvantages:

1) It presumably has worse performance, because a pointer-to-pointer is passed instead of a pointer.

2) It is not binary-compatible with T*.  (Ehsan thinks this is a major issue, see bug 1179451 comment 43.)

3) If there's a mismatch between the type of the parameter and the type of smart pointer being passed, it will addref unnecessarily to convert.

4) It will addref silently if passed a raw pointer, so programmers might not notice the performance cost of the extra addref.

5) If the caller passes a non-local nsCOMPtr<T>, it might be inadvertently modified over the course of the callee's execution, which could result in a bug in the callee.

It has two major advantages:

1) It does not introduce new types.

2) It does not require any extra work by the caller.  The caller can pass a raw pointer and it will be silently addrefed.  (This is the flip-side of disadvantages 2 and 4.)

== Solution #3: new type, RefParam ==

A new type is proposed, called RefParam.  It is similar to a raw pointer, and is binary-compatible with a raw pointer -- it does not ever addref or release, and its constructor or destructor are trivial.  But it can be constructed only from a smart pointer or another RefParam, not a raw pointer.  If the variable it's constructed from is a local variable, it is guaranteed that a strong reference is held over the life of the function.

However, it provides no protection against smart pointers in members.  I think this problem can be left out of the scope of this bug and dealt with in a follow-up.  Note that solution #2 also doesn't deal with members properly (disadvantage 5).

Some functions currently take references to refcounted types instead of pointers.  For these functions, a variant NonNullRefParam is proposed, which can only be constructed from OwningNonNull.

== Functions that want to accept raw pointers ==

Some callers do not have a strong reference to an object, but want to call a function on it anyway.  This is legitimate as long as the function is certain not to release it, e.g.:

  if (node->Contains(otherNode->GetFirstChild())) {

GetFirstChild() returns a raw pointer (and for performance, we might not want to change that).  It is probably easy to verify by code inspection that Contains() cannot release anything.  Thus requiring an extra addref/release here is pointless.  The obvious solution is that functions like Contains() should be allowed to keep taking raw pointers, if performance is more important than the slightly greater risk of bugs.

There is an issue with this: smart pointers will not necessarily convert implicitly to raw pointers anymore (see bug 1179451, bug 1193298, bug 1193762).  There is obviously no reason callers should be prohibited from passing smart pointers to functions that take raw pointers.  I solved this problem in a patch to bug 1179451.  But it has nothing whatsoever to do with this bug, so I will leave it to a different bug.
Posted patch PatchSplinter Review
This consists basically of the latest version of bug 1179451 part 4 option 2 with requested revisions, with patches 9, 10, and 11 from bug 1193762 merged in.
Attachment #8647485 - Flags: review?(nfroyd)
(In reply to :Aryeh Gregor (working until August 13) from comment #0)

> == Solution #3: new type, RefParam ==
> 
> A new type is proposed, called RefParam.  It is similar to a raw pointer,
> and is binary-compatible with a raw pointer -- it does not ever addref or
> release, and its constructor or destructor are trivial.  But it can be
> constructed only from a smart pointer or another RefParam, not a raw
> pointer.  If the variable it's constructed from is a local variable, it is
> guaranteed that a strong reference is held over the life of the function.

It's worth noting that this only applies if the local is const or the address of the member does not escape before the call.

e.g.

{
   nsRefPtr<F> f = new F;
   q(f, [&f] { f = nullptr; });
}

void q(RefParam<F> f, function r)
{
   r();
   f->doStuff();
}
Comment on attachment 8647485 [details] [diff] [review]
Patch

Review of attachment 8647485 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/RefParam.h
@@ +50,5 @@
> +    static_assert(mozilla::IsBaseOf<T, U>::value, "U is not a subclass of T");
> +  }
> +
> +  template<class U>
> +  MOZ_IMPLICIT NonNullRefParam(const NonNullRefParam<U>& aOther)

This should already suppress the automatic copy and move constructors.  If you would like me to default one or both of them anyway, let me know.

@@ +63,5 @@
> +  T& get() const { return mRef; }
> +
> +  // No assignment allowed
> +  NonNullRefParam<T>& operator=(const NonNullRefParam<T>&) = delete;
> +  NonNullRefParam<T>& operator=(const NonNullRefParam<T>&&) = delete;

Note that deleting the copy constructor should effectively delete the move constructor as well, AFAIK.  I deleted the move constructor explicitly as well because you asked me to.  Likewise elsewhere that I deleted or defaulted the move constructor when the copy constructor already was -- AFAIK this has no effect.

Also, when defaulting and deleting things that have no effect, I didn't do all the const/non-const variants.  Let me know if you want me to.

@@ +118,5 @@
> +    static_assert(mozilla::IsBaseOf<T, U>::value, "U is not a subclass of T");
> +  }
> +
> +  // Explicit default to ensure binary compatibility with T*, per bug 767178
> +  // comment 35 (this means we don't get a static_assert here)

The parenthesized bit is wrong -- this isn't templated over U, so we don't need a static_assert anyway.  Consider it stricken.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #2)
> It's worth noting that this only applies if the local is const or the
> address of the member does not escape before the call.

Good point!  I'll add an XXX about that too.  As with members, this means that this isn't a perfect solution, but I think it doesn't detract much from the value.  Practically speaking I don't think we have a lot of code that uses addresses to smart pointers for anything other than out params, which shouldn't be an issue unless you're passing the same pointer to the same function as both an in and out param, in which case you deserve whatever you get.
Comment on attachment 8647485 [details] [diff] [review]
Patch

Review of attachment 8647485 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/RefParam.h
@@ +120,5 @@
> +
> +  // Explicit default to ensure binary compatibility with T*, per bug 767178
> +  // comment 35 (this means we don't get a static_assert here)
> +  MOZ_IMPLICIT RefParam(const RefParam<T>&) = default;
> +  MOZ_IMPLICIT RefParam(const RefParam<T>&&) = default;

It turns out the parameter type for an explicitly-defaulted move constructor must not be const, according to Clang.  So consider the "const" gone from this last line.
Summary: Require callers to hold strong references to recounted parameters → Require callers to hold strong references to refcounted parameters
Flags: needinfo?(nfroyd)
Comment on attachment 8647485 [details] [diff] [review]
Patch

Review of attachment 8647485 [details] [diff] [review]:
-----------------------------------------------------------------

I still stand by my bug 1179451 comment 48.  I am also leery of introducing yet more types into our (already somewhat complicated) refcounted smart-pointer system and having inconsistent rationales on when to use our templates vs. normal C++ syntax.

All the literature I've been able to find (Chromium's smart pointer docs; WebKit's smart pointer docs; Libre Office uses reference-counted pointers, but they seem to use them in very limited places; C++ "guidelines") doesn't seem to care about the specific case of easily passing smart pointer temporaries to raw pointer arguments, or ensuring that people are holding strong references.  (Probably because the last one is very hard to enforce...)  If you have other pointers, I would be most happy to read through them.

I'm also a little skeptical that most of the changes required seem to be inside editor/; it's not clear to me that editor/ represents "the Gecko Coding Way", if any such thing exists.  Is editor/ forward-thinking in its common usage of nsCOMPtr return values + passing them as raw pointer arguments and thus a forerunner of what Gecko style will look like once we decide smart pointer return values are OK?  Or is it an aberration, and adding some temporaries or refactoring would eliminate the requirement for a pervasive addition to our smart pointer machinery?  I'm not really sure.
Attachment #8647485 - Flags: review?(nfroyd)
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #6)
> I still stand by my bug 1179451 comment 48.  I am also leery of introducing
> yet more types into our (already somewhat complicated) refcounted
> smart-pointer system and having inconsistent rationales on when to use our
> templates vs. normal C++ syntax.

I agree that the extra complexity is bad.  I don't think that there's any other way to get the safety guarantees without it.

> All the literature I've been able to find (Chromium's smart pointer docs;
> WebKit's smart pointer docs; Libre Office uses reference-counted pointers,
> but they seem to use them in very limited places; C++ "guidelines") doesn't
> seem to care about the specific case of easily passing smart pointer
> temporaries to raw pointer arguments, or ensuring that people are holding
> strong references.  (Probably because the last one is very hard to
> enforce...)  If you have other pointers, I would be most happy to read
> through them.

They presumably just have the safety problems that the idea was to avoid.  We also didn't have RefParam until now -- and having it would have prevented at least a couple of sec-critical bugs.  I'm not the one to make the call about whether those bugs are worth the added complexity of RefParam, but at this point I do think it's the only practical way anyone's suggested to prevent them.  We regularly add more static compiler safety checks to Mozilla, and just because other projects haven't done so doesn't mean we shouldn't, IMO.

> I'm also a little skeptical that most of the changes required seem to be
> inside editor/; it's not clear to me that editor/ represents "the Gecko
> Coding Way", if any such thing exists.  Is editor/ forward-thinking in its
> common usage of nsCOMPtr return values + passing them as raw pointer
> arguments and thus a forerunner of what Gecko style will look like once we
> decide smart pointer return values are OK?  Or is it an aberration, and
> adding some temporaries or refactoring would eliminate the requirement for a
> pervasive addition to our smart pointer machinery?  I'm not really sure.

Having worked on fixing up the relevant code, I'm absolutely sure that if functions return nsCOMPtr<T> (which they should IMO), you will need either RefParam, lots of .get()s, or some other solution that I don't know about.  I don't think using .get() a lot is a good solution.  I think I read someone say once that WebKit doesn't have an implicit conversion and they just use .get() everywhere.

So in summary, I think there are four possibilities on the table:

1) Avoid returning nsCOMPtr from functions, and use .get() where necessary.  This should be perfectly manageable, just then we don't get to return nsCOMPtr from functions, which results in awkwardness or extra addrefs/releases, as discussed elsewhere.

2) Use lots of .get().  I don't think this is a good idea.  .get() is dangerous and should be relatively uncommon.

3) Leave nsCOMPtr rvalues implicitly converting to T*.  As discussed, this is unsafe.

4) Introduce RefParam.

If you're not comfortable with RefParam, I think (1) is the best way forward for now.
Assignee: ayg → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.