Add static check against RefPtr(this) inside constructor
Categories
(Developer Infrastructure :: Source Code Analysis, task)
Tracking
(firefox-esr115 wontfix, firefox122 wontfix, firefox123 wontfix, firefox124 fixed)
People
(Reporter: saschanaz, Assigned: saschanaz)
References
Details
(Keywords: sec-want, Whiteboard: [post-critsmash-triage][adv-main124-])
Attachments
(3 files, 1 obsolete file)
The ultimate goal is to remove all this
but this is the first step.
Assignee | ||
Comment 1•2 years ago
|
||
Comment 2•2 years ago
|
||
Thanks for doing this! I had a WIP of a similar checker, but hadn't gotten around to posting it yet. I'll post my WIP here for reference.
The big difference with my approach was that instead of rejecting just RefPtr{this}
, I rejected all references to this
not in a member access in the constructor, so passing to FunctionWhichTakesAReference(this)
would also lead to an error, as it could potentially take a reference.
My approach is certainly more aggressive than yours, but I think it's where we'd want to end up eventually. It might be worth seeing how many more cases my approach catches in a trial run than yours does.
FWIW I don't think we'll want to land a checker like this which points out potential security bugs in the tree until we've audited all of the warnings and added the opt-out mechanism to them.
Comment 3•2 years ago
|
||
This is very much an early WIP which prevents writing this
in
refcounted constructors explicitly. The intention was that this would be
expanded with an opt-out mechanism like
MOZ_ALLOW_REFCOUNTED_THIS_IN_CTOR(this)
so we could adapt existing
uses in the codebase as we come across them and land this as an actual
blocking lint.
Updated•2 years ago
|
Assignee | ||
Comment 4•2 years ago
|
||
Used your WIP patch and some script to dedupe from the log messages. (Also filtered out incorrect messages caused by mere member use since that includes an implicit this->
. Not sure why unless(hasParent(memberExpr()))
wasn't enough to prevent this.)
That gave 798 places in the tree, including only the cases where this
is used directly inside the constructor, which excludes cases where the constructor calls this-using methods.
Notable common cases are:
- Member initializations (241 cases matching
^\s*m[A-Z]\w+.*this
) - Loggers (122 cases including the text
LOG
) - Adding observers (94 cases including
Observer(this
, although this overlaps a bit with the member inits) HoldJSObjects(this)
(86 cases)- Singleton inits (30 cases matching
\s+[gs][A-Z]\w+ = this
)
Assignee | ||
Comment 5•2 years ago
|
||
(Tried normalizing the path to further dedupe things, although the effect is minimal)
Comment 6•2 years ago
|
||
(In reply to Kagami [:saschanaz] from comment #4)
- Member initializations (241 cases matching
^\s*m[A-Z]\w+.*this
)
I imagine these are quite varied, as they aren't all mSelf = this
or similar, they probably are also other more complex forms. Some might be OK and others probably won't be. I expect we'll want to be annotating them.
- Loggers (122 cases including the text
LOG
)
It might be worth finding some way to detect if the this
value is just being used in a log statement and exempt it. Not sure how hard that'd be to do, but we'd probably need to check if the ancestor call is to mozilla::detail::log_print
(https://searchfox.org/mozilla-central/rev/7a4c08f2c3a895c9dc064734ada320f920250c1f/xpcom/base/Logging.cpp#58).
- Adding observers (94 cases including
Observer(this
, although this overlaps a bit with the member inits)
These are probably all scary, as the observer service does require the ability to take a reference to the type. Even if aHoldWeak
is passed in as a flag, in order to take a weak reference we first need to QI to nsISupportsWeakReference
(https://searchfox.org/mozilla-central/rev/7a4c08f2c3a895c9dc064734ada320f920250c1f/xpcom/base/nsWeakReference.cpp#90-91), which will take and then drop a strong reference to the object.
Fortunately for us, while calls like this are a bit scary, they currently generally won't cause issues. The observer service doesn't take any reference to the passed-in observer until nsMaybeWeakPtrArray::AppendWeakElement
(https://searchfox.org/mozilla-central/rev/7a4c08f2c3a895c9dc064734ada320f920250c1f/xpcom/base/nsMaybeWeakPtr.h#75-77), which is the last step of registering an observer, and is after all of the error cases which could cause an early return.
If this ends up being a large chunk of the callers, we could potentially opt out specifically calls to nsIObserverService::AddObserver
, assuming that the last argument is false
(not holding weak).
HoldJSObjects(this)
(86 cases)
We could potentially add an opt-out for HoldJSObjects
, as for most cases here this is OK. We'd probably want to lint for improper uise of HoldJSObjects through a different mechanism.
- Singleton inits (30 cases matching
\s+[gs][A-Z]\w+ = this
)
These are definitely a bit sketchy though I expect in many cases we'll be able to get away with it? It really depends on how the code is working under the hood. Given there are only 30 cases, ideally we probably just stop with this pattern and clean those cases up.
Looking through the list a bit I see a few sketchy places as well as some ones which are a bit scarier than others. Something like https://searchfox.org/mozilla-central/rev/7a4c08f2c3a895c9dc064734ada320f920250c1f/dom/xul/MenuBarListener.cpp#52-55 is probably a risk because we definitely take a strong reference when claiming an event listener, and I believe we do it before doing the core logic (as we want to unify WebIDL and XPCOM event listener codepaths), meaning that we probably can release it on error code paths (like https://searchfox.org/mozilla-central/rev/7a4c08f2c3a895c9dc064734ada320f920250c1f/dom/events/EventListenerManager.cpp#214).
Other callers like https://searchfox.org/mozilla-central/rev/7a4c08f2c3a895c9dc064734ada320f920250c1f/dom/media/mediacontrol/MediaController.cpp#94-105 looks terrifying, but turns out to actually use a raw pointer under the hood for the Connect
callback. This will definitely be broken if that ever changes though.
Overall even though this is 700+ cases, it's a more approachable list than I was expecting and I think it would actually be possible to audit all of them and add annotations to the ones which are OK (with a couple of extra exemption cases most likely)
Assignee | ||
Comment 7•2 years ago
|
||
Oops, somehow I missed your comment. Thanks for taking a look!
Overall even though this is 700+ cases, it's a more approachable list than I was expecting and I think it would actually be possible to audit all of them and add annotations to the ones which are OK (with a couple of extra exemption cases most likely)
Note that this only covers passing this
directly from the constructor. Calling such methods from the constructor is not currently covered and may be even harder to change. But I do think that managing those 700+ cases first would be a great first step.
Assignee | ||
Comment 8•2 years ago
•
|
||
Calling such methods from the constructor is not currently covered
Hi Andi, do you perhaps have some idea how to track methods that are called by constructors in a clang plugin? Do we have some existing example? Thanks!
Edit: Perhaps I can just match to all methods and return early if it's not called in constructors?
Comment 9•2 years ago
•
|
||
I'm not sure about this because you will hit a huge number of matches, this will be very expensive, for FunctionCall, even if you grain out from the matcher it will still be very slow.
I would match Constructors and in that context I would also match CxxMemberCallExpr or CallExpr.
Assignee | ||
Comment 10•2 years ago
•
|
||
Hello people, a couple of months passed and I think it's time to revisit.
Given that it would be complex to block the constructor from accessing raw this
via methods per comment #9, I'd like to try proceeding with Randell's idea that we should just prevent calling any method at all on the constructors of refcounted classes, which sounds easier to implement a linter for that (except it'll require quite some refactoring, which sounds fair enough to me)
Assignee | ||
Comment 11•2 years ago
•
|
||
Oh no, submitted too early.
Nika and Randell, what do you think about comment #9? Any objections or some different suggestions?
Comment 12•2 years ago
|
||
I forget exactly what I said months ago - can you quote it here?
Can you give an example of the refactoring that will be needed?
Assignee | ||
Comment 13•2 years ago
|
||
Sure, it's https://phabricator.services.mozilla.com/D183286#6061770.
I think Kagami is correct; Foo::Foo() { Bar(); } and Foo::Bar() { RefPtr<Foo> self(this); } will cause this same sort of problem -- but it's very hard to statically analyze I imagine. Barring 'this' in a constructor and adding a MOZ_ASSERT(mRefCnt == 0) at the end of every constructor would do it... at a cost. Too bad there isn't a way to have every refcounted object return an already_AddRefed<> value, and bump to 1 in the mRefCnt constructor. If there was any way to force something to run at the "end" of the constructor it could check for a mRefCnt of 0 (ignoring the snowwhite issue). We could also create with a refcnt of 1, and then wrap every constructor of a refcounted object in something that converts that into an already_AddRefed<> or some such... (just noodling).
Assignee | ||
Comment 14•2 years ago
•
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #12)
Can you give an example of the refactoring that will be needed?
I mean we'd have to move some function calls in the constructors to a separate init function or get a static factory function. (... which is partially already done.)
Comment 15•2 years ago
|
||
Agreed then, that seems like a workable way forward to get some protection
Comment 16•2 years ago
|
||
I'm a bit concerned that trying to avoid putting initalization which may invoke function calls into constructors goes against the grain of C++ language design. It would mean, for example, that we can't use the constructor initializer list to initialize some fields (those whose initializers involved a function call), and may have cascading effects on other types (e.g. the type of a field may now need to be default-constructible when it wasn't before). It also introduces the possibility of objects being a "constructed but Init()
not yet called" state that more places need to check for.
Have we ruled out other potential approaches, like the one discussed in this comment which would ensure the refcount is 1 going into the constuctor and therefore we don't need to restrict the operations that can occur in the constructor body?
Comment 17•2 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #16)
Have we ruled out other potential approaches, like the one discussed in this comment which would ensure the refcount is 1 going into the constuctor and therefore we don't need to restrict the operations that can occur in the constructor body?
Doing a bulk refactor like this for all reference counted objects would likely be an impossibly significant change. We'd need to re-write every single line of code which invokes new T()
for a refcounted type T
to use some other mechanism like MakeRefCounted<T>
, which would also require making every reference counted constructor public. We'd also likely need to change every instance of custom reference counting to handle this new behaviour, which could be quite disruptive and may break things. It would certainly be nicer in many ways if our reference counts started at 1
, but I am skeptical that a full refactoring like this is possible.
I have looked a bit into ways to do a similar refactoring incrementally. It may be possible for us to copy Chromium's approach with adopting initial references. Chromium uses a special StartRefCountFromOneTag
, which uses an additional debug-only boolean in reference counted types, and an extra method. Effectively, their MakeRefCounted<T>
equivalent clears needs_adopt_ref_
for these types instead of calling AddRef
, and it's a debug assertion if needs_adopt_ref_
hasn't been cleared and then AddRef
is called (i.e. reference counting within the constructor is still prohibited, but is debug checked).
It is worth noting that while Chromium added StartRefCountFromOneTag
in 2017 (https://source.chromium.org/chromium/chromium/src/+/65f39693c549f42146b26cd35b13fa6a94fa0744), it hasn't been used thoroughly in the time since then, with REQUIRE_ADOPTION_FOR_REFCOUNTED_TYPE()
appearing to only have 28 references. I expect that a transition like this would be hard & slow, even if we could do it incrementally.
(In reply to Kagami [:saschanaz] (they/them) from comment #10)
Given that it would be complex to block the constructor from accessing raw
this
via methods per comment #9, I'd like to try proceeding with Randell's idea that we should just prevent calling any method at all on the constructors of refcounted classes, which sounds easier to implement a linter for that (except it'll require quite some refactoring, which sounds fair enough to me)
As I believe I mentioned to you on matrix a little while ago, I think preventing all method calls is too aggressive. We don't need to start by creating a 100% foolproof static analysis, we're just trying to catch simple places where we're doing something wrong. Something like calling another method which loops around & takes a reference will be hard to analyze and it's probably OK if we don't catch it in a basic analysis.
I think just blocking trivial cases where someone passes a bare this
as an argument to a function should probably be sufficient as a goal for this bug.
Assignee | ||
Comment 18•2 years ago
|
||
(In reply to Kagami [:saschanaz] (they/them) from comment #10)
As I believe I mentioned to you on matrix a little while ago, I think preventing all method calls is too aggressive. We don't need to start by creating a 100% foolproof static analysis, we're just trying to catch simple places where we're doing something wrong. Something like calling another method which loops around & takes a reference will be hard to analyze and it's probably OK if we don't catch it in a basic analysis.I think just blocking trivial cases where someone passes a bare
this
as an argument to a function should probably be sufficient as a goal for this bug.
In that case I think the current patch can already be a start point, we can then go your approach (block this
from all constructors of refcounted classes), Randell's approach (block method calls from such constructors), or something else.
Comment 19•2 years ago
|
||
![]() |
||
Comment 20•2 years ago
|
||
Backed out for Constructor related sm bustage:
https://hg.mozilla.org/integration/autoland/rev/1adb317fdb23a986c09196e937af8421e112c4d1
Push with failures
Failure log
[task 2024-02-01T09:15:37.482Z] error: 'error' diagnostics seen but not expected:
[task 2024-02-01T09:15:37.482Z] File /builds/worker/checkouts/gecko/build/clang-plugin/tests/TestRefCountedThisInsideConstructor.cpp Line 4: 'nsCOMPtr.h' file not found
[task 2024-02-01T09:15:37.482Z] 15 errors generated.
[task 2024-02-01T09:15:37.482Z] gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:688: TestRefCountedThisInsideConstructor.o] Error 1
Assignee | ||
Updated•2 years ago
|
Comment 21•2 years ago
|
||
![]() |
||
Comment 22•2 years ago
|
||
Updated•2 years ago
|
Updated•1 years ago
|
Updated•1 year ago
|
Updated•11 months ago
|
Description
•