Closed Bug 1180993 Opened 4 years ago Closed 4 years ago

Add an analysis to help catch unused return values of specific types

Categories

(Firefox Build System :: Source Code Analysis, defect)

defect
Not set

Tracking

(firefox42 fixed)

RESOLVED FIXED
mozilla42
Tracking Status
firefox42 --- fixed

People

(Reporter: ehsan, Assigned: Nika)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 5 obsolete files)

We have MOZ_WARN_UNUSED_RESULT to help warn (hopefully fatally) when a method declares that its return type needs to be checked, and a caller forgets to do that.

But sometimes there is a specific return type that can never be unused.  For example, in our code it is always an error to ignore a return type of already_AddRefed<T>, and failure to do so results in bugs such as bug 1180105 which are basically memory leaks.

We can add an analysis that lets us add an attribute to such types, and turn the pattern that went unnoticed in that bug into a compile-time error.
Michael, in case you're looking for more fun analyses to write!  :-)
You know me too well ehsan :)

This static analysis searches for unused expressions, and checks their type. If the type is a type which is annotated with MOZ_MUST_USE, it errors.

Currently, the analysis also produces errors for functions like `MustUse &foo()`. This is because `MustUse &` is technically the same type as `MustUse`. I can discriminate between the two, but I'm not sure if it's worth it.

I'm not sure if we would want the analysis to recursively traverse inheritance and sub-records to discover other values which are MOZ_MUST_USE. Currently the analysis does not do that.
Attachment #8630309 - Flags: feedback?(ehsan)
The analysis only found the leak fixed in bug 1180105. This leak has been also fixed in this patch (which should be rebased onto tip before it is applied). The other changes are, for the most part, due to intentional leaks.
Attachment #8630312 - Flags: review?(ehsan)
Assignee: nobody → michael
Comment on attachment 8630310 [details] [diff] [review]
Part 2: Add the MOZ_MUST_USE attribute to mfbt, and use it to verify usage of already_AddRefed

Review of attachment 8630310 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/Attributes.h
@@ +450,5 @@
>   *   a compile time error to call AddRef or Release on the return value of a
>   *   function.  This is intended to be used with operator->() of our smart
>   *   pointer classes to ensure that the refcount of an object wrapped in a
>   *   smart pointer is not manipulated directly.
> + * MOZ_MUST_USE: Applies to type declarations.  Makes it a compile time error to not

Let's give this a better name.  How about MOZ_RETURN_TYPE_MUST_BE_USED?
Attachment #8630310 - Flags: review?(ehsan) → review+
Comment on attachment 8630312 [details] [diff] [review]
Part 3: Correct use sites of functions which return already_AddRefed

Review of attachment 8630312 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/MultipartImage.cpp
@@ +39,5 @@
>    {
>      // Use GetFrame() to block until our image finishes decoding.
> +    nsRefPtr<SourceSurface> surface =
> +      mImage->GetFrame(imgIContainer::FRAME_CURRENT,
> +                       imgIContainer::FLAG_SYNC_DECODE);

Please remove this hunk!  :-)
Attachment #8630312 - Flags: review?(ehsan) → review+
(In reply to Michael Layzell [:mystor] from comment #2)
> Currently, the analysis also produces errors for functions like `MustUse
> &foo()`. This is because `MustUse &` is technically the same type as
> `MustUse`. I can discriminate between the two, but I'm not sure if it's
> worth it.

I think it's OK to keep it as is, since such return types would cause a temporary object to be created if the return value is unused.

> I'm not sure if we would want the analysis to recursively traverse
> inheritance and sub-records to discover other values which are MOZ_MUST_USE.
> Currently the analysis does not do that.

We can extend it to do that if we need to in the future, but that's unnecessary overhead right now, so let's keep it simple.
Comment on attachment 8630309 [details] [diff] [review]
Part 1: Add an analysis to help catch unused return values of specific types

Review of attachment 8630309 [details] [diff] [review]:
-----------------------------------------------------------------

I think you should try to rewrite this using matchers as noted below, so minusing, but the code itself is fine!

::: build/clang-plugin/clang-plugin.cpp
@@ +231,5 @@
> +    const Expr* E = dyn_cast_or_null<Expr>(stmt);
> +    if (E) {
> +      // XXX It would be nice if we could use getAsTagDecl,
> +      // but our version of clang is too old.
> +      // (getAsTagDecl would also cover enums etc.)

I only ever expect this analysis to be useful for objects, but this is a good point.  Perhaps file a follow-up bug for this?

@@ +328,5 @@
>  
>      return true;
>    }
> +
> +  bool VisitSwitchCase(SwitchCase* stmt) {

Instead of using visitors, let's use a callExpr() matcher which should cover all cases.  You may be able to combine that with callee() and returns() to do some cool stuff.  :-)

::: build/clang-plugin/tests/TestMustUse.cpp
@@ +1,1 @@
> +#define MOZ_MUST_USE __attribute__((annotate("moz_must_use")))

Please rename the macro accordingly.

@@ +24,5 @@
> +  producesMustUseRef(); // expected-error {{Unused MOZ_MUST_USE return value of type 'MustUse'}}
> +  producesMayUse();
> +  producesMayUsePointer();
> +  producesMayUseRef();
> +  {

Hmm, everything from here...

@@ +136,5 @@
> +    producesMustUseRef(); // expected-error {{Unused MOZ_MUST_USE return value of type 'MustUse'}}
> +    producesMayUse();
> +    producesMayUsePointer();
> +    producesMayUseRef();
> +  }

... to here seems to be needless repetition.  Is there a good reason why you included all of these?  If not, please remove them.

@@ +150,5 @@
> +  MustUse *b = producesMustUsePointer();
> +  MustUse &c = producesMustUseRef();
> +  MayUse d = producesMayUse();
> +  MayUse *e = producesMayUsePointer();
> +  MayUse &f = producesMayUseRef();

One interesting case that you need to error on in your analysis is something like below:

producesMustUse().foo();

(Assuming that you add a foo method to MustUse.)

This needs to error out, since the return value is ultimately dropped on the floor.

::: build/clang-plugin/tests/moz.build
@@ +9,5 @@
>      'TestCustomHeap.cpp',
>      'TestExplicitOperatorBool.cpp',
>      'TestGlobalClass.cpp',
>      'TestMustOverride.cpp',
> +    'TestMustUse.cpp',

Please rename the test accordingly.
Attachment #8630309 - Flags: feedback?(ehsan) → feedback-
(In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail, needinfo? me!) from comment #8)
> I think you should try to rewrite this using matchers as noted below, so
> minusing, but the code itself is fine!
The problem with using matchers is that this analysis cares about where the call expression is being found from. The code using matchers would look an awful lot like this, as there is no way to get the containing statement from an Stmt*. I need to instead match against the containing statement, and then find the callExpr from there.

That's why I chose to use the RecursiveASTVisitor (which is very similar to how the warning is implemented within Clang). This definitely _could_ be written using a series of matchers, but I actually feel like it may be more verbose than necessary, as I would need to have an anyOf with an internal node for each of these statement types, and then match against the callExpr inside of that.

> I only ever expect this analysis to be useful for objects, but this is a
> good point.  Perhaps file a follow-up bug for this?

Reason why I was thinking about other things is if we ever got to the point where we wanted to make it an error to drop an nsresult on the floor. nsresult isn't an object, so we would have to support enums etc.

> Hmm, everything from here...
[...]
> ... to here seems to be needless repetition.  Is there a good reason why you
> included all of these?  If not, please remove them.
I originally included them because I thought that they may be treated differently. I know know in reterospect that they are all CompoundStmt-s, so I can remove all but the first one.

> One interesting case that you need to error on in your analysis is something
> like below:
> 
> producesMustUse().foo();
> 
> (Assuming that you add a foo method to MustUse.)
> 
> This needs to error out, since the return value is ultimately dropped on the
> floor.

I don't understand why this should error out, as the return value is definitely "used", to call foo() on.
In the case of already_AddRefed, already_AddRefed.take() would definitely "use" the already_AddRefed. I see no reason why the implicit "this" argument of a method should be treated differently in this case than a function argument.
Flags: needinfo?(ehsan)
(In reply to Michael Layzell [:mystor] from comment #9)
> (In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail,
> needinfo? me!) from comment #8)
> > I think you should try to rewrite this using matchers as noted below, so
> > minusing, but the code itself is fine!
> The problem with using matchers is that this analysis cares about where the
> call expression is being found from. The code using matchers would look an
> awful lot like this, as there is no way to get the containing statement from
> an Stmt*. I need to instead match against the containing statement, and then
> find the callExpr from there.
> 
> That's why I chose to use the RecursiveASTVisitor (which is very similar to
> how the warning is implemented within Clang). This definitely _could_ be
> written using a series of matchers, but I actually feel like it may be more
> verbose than necessary, as I would need to have an anyOf with an internal
> node for each of these statement types, and then match against the callExpr
> inside of that.

OK, that's fair.

The thing that worries me though is how to demonstrate that these way we are matching exactly everything that we would with callExpr().  Can you please explain how that can be ensured?

> > I only ever expect this analysis to be useful for objects, but this is a
> > good point.  Perhaps file a follow-up bug for this?
> 
> Reason why I was thinking about other things is if we ever got to the point
> where we wanted to make it an error to drop an nsresult on the floor.

Maybe I'm pessimistic, but I am pretty sure that will never happen.  :-)  But more importantly:

> nsresult isn't an object, so we would have to support enums etc.

Even if we don't use it for nsresult, we may want to do it for other enums, so yes, I agree that we should fix this when the clang upgrade is finished.

> > Hmm, everything from here...
> [...]
> > ... to here seems to be needless repetition.  Is there a good reason why you
> > included all of these?  If not, please remove them.
> I originally included them because I thought that they may be treated
> differently. I know know in reterospect that they are all CompoundStmt-s, so
> I can remove all but the first one.

Cool!

> > One interesting case that you need to error on in your analysis is something
> > like below:
> > 
> > producesMustUse().foo();
> > 
> > (Assuming that you add a foo method to MustUse.)
> > 
> > This needs to error out, since the return value is ultimately dropped on the
> > floor.
> 
> I don't understand why this should error out, as the return value is
> definitely "used", to call foo() on.
> In the case of already_AddRefed, already_AddRefed.take() would definitely
> "use" the already_AddRefed. I see no reason why the implicit "this" argument
> of a method should be treated differently in this case than a function
> argument.

The reason why this is necessary is that we can't know what that method call does, and whether that's OK, so the safer thing is to reject all patterns that can cause a temporary return object to silently be destroyed, and that would still work well with the |unused <<| pattern.

(Note that this is not an issue for already_AddRefed, and in fact your analysis is fine for already_AddRefed, but I want to keep it more generic.)
Flags: needinfo?(ehsan)
An example of a method that would not be ok in the above is something like the below:

obtainResource().log("resource obtained"); // oops. we dropped the resource!
(In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail, needinfo? me!) from comment #11)
> An example of a method that would not be ok in the above is something like
> the below:
> 
> obtainResource().log("resource obtained"); // oops. we dropped the resource!

I don't really see the problem here. the resource was used - in order to log the string "resource obtained". The resource will then be cleaned up as per normal by destructors. In my opinion that method call is morally equivalent to:

    log(&obtainResource(), "resource obtained");

which I think is something we shouldn't be trying to prohibit, as that is an example of using the value which was returned from the function.

(In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail, needinfo? me!) from comment #10)
> The thing that worries me though is how to demonstrate that these way we are
> matching exactly everything that we would with callExpr().  Can you please
> explain how that can be ensured?

We don't want to match every call expression, instead we want to make every _unused_ call expression. The ast matcher which I wrote uses the same logic which clang uses internally to determine if an expression's value is unused - namely that an expression's value is unused if it exists in any of the locations which I am checking. In all other locations where an expression could exist, the value of the expression is being used in some way (And SEMA seems to agree with me on that). I don't want to find other instances of CallExpr, as they will be in positions where they are used, and I don't want things which are used.

> The reason why this is necessary is that we can't know what that method call
> does, and whether that's OK, so the safer thing is to reject all patterns
> that can cause a temporary return object to silently be destroyed, and that
> would still work well with the |unused <<| pattern.

In my opinion, what the method call does is "use" a reference to the object which was returned, meaning that we aren't dropping the value on the floor, rather we are using the value, and then cleaning up by running the destructor like usual.
I should add that this expression complains about _any_ unused expression of an annotated type, not just CallExprs, So if you produced one of these values through a different mechanism, and then didn't use it, it would also produce an error.
Duplicate of this bug: 771512
Duplicate of this bug: 421998
Improved the error message clarity, and added a note explaining how to handle the error if you actually don't want to use the value.
Attachment #8630309 - Attachment is obsolete: true
Attachment #8631597 - Flags: review?(ehsan)
Rebased onto tip
Attachment #8630312 - Attachment is obsolete: true
Comment on attachment 8631597 [details] [diff] [review]
Part 1: Add an analysis to help catch unused values of specific types

Review of attachment 8631597 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/clang-plugin/clang-plugin.cpp
@@ +238,5 @@
> +      if (decl) {
> +        decl = decl->getDefinition();
> +        if (decl && hasCustomAnnotation(decl, "moz_must_use")) {
> +          unsigned errorID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +            DiagnosticIDs::Error, "Unused MOZ_MUST_USE value of type %0");

Please improve the error message!  Something like:

Return values of type foo should always be used.

@@ +240,5 @@
> +        if (decl && hasCustomAnnotation(decl, "moz_must_use")) {
> +          unsigned errorID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +            DiagnosticIDs::Error, "Unused MOZ_MUST_USE value of type %0");
> +          unsigned noteID = Diag.getDiagnosticIDs()->getCustomDiagID(
> +            DiagnosticIDs::Note, "If you're sure you don't want to use it, try 'mozilla::unused << %0'");

This suggestion can be wrong in cases such as:

  for (; OopsIMustHaveBeenUsed();) {
    // ...
  }

Since operator<< returns void.  Maybe it's better to not suggest anything, since suggesting a correct code pattern is difficult.
Attachment #8631597 - Flags: review?(ehsan) → review+
(In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail, needinfo? me!) from comment #18)
> Please improve the error message!  Something like:
> 
> Return values of type foo should always be used.

The problem is that it's not just return values. How about "Values of type %0 must always be used"

> This suggestion can be wrong in cases such as:
> 
>   for (; OopsIMustHaveBeenUsed();) {
>     // ...
>   }
> 
> Since operator<< returns void.  Maybe it's better to not suggest anything,
> since suggesting a correct code pattern is difficult.

In that particular situation, the return value _is_ being used (It's being used as the condition variable in a for loop).
In every situation in which the return value isn't being used, such as in either of the other two expressions in the for loop, it would be legal for the value to be void (otherwise, the value would be being used). I think that this suggestion is always valid, however, I can remove it if you would like.
Changes to the plugin such that it builds on the version of clang running on try
Attachment #8631597 - Attachment is obsolete: true
Corrected linux-only test failure related to unused already_AddRefed
Attachment #8631601 - Attachment is obsolete: true
(In reply to Michael Layzell [:mystor] from comment #19)
> (In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail,
> needinfo? me!) from comment #18)
> > Please improve the error message!  Something like:
> > 
> > Return values of type foo should always be used.
> 
> The problem is that it's not just return values. How about "Values of type
> %0 must always be used"

OK.

> > This suggestion can be wrong in cases such as:
> > 
> >   for (; OopsIMustHaveBeenUsed();) {
> >     // ...
> >   }
> > 
> > Since operator<< returns void.  Maybe it's better to not suggest anything,
> > since suggesting a correct code pattern is difficult.
> 
> In that particular situation, the return value _is_ being used (It's being
> used as the condition variable in a for loop).
> In every situation in which the return value isn't being used, such as in
> either of the other two expressions in the for loop, it would be legal for
> the value to be void (otherwise, the value would be being used). I think
> that this suggestion is always valid, however, I can remove it if you would
> like.

I'm not convinced that it is always valid.

Most importantly, it seems like a poor choice to teach people how to work around this error.  In most cases, they will want to fix it properly.  Giving suggestions to people which automatically turn into memory leaks is not a good idea.

Please remove the suggestion.  :-)
(In reply to Michael Layzell [:mystor] from comment #20)
> Created attachment 8632130 [details] [diff] [review]
> Part 1: Add an analysis to help catch unused return values of specific types
> 
> Changes to the plugin such that it builds on the version of clang running on
> try

Note that bug 1123386 is waiting for review, so you can wait for it to land instead of working around these issues.
Removed the suggestion to use `mozilla::unused`
Attachment #8632130 - Attachment is obsolete: true
Depends on: 1123386
Keywords: checkin-needed
No longer depends on: 1123386
Blocks: 1183114
Blocks: 1185044
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.