Closed Bug 1145056 Opened 9 years ago Closed 9 years ago

Coverity complains on every use of MutexAutoLock and GuardObjectNotifier

Categories

(Core :: MFBT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: bagder, Assigned: erahm)

References

(Blocks 1 open bug)

Details

(Keywords: coverity)

Attachments

(3 files, 4 obsolete files)

I've browsed the list of defects detected by coverity and a fair amount of the ones reported in the network area (where I've focused my attention) are all attributed to the use of MutexAutoLock, and the GuardObjectNotifier to be specific.

Here's an example straight from Coverity CID 1275905:

It identifies an "Explicit null dereferenced" in the nsSocketOutputStream::OnSocketReady() method in netwerk/base/nsSocketTransport2.cpp file on line 530:

        MutexAutoLock lock(mTransport->mLock);

 write_zero_model: GuardObjectNotifier sets <temporary>.mStatementDone to NULL.
 var_deref_model: ~GuardObjectNotifier dereferences null 

This report seems to be triggered like this because the "class GuardObjectNotifier" from mfbt/GuardObjects.h is made to deref the 'mStatementDone' pointer without checking for nullptr while being nullptr-initialized in the constructor.
(In reply to Daniel Stenberg from comment #0)
> This report seems to be triggered like this because the "class
> GuardObjectNotifier" from mfbt/GuardObjects.h is made to deref the
> 'mStatementDone' pointer without checking for nullptr while being
> nullptr-initialized in the constructor.

GuardObjectNotifier is only ever constructed as a default parameter to a constructor, which then passes it to the init method of a member GuardObjectNotificationReceiver, which initialises the pointer.

Unfortunately it does this by casting away const, which is presumably why Coverity can't see how mStatementDone changes. Maybe we can use && these days?

(I don't understand why an init method is used rather than the constructor of the receiver; as it is it seems that you can forget to init the receiver.)
Just as a follow-up, this could be the reason for as many as 900 or more the current Coverity warnings.

'MutexAutoLock lock(' is used at >900 places in C++ code in gecko right now and they all get one warning each.

AutoRestore also uses GuardObjectNotifier and is used over 100 times. Example CID 1275850 that points to line 1033 in netwerk/cache2/CacheIndex.cpp:

    AutoRestore<bool> saveRemovingAll(index->mRemovingAll);
Looks like it had some side effects.
Anyway, thanks for having a look, it is causing a lot of warnings in coverity.
Neil, any other idea? This is giving a lot of warnings in coverity.
Flags: needinfo?(neil)
The LinuxDbg 4 failures look like known intermittents, the JP failures...I'm not sure, but that might be infra issues, the OSX 10.6 bc1 and dt1 failures look like known intermittents. Some of these seem to have been fixed, I've push a run on top of the latest m-c and will do some retriggers if necessary:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=304712ba2fc0
Comment on attachment 8580070 [details] [diff] [review]
Possible patch

Review of attachment 8580070 [details] [diff] [review]:
-----------------------------------------------------------------

The rebased try push looks happy, Nathan do you have any feedback on Neil's approach?
Attachment #8580070 - Flags: feedback?(nfroyd)
Comment on attachment 8580070 [details] [diff] [review]
Possible patch

Review of attachment 8580070 [details] [diff] [review]:
-----------------------------------------------------------------

WFM.  Did we verify that using these macros still complains about the case they were inserted to complain about?  (Presumably there are no instances of such behavior in the tree, or in our tests.)

It's too bad we can't run Coverity on this change in isolation...
Attachment #8580070 - Flags: feedback?(nfroyd) → feedback+
(In reply to Sylvestre Ledru from comment #5)
> Neil, any other idea? This is giving a lot of warnings in coverity.

(In reply to Eric Rahm from comment #6)
> The LinuxDbg 4 failures look like known intermittents, the JP failures...I'm
> not sure, but that might be infra issues, the OSX 10.6 bc1 and dt1 failures
> look like known intermittents. Some of these seem to have been fixed, I've
> push a run on top of the latest m-c and will do some retriggers if necessary:

Thanks for picking this up for me; I'd only attached the patch as a proof of concept.
Flags: needinfo?(neil)
I have verified that running this test w/ |mach gtest Synchronization*| both before and after applying the proposed patch causes an assertion as expected, example output:

> [ RUN      ] Synchronization.Guard
> Assertion failure: mStatementDone, at ../../../../dist/include/mozilla/GuardObjects.h:101
Assignee: nobody → erahm
Status: NEW → ASSIGNED
This just updates Neil's patch with a proper description.
Attachment #8590434 - Flags: review?(nfroyd)
Attachment #8580070 - Attachment is obsolete: true
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #8)
> It's too bad we can't run Coverity on this change in isolation...
Indeed. The next check is in ~3 days.
Attachment #8590434 - Flags: review?(nfroyd) → review+
https://hg.mozilla.org/mozilla-central/rev/cafa3b191f6e
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
I triggered a new coverity build and it still complains about it.
(In reply to Sylvestre Ledru [:sylvestre] from comment #15)
> I triggered a new coverity build and it still complains about it.

First, thanks for doing that.

Second, what's the complaint now?  Same thing?
Yes, it does not consider the bug to be fixed.
If you don't have access, just apply, I will give you access.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Adding an assertion should be enough to satisfy coverity.
Attachment #8591946 - Flags: review?(nfroyd)
Comment on attachment 8591946 [details] [diff] [review]
Assert that the guard notifier has been initialized

Review of attachment 8591946 [details] [diff] [review]:
-----------------------------------------------------------------

I think the comment for this also needs some wordsmithing, but I will hold off on actual suggestions for now.

::: mfbt/GuardObjects.h
@@ +76,5 @@
>  public:
>    GuardObjectNotifier() : mStatementDone(nullptr) { }
>  
> +  ~GuardObjectNotifier() {
> +    /**

Uber style nit: we tend to use // comments for "inline" comments, and reserve /** */ comments for block comments before functions/method/etc...at least I think that's the style in mfbt/, and a quick skim seems to confirm.

@@ +79,5 @@
> +  ~GuardObjectNotifier() {
> +    /**
> +     * Assert that the guard has been properly initialized. This clears up a
> +     * static analysis warning due to the potential dereferencing of a null
> +     * pointer.

In the cold light of day, I'm curious whether this assert would actually convince Coverity that mStatementDone is non-null, or whether Coverity will just think that the conditional in the assert is always-taken, since it already thinks mStatementDone must be nullptr at this point.

Here's a semi-gross idea: why don't we initialize mStatementDone like so:

#include "mozilla/Poison.h"

GuardObjectNotifier() : mStatementDone(reinterpret_cast<void*>(mozPoisonValue())) { }

That way, Coverity has to assume mStatementDone might be anything (at least, I hope it sees us storing some semi-random value into mozPoisonValue() at program initialization), and will stop complaining.  I'm unsure if we'd want to do:

MOZ_ASSERT(mStatementDone != reinterpret_cast<void>*(mozPoisonValue()));

here; Coverity might be smart enough to see that mozPoisonValue() wouldn't have changed between the constructor and the destructor.  And since it thinks that mStatementDone doesn't change from the value set in the constructor, then it'd think the assert was always failing.

The important thing is that the different initialization value doesn't change the behavior of this class: if mStatementDone pointed to poison memory, we'd get a crash when storing to mStatementDone.

That initialization, of course, needs a comment.

WDYT?
Attachment #8591946 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #20)
> Comment on attachment 8591946 [details] [diff] [review]
> Assert that the guard notifier has been initialized
> 
> Review of attachment 8591946 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think the comment for this also needs some wordsmithing, but I will hold
> off on actual suggestions for now.
> 

Sure, I'm open to suggestions. I'm also fine just removing the comment, I don't think it particularly needs to be explained (we're just performing a null check, aka enforcing the usage requirements of this setup which has a huge comment above).

> ::: mfbt/GuardObjects.h
> @@ +76,5 @@
> >  public:
> >    GuardObjectNotifier() : mStatementDone(nullptr) { }
> >  
> > +  ~GuardObjectNotifier() {
> > +    /**
> 
> Uber style nit: we tend to use // comments for "inline" comments, and
> reserve /** */ comments for block comments before functions/method/etc...at
> least I think that's the style in mfbt/, and a quick skim seems to confirm.
> 

This follows the style of the file, I can just do a contrasting style if you prefer.

> @@ +79,5 @@
> > +  ~GuardObjectNotifier() {
> > +    /**
> > +     * Assert that the guard has been properly initialized. This clears up a
> > +     * static analysis warning due to the potential dereferencing of a null
> > +     * pointer.
> 
> In the cold light of day, I'm curious whether this assert would actually
> convince Coverity that mStatementDone is non-null, or whether Coverity will
> just think that the conditional in the assert is always-taken, since it
> already thinks mStatementDone must be nullptr at this point.
> 

It should work, Coverity handles our asserts rather well (at least in the past it has, and such a changed satisfied Coverity). A key point here is Coverity thinks mStatementDone *may* be nullptr (as in we may not have set the pointer yet). And it's correct, it's totally possible we don't do initialization properly (or at some point in the future we mess it up).

> Here's a semi-gross idea: why don't we initialize mStatementDone like so:
> 
> #include "mozilla/Poison.h"
> 
> GuardObjectNotifier() :
> mStatementDone(reinterpret_cast<void*>(mozPoisonValue())) { }
> 
> That way, Coverity has to assume mStatementDone might be anything (at least,
> I hope it sees us storing some semi-random value into mozPoisonValue() at
> program initialization), and will stop complaining.  I'm unsure if we'd want
> to do:
> 
> MOZ_ASSERT(mStatementDone != reinterpret_cast<void>*(mozPoisonValue()));
> 
> here; Coverity might be smart enough to see that mozPoisonValue() wouldn't
> have changed between the constructor and the destructor.  And since it
> thinks that mStatementDone doesn't change from the value set in the
> constructor, then it'd think the assert was always failing.
> 
> The important thing is that the different initialization value doesn't
> change the behavior of this class: if mStatementDone pointed to poison
> memory, we'd get a crash when storing to mStatementDone.
> 
> That initialization, of course, needs a comment.
> 
> WDYT?

I think that's probably overkill for something used in debug only code that should have been null checked.
Eric, can we try with this patch? Thanks
Flags: needinfo?(erahm)
(In reply to Sylvestre Ledru [:sylvestre] from comment #22)
> Eric, can we try with this patch? Thanks

Sure, I just need an r+ before I can land. Nathan are you okay with this as-is after discussing in IRC?
Flags: needinfo?(erahm) → needinfo?(nfroyd)
(In reply to Eric Rahm [:erahm] from comment #23)
> Sure, I just need an r+ before I can land. Nathan are you okay with this
> as-is after discussing in IRC?

I really do not think Coverity is going to like the patch.  The defect report certainly sounds like Coverity thinks mStatementDone is null, not just "might be null".  And given that assumption, I'd expect that adding an assert would just make Coverity think the assert always fails.

You mention in comment 21 that adding asserts has fixed Coverity complaining about null-pointer derefs before...can you point to examples?
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #24)
> (In reply to Eric Rahm [:erahm] from comment #23)
> > Sure, I just need an r+ before I can land. Nathan are you okay with this
> > as-is after discussing in IRC?
> 
> I really do not think Coverity is going to like the patch.  The defect
> report certainly sounds like Coverity thinks mStatementDone is null, not
> just "might be null".  And given that assumption, I'd expect that adding an
> assert would just make Coverity think the assert always fails.
> 
> You mention in comment 21 that adding asserts has fixed Coverity complaining
> about null-pointer derefs before...can you point to examples?

Bug 1142681 is one such example. It's not a 1:1 comparison, but shows that Coverity understands what the assertion guarantees (in this case that a value isn't null).
Flags: needinfo?(nfroyd)
(In reply to Eric Rahm [:erahm] from comment #25)
> (In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #24)
> > (In reply to Eric Rahm [:erahm] from comment #23)
> > > Sure, I just need an r+ before I can land. Nathan are you okay with this
> > > as-is after discussing in IRC?
> > 
> > I really do not think Coverity is going to like the patch.  The defect
> > report certainly sounds like Coverity thinks mStatementDone is null, not
> > just "might be null".  And given that assumption, I'd expect that adding an
> > assert would just make Coverity think the assert always fails.
> > 
> > You mention in comment 21 that adding asserts has fixed Coverity complaining
> > about null-pointer derefs before...can you point to examples?
> 
> Bug 1142681 is one such example. It's not a 1:1 comparison, but shows that
> Coverity understands what the assertion guarantees (in this case that a
> value isn't null).

Right, that's not the same thing.  Coverity looked at that code and saw that the arena could be non-null or null, then figured out there would be a leak if it was null.  The assert told Coverity that null wasn't a valid value for the pointer, so it could ignore that possibility.
Flags: needinfo?(nfroyd)
Looks like we're going to go with the poisoning recommended in comment 20, I don't have time to look into the more elaborate solution right now so if someone else wants to pick it up they're more than welcome to.
Assignee: erahm → nobody
In order to both verify that guard object notifiers are being properly used
and to silence a coverity warning about an explicit null dereference we
switch over to using a poison value rather than nullptr. An assertion is added
to make sure that the guard object notifier is properly initialized as well.
Attachment #8595651 - Flags: review?(nfroyd)
Assignee: nobody → erahm
Status: REOPENED → ASSIGNED
Attachment #8591946 - Attachment is obsolete: true
Comment on attachment 8595651 [details] [diff] [review]
Assert that the guard notifier has been initialized

Review of attachment 8595651 [details] [diff] [review]:
-----------------------------------------------------------------

I owe you at least one beer if Coverity is still complaining after this.
Attachment #8595651 - Flags: review?(nfroyd) → review+
We're going to need to update a few more build files, there's a whole lot of failure going on in that try push. At least:
  - memory/replace/dmd
  - mozglue/linker

Not sure where these map:

> gdb-tests.exe : fatal error LNK1120: 1 unresolved externals
> js.exe : fatal error LNK1120: 1 unresolved externals
> jsapi-tests.exe : fatal error LNK1120: 1 unresolved externals

Other failures appear to be AWS infrastructure related issues.
(In reply to Eric Rahm [:erahm] from comment #31)
> We're going to need to update a few more build files, there's a whole lot of
> failure going on in that try push. At least:
>   - memory/replace/dmd
>   - mozglue/linker
As the remaining issues are unrelated to the actual issue, I cherry-picked the patch and push Firefox build to coverity.
Great news:
       New defects found: 13
       Defects eliminated: 10974

Congrat!
Awesome work!
Updated build files to add mfbt dependencies.
Attachment #8595651 - Attachment is obsolete: true
Comment on attachment 8597487 [details] [diff] [review]
Assert that the guard notifier has been initialized

Review of attachment 8597487 [details] [diff] [review]:
-----------------------------------------------------------------

Carrying forward r+ for code changes from :froydnj.
Attachment #8597487 - Flags: review+
Comment on attachment 8597487 [details] [diff] [review]
Assert that the guard notifier has been initialized

:glandium, can you take a look at the moz.build files and lay out any specific changes you think need to happen?
Attachment #8597487 - Flags: review?(mh+mozilla)
Ok, now windows is busted b/c of too many symbols. 

Snippets:

>mozglue.lib(mozglue.dll) : error LNK2005: "bool __cdecl mozilla::IsFloat32Representable(double)" (?IsFloat32Representable@mozilla@@YA_NN@Z) already defined in Unified_cpp_mfbt0.obj
>   Creating library gdb-tests.lib and object gdb-tests.exp
>gdb-tests.exe : fatal error LNK1169: one or more multiply defined symbols found

>mozglue.lib(mozglue.dll) : error LNK2005: "bool __cdecl mozilla::IsFloat32Representable(double)" (?IsFloat32Representable@mozilla@@YA_NN@Z) already defined in js_static.lib(Unified_cpp_mfbt0.obj)
>   Creating library js.lib and object js.exp
>js.exe : fatal error LNK1169: one or more multiply defined symbols found

> mozglue.lib(mozglue.dll) : error LNK2005: "bool __cdecl mozilla::IsFloat32Representable(double)" (?IsFloat32Representable@mozilla@@YA_NN@Z) already defined in js_static.lib(Unified_cpp_mfbt0.obj)
>   Creating library jsapi-tests.lib and object jsapi-tests.exp
>jsapi-tests.exe : fatal error LNK1169: one or more multiply defined symbols found
Comment on attachment 8597487 [details] [diff] [review]
Assert that the guard notifier has been initialized

:glandium, looks like there is some more windows bustage due to mfbt symbols being multiply defined. Any specific suggestions on how to properly fix this?
Attachment #8597487 - Flags: review?(mh+mozilla) → feedback?(mh+mozilla)
Per glandium on IRC:

> erahm: for the android bustage, you need mfbt for TestZip, which is in mozglue/linker/tests, not mozglue/linker
> erahm: the js errors are likely due because you're not exporting your poison symbol
> https://dxr.mozilla.org/mozilla-central/source/mfbt/Poison.h#22 should make it exported, but it would seem it's not exported. you'd have to figure that out

So there's a mystery as to why the poison symbol isn't exported on windows. I don't have a windows machine to trace this down, so we'll have to deal with some try trial and error.
<glandium> erahm: looks like it *is* exported, so it must be some other fun problem with dllimport vs dllexport
<erahm> glandium: that is...unfortunate :(
<glandium> erahm: ok, it's because of the IMPL_MFBT defines
<glandium> in js/src/*/moz.build
<glandium> remove them
(snip)
<glandium> erahm: note that this is actually independent of your patch, that needs to be fixed, it's a latent bustage from bug 1134039
Attachment #8597487 - Flags: feedback?(mh+mozilla)
Nathan, given the cascading suck of build file issues would you be okay w/ me just #defining poison to some value that we agree on and moving on? ie:

> #define POISON 0xACACACAC
> mStatementDone(static_cast<bool*>(POISON))
Flags: needinfo?(nfroyd)
(In reply to Eric Rahm [:erahm] from comment #43)
> Nathan, given the cascading suck of build file issues would you be okay w/
> me just #defining poison to some value that we agree on and moving on? ie:
> 
> > #define POISON 0xACACACAC
> > mStatementDone(static_cast<bool*>(POISON))

Bleh.  So, basically, Windows and all the associated programs we build there are making our life difficult?

I guess it's not worth trying to rewrite how mfbt builds on Windows for the sake of a Linux64 static analysis.  I think MOZ_POISON should be uintptr_t(-1), and please write some explanatory comment for why we are making up our own poison value.

Thanks for sticking with this.  Sorry it turned out to be so complicated. :(
Flags: needinfo?(nfroyd)
Just using a #define now.
Attachment #8599993 - Flags: review?(nfroyd)
Attachment #8597487 - Attachment is obsolete: true
Blocks: 1160253
Comment on attachment 8590434 [details] [diff] [review]
Coverity complains on every use of MutexAutoLock and GuardObjectNotifier

Review of attachment 8590434 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/GuardObjects.h
@@ +10,5 @@
>  #define mozilla_GuardObjects_h
>  
>  #include "mozilla/Assertions.h"
>  #include "mozilla/Types.h"
> +#include "mozilla/Move.h"

M comes before T
(In reply to :Ms2ger from comment #47)
> Comment on attachment 8590434 [details] [diff] [review]
> Coverity complains on every use of MutexAutoLock and GuardObjectNotifier
> 
> Review of attachment 8590434 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mfbt/GuardObjects.h
> @@ +10,5 @@
> >  #define mozilla_GuardObjects_h
> >  
> >  #include "mozilla/Assertions.h"
> >  #include "mozilla/Types.h"
> > +#include "mozilla/Move.h"
> 
> M comes before T

That one already landed :) I can fix up in the follow up (attachment 8599993 [details] [diff] [review]) of course!
Comment on attachment 8599993 [details] [diff] [review]
Assert that the guard notifier has been initialized

Review of attachment 8599993 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/GuardObjects.h
@@ +19,5 @@
>  
> +/**
> + * A custom define is used rather than |mozPoisonValue()| due to cascading
> + * build failures relating to how mfbt is linked on different operating
> + * systems.

Please insert a "See bug 1160253." here.
Attachment #8599993 - Flags: review?(nfroyd) → review+
https://hg.mozilla.org/mozilla-central/rev/43e99677905e
Status: ASSIGNED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.