Closed Bug 999888 Opened 10 years ago Closed 10 years ago

Make ObserverLists not use mozilla::AtomicRefCounted

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)

References

Details

Attachments

(2 files, 4 obsolete files)

      No description provided.
Assignee: nobody → ehsan
Blocks: 991812
Attachment #8410728 - Flags: review?(nfroyd)
Attachment #8410728 - Flags: review?(nfroyd) → review+
So I was trying to figure out why our leak checking doesn't catch this until I found this horrible code which was added in bug 913653.  Aaron, can you please explain why?  This should have at least received my review... :(
Flags: needinfo?(aklotz)
Blocks: 913653
Attachment #8410996 - Attachment is obsolete: true
Comment on attachment 8410999 [details] [diff] [review]
Part 0: Prevent ObserverList objects from leaking; r=aklotz

This seems to work fine.
Attachment #8410999 - Flags: review?(aklotz)
Comment on attachment 8410999 [details] [diff] [review]
Part 0: Prevent ObserverList objects from leaking; r=aklotz

Oops, this has some debugging junk in it.
Attachment #8410999 - Attachment is obsolete: true
Attachment #8410999 - Flags: review?(aklotz)
Attachment #8411001 - Flags: review?(aklotz)
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #3)
> So I was trying to figure out why our leak checking doesn't catch this until
> I found this horrible code which was added in bug 913653.  Aaron, can you
> please explain why?  This should have at least received my review... :(
It's *supposed* to "leak." This code is intended to live until the process terminates, as the IOInterposer is intended to capture I/O right up until the end (to support shutdown write poisoning and other I/O logging). If it's plugged into leak checking then we will see false positives.
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] (Away until April 28) from comment #8)
> (In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment
> #3)
> > So I was trying to figure out why our leak checking doesn't catch this until
> > I found this horrible code which was added in bug 913653.  Aaron, can you
> > please explain why?  This should have at least received my review... :(
> It's *supposed* to "leak." This code is intended to live until the process
> terminates, as the IOInterposer is intended to capture I/O right up until
> the end (to support shutdown write poisoning and other I/O logging). If it's
> plugged into leak checking then we will see false positives.

Leak checking is not an optional feature that you can opt into!  If this is intent to leak, then you cannot make ObserverLists refcounted, and you need to roll your own manual lifetime rules for that object.  The existence of this code prevents us from removing the RefCounted class which has introduced several severe issues for us (including security sensitive bugs) so doing nothing here is not an option.

I see two ways forward.  If you think making ObserverLists not refcounted and manually managed is something that you can do very soon, then let's do that here in this bug.  If not, let's take my patch here which may break IOInterposer by making it miss the late I/O which might happen here and then fix that as a follow-up when it's not on the critical patch of getting rid of RefCounted.  What do you think?
Flags: needinfo?(nfroyd)
Flags: needinfo?(aklotz)
For the record, here's why this code is using refcounting:

1) ObserverLists are treated as being immutable so that they may be accessed concurrently without locking.
2) In order to add or remove observers a new copy of the ObserverLists must be made and modified. In order to publish the new list, the code takes a lock, sets the master list RefPtr to the new list and an atomic generation count is incremented.
3) Registered threads compare their thread-local generation count to the master generation count. If they do not match, they take the lock from (2) and update a thread-local RefPtr with the pointer to the new MasterList.
4) Since we use reference counting for the ObserverLists, any outdated lists are eventually destroyed once all threads have grabbed the new pointer; the reference counting isn't used to prevent the current ObserverList from leaking, but rather is used to destroy the old generations.

We can do a few things here:
1) Quick fix: Use std::shared_ptr for this code;
2) Revert back to locking for ObserverLists access;
3) Try to take your patch.

I think that what we do here depends on how you feel about using std::shared_ptr. If "Leak checking is not an optional feature that you can opt into" means no refcounting at all unless it uses our leak-checked refcounting, then I should just revert IOInterposer back to locking the ObserverLists (in desktop perf we're starting to use IOInterposer to do more aggressive I/O logging during builds, so I'd like to keep as much of this functional as I can).
Flags: needinfo?(aklotz) → needinfo?(ehsan)
We discussed this on IRC and the resolution seems to be doing normal refcounting and .forget()'ing the last nsRefPtr during opt builds in order to intentionally leak the last of these objects throughout shutdown.
Flags: needinfo?(ehsan)
Ehsan, this patch is very similar to the previous one that you wrote but I added a few additional things. If we move the IOInterposerInit instantiation a bit later in XREMain::XRE_main then we can clean everything up before refcount logging is shut down.
Attachment #8411001 - Attachment is obsolete: true
Attachment #8411001 - Flags: review?(aklotz)
Attachment #8414259 - Flags: review?(ehsan)
Flags: needinfo?(nfroyd)
Comment on attachment 8414259 [details] [diff] [review]
Make IOInterposer Leakcheck-friendly

Review of attachment 8414259 [details] [diff] [review]:
-----------------------------------------------------------------

r+ on the parts of the patch that I wrote ;-)  But I would defer the real review to Nathan because I don't actually understand this whole setup.  It's far way too complicated for my small brain!

::: xpcom/build/IOInterposer.cpp
@@ +488,5 @@
>  
>  void
>  IOInterposer::Clear()
>  {
> +#if defined(DEBUG) || defined(FORCE_BUILD_REFCNT_LOGGING)

It's really sad that we have to keep propagating this everywhere. :(
Attachment #8414259 - Flags: review?(nfroyd)
Attachment #8414259 - Flags: review?(ehsan)
Attachment #8414259 - Flags: feedback+
Comment on attachment 8414259 [details] [diff] [review]
Make IOInterposer Leakcheck-friendly

Review of attachment 8414259 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/build/IOInterposer.cpp
@@ +488,5 @@
>  
>  void
>  IOInterposer::Clear()
>  {
> +#if defined(DEBUG) || defined(FORCE_BUILD_REFCNT_LOGGING)

Can you add a comment here about why we are making this check?
Attachment #8414259 - Flags: review?(nfroyd) → review+
Aaron, please land your patch alongside with mine once you addressed comment 14.  Thanks a lot!
(In reply to Nathan Froyd (:froydnj) from comment #14)
> Comment on attachment 8414259 [details] [diff] [review]
> Make IOInterposer Leakcheck-friendly
> 
> Review of attachment 8414259 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: xpcom/build/IOInterposer.cpp
> @@ +488,5 @@
> >  
> >  void
> >  IOInterposer::Clear()
> >  {
> > +#if defined(DEBUG) || defined(FORCE_BUILD_REFCNT_LOGGING)
> 
> Can you add a comment here about why we are making this check?

Will do.

(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #15)
> Aaron, please land your patch alongside with mine once you addressed comment
> 14.  Thanks a lot!

Yep!
Summary: Make ObserverLists not use mozilla::AtomicRefCounted; r=froydnj → Make ObserverLists not use mozilla::AtomicRefCounted
Updated patch with comment. Carrying forward r+.
Attachment #8414259 - Attachment is obsolete: true
Attachment #8414597 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/1d166a3a207d
https://hg.mozilla.org/mozilla-central/rev/4033c3fdd78a
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
You need to log in before you can comment on or make changes to this bug.