Closed Bug 172030 (already_addrefed) Opened 22 years ago Closed 12 years ago

|nsCOMPtr|: add simplified syntax for temporary |nsCOMPtr|s in the middle complex expressions

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: john, Unassigned)

References

Details

Attachments

(1 file)

Outside of xpcom/glue, already_AddRefed<T> is *only* used in our codebase as a
temporary returned by other functions.  It is never declared as a variable type
anywhere, nor can I see any valid use case for this.  I propose we make this
usage official (it already essentially seems to be).  If we do this, there are
two things we can do:

- make get() private, only usable by friends of the class inside xpcom/glue. 
This makes the class much safer to use since the obvious thing for a
non-thinking consumer of a class to do when they just want the pointer back is
to do FunctionReturningAlreadyAddrefed().get().  There is no legitimate use case
(that I can see) for a temporary to use get() unless it is a helper class.

- make do_QueryInterface() work with already_AddRefed<T> by releasing the
already_AddRefed<T> when using it.  This allows a very common use of COMPtrs:
nsCOMPtr<T2> comptr = do_QueryInterface(functionReturningAlreadyAddRefed());

privatizing the default constructor and operator= may also help enforce the rule
of "no declaring these things on the stack".

Assigning to me since I'm willing to do this, but until my tree is clean of
layout history state redesign I won't get to it so others are welcome :)
Ouch!

Yes, you are correct in your observation that |already_AddRefed<T>| is only used
as a temporary, typically for crossing function boundaries e.g., as a return
value.  It is a rather ephemeral type which exists soley to annotate a pointer
with the extra information that it has already been |AddRef|ed with respect to
the ownership transfer in progress.  So this type is _very_ short lived.  It is
only useful during the evaluation of the expression which is manipulating
ownership, and no longer.  This usage is already official.

No user-defined conversion exists to |T*|, because that would defeat the
annotation.  Assignments would become ambiguous where both |already_AddRefed<T>|
and |T*| were allowed.  And such an automatic conversion would throw away the
information that this pointer had already been |AddRef|ed in the case where the
caller was written _only_ to accept a |T*|.  Only a public |get| allows
interaction with raw pointers, e.g.,

  nsIFoo* fp = functionReturningAlreadyAddRefed().get();

or particularly, when assigning into the canonical |result|.

You call

  nsCOMPtr<T2> comptr = do_QueryInterface(functionReturningAlreadyAddRefed())

a "very common use".  I hope not.  As I'm sure you knew when you wrote this,
this is a leak.  I know you knew that, because you are offering to fix the leak
:-)  I don't mind discussing a fix, but I don't think we should call this a
common use when it's currently a bug everywhere it's used; and the current
design of |do_QueryInterface| in fact prohibits it (with a compile time error)
when the inner called function actually does return |already_AddRefed<T>|.  Note
that bug #8221 specifically addresses this situation.  We should carefullly
consider fixing this leak as opposed to the current state of outlawing it,
because introducing such a fix will be the first time that any implicit behavior
has re-entered |nsCOMPtr| code.  Also, this would change the behavior from two
cases (inner function returns raw pointer, inner function returns
|already_AddRefed<T>|) of don't work or don't compile to sometimes compiles and
doesn't work, sometimes compiles and works.  I know this is close to an edge
case.  It still bears deep thought.

As this is just a temporary, privatizing the constructor would be disastrous, as
all instances of these exist on the stack as automatic variables.  Anyone can
make one to signal the extra information.  Taking away this capability would
make this class much less useful, and would stop our current code from compiling.

dbaron?  Your thoughts?

jkeiser: I'm not asserting ownership over |nsCOMPtr| :-)  I just want you to
know that the machinery in there is more complicated, evolved, and considered in
places than you might think.  And since |nsCOMPtr|s appear in almost every
function in our world, a little change can have a big impact.  Tread lightly
hear, and look before you leap.  Hope this helps.
> "Also, this would change the behavior from two cases .... sometimes compiles and
> works"

I'm not sure what this is saying... explain?

I think that this change is viable if we also eliminate functions that return
raw addrefed T*....  but yes, we should be careful.  ;)
Given

  T* f0();                   // result is already |AddRef|ed
  already_AddRefed<T> f1();  // result is obviously already |AddRef|ed

currently

  nsCOMPtr<U> up0 = do_QueryInterface(f0()); // compiles, leaks
  nsCOMPtr<U> up1 = do_QueryInterface(f1()); // doesn't compile

proposed

  nsCOMPtr<U> up0 = do_QueryInterface(f0()); // compiles, leaks
  nsCOMPtr<U> up1 = do_QueryInterface(f1()); // compiles, doesn't leak

Yes, if we made functions declared like |f0| illegal when they actually did
|AddRef| their result, then we would be OK (though there might be quite a few
functions to fix).  But ... currently, the theme of |nsCOMPtr| is that it makes
bad things impossible to say, it doesn't absorb bad things.  That is, you can't
call |AddRef| and |Release| on an |nsCOMPtr|.  It's illegal to try.  It is not
the case that |nsCOMPtr| simply `does the right thing' when you do this bad
action by hand.  Similarly with |do_QueryInterface| and the other type-safe
machinery.  They make it easy to say the right thing, and impossible to mix-up
types and say the wrong thing.  They don't automatically compensate for saying
the wrong thing.  In the first versions of |nsCOMPtr|, I went the `automatic'
route.  In particular, |nsCOMPtr| would accept assignment from _any_ pointer
type, and in cases where I couldn't prove I was directly being assigned the COM
correct interface pointer, I automatically and silently queried for the correct
one.  Thus `fixing' the callers mistake ... by turning their simple assignment
into a costly dynamic cast (via |QueryInterface|).  I thought this was a
fantastic idea, but I was convinced to try the opposite approach by two people
whose opinions, at the time, I didn't value very much.  The current approach is
the direct result.  No hidden operations.  Examining any piece of code that uses
|nsCOMPtr|s, you can easily tell _everything_ that will happen.  Polymorphism is
a good thing, but when it can easily hide big surprises it's a bad thing.

I'm not saying this case is as extreme as my initial design.  It's not.  The
only thing we may be hiding here is the fact that we may consume an input rather
than leak it.  But suddenly it means that you have to see the definition of the
inner function to know what's going to happen next.  Maybe the trade-off is
worth it, maybe not.  It is certainly the case that this is a change from
|nsCOMPtr|s current modus operandi.
Ah, ok.  So you'd rather not introduce a usage that looks like an incorrect one
but is correct, since people will be more likely to do the incorrect one.  Makes
sense.

Perhaps we should address the issue of outlawing such functions first, since
there are other obvious (to me) benefits of doing so -- maintainability,
clarity, etc.
You summarize it neatly.  Perhaps a little bit of an oversimplification, but
it'll do till something better comes along.

So exactly what things do we want to talk about outlawing?
What I am concerned about with get() is not the legitimate uses, but the
numerous illegitimate ones.  I think we are abusing get() here, whose contract
in the nsCOMPtr case is "get me a weak pointer".  get() for already_AddRefed
means "get me a strong pointer."  People used to the one are going to expect
good behavior out of the other.  However, removing get() would cause some cases
to be suboptimal, such as your example assigning to a raw strong pointer--it
would require an extra AddRef() and Release().

Perhaps instead a better name for the function is in order that told the user
"not only are you getting the pointer, you are taking *ownership* of that
pointer"?  Like .take().  We have few enough of these functions using
alread_AddRefed<T> that changing the callers will not be a chore if it comes to it.


I see your point with do_QueryInterface.  All other do_QueryInterface()s have
the contract "Takes the pointer in, casts with QueryInterface, and AddRefs the
result."  No "releases the pointer that comes in," even though this is a new
case it could be non-obvious that it won't leak.  Though the only consequence of
someone misunderstanding that piece of information is that they won't use this
particular syntax.

The common use I was talking about, BTW, was "QueryInterfacing the return value
of a function and putting into an nsCOMPtr," the specific example I gave is what
would be the common syntax for doing that if we made
do_QueryInterface(already_AddRefed<T>).  For now we can just do the ugly but
explicit "nsCOMPtr<T2> comptr =
do_QueryInterface(nsCOMPtr<T>(functionReturningAlreadyAddRefed()));" or split it
into two lines.


What bz is talking about outlawing, I believe, is "functions with strong T* as
their return value" ... use already_AddRefed<T> instead.
I see the difference in our thinking now!  I had always thought of |get| as
meaning ``give me the underlying raw pointer, free of any constraints''.  You've
rolled up thoughts of ownership into it.  If many people have the same view of
|get| that you do, then probably, some changes are in order.  However, nobody
should be calling |get| on an |nsCOMPtr|.  As far as I know, it's never needed.
 dbaron long ago fixed the comparison ambiguity for which |.get()| was a
resolution.  And |nsCOMPtr|s don't really `hand out' strong pointers or weak
pointers.  The destination of an assigment is strong or weak based on how it is
used, that is

  nsCOMPtr<T> tp0 = ...;

  nsCOMPtr<T> tp1 = tp0; // |tp1| is a strong (owning) pointer

  T* tp2 = tp0;          // |tp2| is a strong pointer
  NS_ADDREF(tp2);
  // ...
  NS_RELEASE(tp2);

  T* tp3 = tp0;          // |tp3| is a weak pointer

  nsCOMPtr<nsIWeakReferenc> tp4 = do_GetWeakReference(tp0);
                         // |tp4| is a weak pointer

|.get()| could have been applied in every one of these assignments.  In every
one of these assignments it would have been superfluous.  When you see someone
using |.get()| on an |nsCOMPtr|, you should ask them why and make them stop.

I see your point about |.take()|, it makes sense to me, I like the name. 
However, the |already_AddRefed<T>| doesn't, itself, actual have ownership of the
object.  When it goes out of scope, if you haven't assigned it into a managing
pointer of your own, the referent is leaked.  Still

  T* answer = someFunction().take();

reads pretty well.  |someFunction()| is producing an object with ownership,
|answer| is taking it.  Hmmm.  Now that I read over that expression, it looks
like |take()| is on the wrong side of the |=|.  In some projects, in fact
several that I've worked on, this concept is expressed with |yield()| and
|adopt()|, e.g.,

  answer.adopt( expr.yield() )

The facet of this that doesn't fit our world is that |nsCOMPtr|s are universally
|adopt|ing.  They always take ownership.  And since we are a reference counted
system, no object really |yield|s ownership ... hmmm, well, maybe
|already_AddRefed<T>| does, since it doesn't keep ownership for itself (as all
other functions, objects, and expressions do).  So maybe

  answer = someFunction().yield();

Of course, the common case of assigning into an |nsCOMPtr| just works, as it
would today, right?

  nsCOMPtr<T> answer = someFunction();

...or does my `polymorphism can hide too much work' and `the reader should know
what happens without looking up the declaraation' arguments work against me, and
actually make it wrong for |nsCOMPtr| to understand the difference between a
function returning |T*| and |already_AddRefed<T>|?  Now I've confused _myself_.
 I'm not sure my arguments in previous are consistent enough to be valid...

Here's what I like so far

  functions that return |AddRef|ed pointers as their function result
  should be illegal, in favor of returning |already_AddRefed<T>|

  |get| should be strongly recommended against, or else illegal for
  ordinary users on an |nsCOMPtr|

Here's what I think is in the realm of possibility

  maybe |do_QueryInterface( already_AddRefed<T> )| should not be an error,
  and should `do the right thing'

  maybe |already_AddRefed<T>::get()| is the wrong name; consider |.take()|
  and/or |.yield()|

Here's what I think we've agreed is no longer on the table

  removing |.get()| functionality (by whatever name) from |already_AddRefed|

  privatizing ctor's or assignment operators of |already_AddRefed|

Does my thinking match yours?  Did I mis-state anything?

P.S.: given

  template <class T>
  inline
  T*
  take( already_AddRefed<T>& tp )
    {
      return tp.get(); // or any name you like, here
    }

one can say

  T* answer = take( someFunction() );

while

  nsCOMPtr<T> answer = someFunction();

is still valid.
> However, nobody should be calling |get| on an |nsCOMPtr|.  As far as I know,
> it's never needed.

Two exceptions I can think of:
 * printf (in debugging code), or anything else that needs to pass the pointer
   value through |...|.
 * A few cases where the pointer is a parameter to a function template, e.g.,
   ::GetStyleData, to satisfy some weak compilers (Sun WorkShop and OS/2+AIX -
   see nsButtonFrameRenderer.cpp, revision 3.46).  (I need to figure out why
   |GetStyleData| is broken whereas |CallQueryInterface| isn't.)

Maybe I should go through the tree removing the rest so people stop seeing other
code using .get().
Yes, let's get rid of uneccessary uses of |.get()|.  jkeiser: what do you think
of the template function above, and what if its name were |take_ownership_of|,
so that one would say

  T* raw_owner = take_ownership_of( someFunction() );
I guess I'll start working on elimination of functions returning addrefed
pointers over in bug 171808.
I see T* t = mycomptr; as a weak pointer assignment whether T* is a strong ptr
or not--you strengthen it yourself with the addref.

> Here's what I like so far:
> functions that return |AddRef|ed pointers as their function result
> should be illegal, in favor of returning |already_AddRefed<T>|

Agreed.  We were discussing in the IRC channel the other day and it might be
nice to have a [strongptr] directive in IDL to designate that a type is a strong
pointer.  That way you could do [strongptr] native StrongTPtr(T); and later do
[notxpcom] StrongTPtr myFunc(); thus avoiding making a %{ C++ block.  Sort of
off topic but it would certainly facilitate enforcing this rule.  People will
grumble if they have to do %{ C++ inside a class.  It's ugly.

> |get| should be strongly recommended against, or else illegal for
> ordinary users on an |nsCOMPtr|

Absolutely agreed.  get() is hardly mentioned in the user manual.

> Here's what I think is in the realm of possibility
> maybe |do_QueryInterface( already_AddRefed<T> )| should not be an error,
> and should `do the right thing'

I am in favor of this, but I'll understand if you want to keep the "no hidden
operations" rule.  I think the rule should be "nsCOMPtr and friends should be as
easy to use as possible and as safe as possible, safety being more important
than ease of use."

> maybe |already_AddRefed<T>::get()| is the wrong name; consider |.take()|
> and/or |.yield()|

I love the take(ptr) idea, it is better than mine.  I think take_ownership_of()
is too verbose and doesn't add much semantically.  yield() would also be
acceptable, though I think take(ptr) is better and more natural to wrap around
the return value of a function.

> Here's what I think we've agreed is no longer on the table:
> removing |.get()| functionality (by whatever name) from |already_AddRefed|

Right, we still may need to get the raw pointer out.

> privatizing ctor's or assignment operators of |already_AddRefed|

I agree.  The automatic default constructor shouldn't be created since we made
other constructors, and the copy constructor is OK to have since you might want
to do already_AddRefed<T> myFunc() { return myFunc2(); } already_AddRefed<T>
myFunc2();
I found something else to add here ... for some reason on gcc 2.96 if you try to
directly cast an nsCOMPtr to already_AddRefed *when you are returning it* the
conversion does not work (I guess it doesn't want to do multiple implicit
conversions):

DOES NOT WORK:
already_AddRefed<nsISupports> func() { return nsCOMPtr<nsISupports>(nsnull); }
WORKS:
already_AddRefed<nsISupports> func() { return
already_AddRefed<nsISupports>(nsCOMPtr<nsISupports>(nsnull)); }

The same results occur when you are using a variable of that type and return it.

The current hackish evil workaround:
already_AddRefed<nsISupports>func() {
  nsISupports* var = myCOMPtr;
  NS_IF_ADDREF(var);
  return var;
}

Suggestions:
(1) make an already_AddRefed<T> do_AddRef(nsCOMPtr<T>) (then you can do return
do_AddRef(myCOMPtr))
(2) make a private already_AddRefed<T>(nsCOMPtr<T>) constructor to keep people
from using that constructor (it is baaad that it compiles on Windows, it's a
guaranteed leak!)
(3) make a private nsDerivedSafe<T> constructor too, since you should *never
never never* do already_AddRefed<T> func() { return myComPtr.get(); } (which was
my first impulse when I learned of this silliness, and took significant thought
before I realized just how leaky it was)

The basic thing: already_AddRefed<T> is not an obvious construct, and its
interaction with other constructs needs to be safer--to do that we can either
make most cases work as expected, or find ways to disallow things.

I originally wanted to make an already_AddRefed<T>(nsCOMPtr<T>) constructor
*public* to deal with the problem, but sicking pointed out that currently
nsCOMPtr is almost identical to an nsCOMPtr and this would be one more
difference (since raw ptrs you have to call AddRef on).  I can see keeping
unnecessary differences out when there is a non-ugly alternative like do_AddRef.
Status: NEW → ASSIGNED
Argghhh!  |already_AddRefed<T>| is only a type annotation for raw pointers.  Do
not apply this to |nsCOMPtr|s, which already obviously report their ownership
condition and perfectly manage it.  DO NOT RETURN |nsCOMPtr|s AS FUNCTION
RESULTS.  This is a bug waiting to happen.  See the |nsCOMPtr| user manual for
the long explanation

  http://www.mozilla.org/projects/xpcom/nsCOMPtr.html

In particular, since the local |nsCOMPtr| is destroyed in your examples, the
result ends up _not_ being |AddRef|ed.  That is, it may even be a dangling
pointer.  This is just one of the bad things that directly returning an
|nsCOMPtr| can lead to.  |already_AddRefed| should always and only be used as a
return type.  Your special version of a constructor would essentially turn
|already_AddRefed| into |nsCOMPtr| and make it a manager of ownership rather
than an annotation.  Thus it should always and only be applied to raw pointers
which you have already |AddRef|ed on behalf of the function caller.  Thus, an
|nsCOMPtr| or |nsCOMPtr<T>::get()| are completely innappropriate sources for an
|already_AddRefed<T>|, since neither of those has been |AddRef|ed on behalf of
the function result.  They have only been |AddRef|ed on behalf of the owning
interest embodied in that |nsCOMPtr|.

You keep saying |already_AddRefed| is a dangerous construct.  Please enumerate
the exact dangers that you think need to be addressed in a single comment.  Just
the dangers.  In this comment, do not propose any solutions as that will muddy
the question.  One comment listing just the dangers, please.  Then we'll address
them and see where we come out.
(And, in that comment, please try not to confuse whether issues could cause the
refcount to be too high or too low.)
There are two problems:

1. The following compiles on windows (and afaict should in gcc as well, but
doesn't for some reason)

already_Addrefed<nsIFoo>
myClass::myFunction()
{
    return mMyCOMPtr;
}

where myClass::mMyCOMPtr is an nsCOMPtr<nsIFoo>. Creating an already_AddRefed
out of an nsCOMPtr is a guarenteed low refcount.

2.
It is cubersome to create a function that returns an already_Addrefed if the
value that you want to return lives in an nsCOMPtr. You have to do something like

already_Addrefed<nsIFoo>
myClass::myFunction()
{
    nsIFoo* ptr = mMyCOMPtr;
    NS_IF_ADDREF(ptr);
    return ptr;
}


There's also a semi-problem: it's an easy misstake to do a function like

already_Addrefed<nsIFoo>
myClass::myFunction()
{
    return mMyRawPtr;
}


On request i'll post the suggested solution in a separate comment :-)
proposed solutions

For 1:
Make a private constructor like |already_Addrefed(nsCOMPtr<T>)|. Or possibly
|already_Addrefed(nsDerivedSafe<T>*)| to stop people from just adding a |.get()|
to get their code to compile.

For 2:
The following function:

template <class T>
inline
const already_AddRefed<T>
do_Addref( T* aRawPtr )
{
  NS_IF_ADDREF(aRawPtr);
  return aRawPtr;
}


For the semi-problem:
Make the already_AddRefed ctor private and make do_Addref, dont_Addref and
getter_Addrefs be friends. I can't think of any case where you would need to
create an already_AddRefed but can't use any of those functions.
when i say "make a private constructor" i mean "make a private constructor so
that it doesn't compile"
Private ctors outlawing construction from |nsCOMPtr| and/or from |nsDerivedSafe|
sound reasonable.  However, there are lots of ways to return an |AddRef|ed
pointer that we don't want to become impossible to say, for instance

  already_AddRefed<U>
  C::GetU()
  {
    U* result = 0;
      // |mMemberPtr| is a |T*| or perhaps |nsCOMPtr<T>|
      // where T is not the same type as U
    CallQueryInterface(mMemberPtr, &result);
    return result;
  }

So I don't think we'd want to make it illegal to construct an
|already_AddRefed<T>| from a raw |T*|.  I think I may like the idea of
|do_AddRef| (though note my fixed capitalization).  It's wierd that there would
be no |do_Release|, and we wouldn't want people using this in general to manage
refcounts.  In fact, we pretty much only want this to apply to function return
values.  Maybe |do_AddRefResult|.  And since one never returns |nsCOMPtr|s
directly, we make it illegal for an |nsCOMPtr|.  That leaves open whether we
should make it either illegal or a no-op when applied to something that is
already an |already_AddRefed<T>|.  Making it illegal matches our current model
(e.g., with |do_QueryInterface|):

  template <class T>
  inline
  const already_AddRefed<T>
  do_AddRefResult( T* aRawPtr )
    {
      NS_IF_ADDREF(aRawPtr);
      return aRawPtr;
    }

  template <class T>
  inline
  void
  do_AddRefResult( nsCOMPtr<T>& )
    {
      // This signature exists to stop you from calling |do_AddRefResult|
      //  on a managed pointer.
    }

  template <class T>
  inline
  void
  do_AddRefResult( already_AddRefed<T>& )
    {
      // This signature exists to stop you from calling |do_AddRefResult|
      //  on a pointer that claims to already be |AddRef|ed.
    }

This could still be used by people outside the context we intend... an ugly
macro could make that impossible

  #define ADDREF_AND_RETURN( expr ) \
    return _private_do_AddRefResult(expr)

suitably renaming our template functions (above) to indicate they should not be
called directly.  Of course I hate macros.  I only bring this idea up for
completeness.  Directly returning an |already_AddRefed<T>| in this macro forces
the caller to be correctly declared, else the statement won't compile.

This is all pretty close to what you (plural) have been suggesting, except
privatizing the constructors for |already_AddRefed|.  Does this address your
concerns?  Is my argument against privatization convincing?  Or is there more
information I haven't considered yet that will convince me they _do_ need to be
privatized?  And are there further dangers in |already_AddRefed| that need to be
examined?

I'll attach a patch as soon as we all agree on how much of this we want.
Your example

    ...
    CallQueryInterface(mMemberPtr, &result);
    return result;
  }

could still be archived by writing:

    ...
    CallQueryInterface(mMemberPtr, &result);
    return dont_AddRef(result);
  }

So you sould still be able to do everything without using already_AddRefeds ctor
directly. I'm still not sure how important it is though.


Why do you want to outlaw calling do_AddRefResult on a nsCOMPtr. Doing

    ...
    return do_AddRefResult(myNsCOMPtr);
  }

should work fine and whould follow the rule of making raw-pointers and
com-pointers behave the same.

Making it illigal to call do_AddRefResult on an already_AddRefed sounds like a
good idea though. But would we really need to explicitly make a ctor for it,
since a already_AddRefed can't be cast to a raw-pointer. Hmm.. or can it, by
going through an nsCOMPtr?
>    CallQueryInterface(mMemberPtr, &result);
>    return dont_AddRef(result);

Sorry, but that's a little odd-looking to me.  Why should I put a "dont_AddRef"
annotation on a situation that does not addref and should not addref?

Imo we should keep the sematics of "dont_AddRef is something you use when you
assign a raw pointer into an nsCOMPtr and don't want the nsCOMPtr to addref for
some reason".

yeah, it is neater to to be able to return a raw pointer directly. The only
advantage with making already_AddRefeds ctor private that i can see is that it
forces you to be explicit about how the refcounting should be done by calling
do_AddRef or dont_AddRef.

The argument for being extra explicit that i can see is that already_AddRefed is
by nature a bit dangerous, in that that it "owns" a pointer, but doesn't release
it when it goes away. But I'm myself unsure about how important it is to force
people to explicitly call dont_AddRef to produce the already_AddRefed, so i'm
willing to drop the suggestion.



I talked with jkeiser some more and came up with another suggestion. Scott, you
are a bit wairy about do_AddRef since people might start using it to do normal
addrefing such as:

  ...
  do_AddRef(myPtr);
  ...

is that correct?

We could fix that by doing something similar to what do_QueryInterface does,
have do_AddRef return an nsDoAddRef class and not do the actual addrefing until
that is cast to an already_AddRefed. So a simple call to do_AddRef would be a
no-op unless the already_AddRefed is used (such as returned). I'm not entierly
sure if this is a good or a bad thing, but at least it's something to consider.
I have spoken to Scott about this issue and we agreed that RETURN_AND_ADDREF and
do_AddRef.  However, I have been thinking and would like to take another tack on
already_AddRefed<T> entirely that makes the class more useful and a lot safer. 
How about we make an nsCOMPtrResult<T> that extends nsCOMPtr<T> and has all the
same semantics except:

1. it cannot be constructed as a variable on the stack (there are ways to ensure
this by privatizing constructors and requiring functions like do_AddRefResult()
and dont_AddRefResult().  Thus both the () and (T*) constructors are private. 
This will help enforce the semantics that nsCOMPtrResult<T> can only be used as
a temporary result from a function.

2. copying from nsCOMPtrResult<T> to nsCOMPtr<T> results in the nsCOMPtr taking
ownership of the pointer from the Result (and the result being nulled out, but
the user should not ever know this).  This allows similar syntax to what we have
now--assigning nsCOMPtr<T> ptr = functionReturningAlreadyAddrefed();

3. if no such copy ever occurred, nsCOMPtrResult<T> is released.  This does
several things for us automatically:
   - it completely erases any possibility of leaking that exists with
already_Addrefed--for example, you can lose the already_AddRefed or lose the
weak reference by doing .get() to make something compile (which is *totally
safe* with nsCOMPtr and not with already_AddRefed).
   - it allows us to do things like do_QueryInterface(function()) and friends in
a totally natural way, *without leaking*.

4. nsCOMPtrResult<T>(nsCOMPtrResult<T>&) would transfer ownership as well, for
the case of compilers where you end up creating a temporary before returning
from a function (or compilers which don't inline do_AddRef).  this could be
privatized completely if the nsCOMPtrResult<T>(T*) constructor is made public,
which is something I could see potentially changing.

Extending from nsCOMPtr<T> makes it possible to do even more exotic things with
it like functionReturningCOMPtrResult()->CallFunction(); (which can be really
really nice) and generally *treat it just like a COMPtr*.  There is no new
learning to do, you can apply everything you knew of COMPtr to it except that
you can't declare one on the stack.

What do you think?
I was so excited about the rest of the proposal that I didn't finish the first
sentence :)  It should have read "I have spoken to Scott about this issue and we
agreed that RETURN_AND_ADDREF and some form of take_COMPtr--he doesn't like that
word--are good enough to make things work for now."  But I think the new
proposal addresses what we *really* want to do :)
nsISupports* foo = GetFoo();

where GetFoo() returns an nsCOMPtrResult<nsISupports>

Does my object still exist after this call?  Or did it get destroyed?

(note that mailnews has some functions that actually return an nsCOMPtr<T> and I
don't like them because of this precise problem). 
Eek!

I had forgotten that you could cast from nsCOMPtr<T> to T*.  Your case is indeed
an important one.  We need to disallow that, which entails a few things:

(b) privatize the T* cast
(a) define do_QueryInterface() and friends for nsCOMPtrResult<T>--*these would
just forward to do_QueryInterface(ptr.get())* so that the functions perform the
identical, expected actions.

It just means that some of our functionality is not as free as expected, but
lines of code should not significantly increase since there are already
explicitly disallowed versions of do_QueryInterface and friends for
already_AddRefed<T>, which could go away.
Note that disabling that cast means that you will not be able to directly call
Foo(MyFunction()) without explicitly doing .get().  But I think requiring the
explicitness is worth it to keep the safety.  If something could go wrong, make
sure the user has to think about it before he can do it :)
One more thought that doesn't affect things too badly: some things don't make
sense to call on nsCOMPtrResult.  getter_AddRefs() is one of them.  The more I
think, the more it makes sense to name it this and make it *like* this, but
don't extend and kill functionality--build in functionality we *do* want.  This
may help us deal with potential finicky compiler problems with nsCOMPtrResult
too, and it will definitely help keep space problems to a minimum the way we
already do with already_AddRefed.

What we do want:
- .operator* (why not, eh)
- .operator->
- .get()
- do_QueryInterface()
- ADDREF_AND_RETURN or do_AddRefResult() - preferences anyone?
- dont_AddRef() - since we're subsuming other functionality we may as well take
this over too and obsolete already_AddRefed (this is a very similar class anyway).
- take_ownership_of() (.take() is in nsRefPtr, perhaps we could do that here? :)

What we don't want:
- constructor, copy constructor, .operator=
- cast to T*
- .get_address()
- anything that assumes assignment in COMPtr (assign*, getter_AddRefs ...)

Problem:
In order to work like nsCOMPtr, you have to be able to create an nsCOMPtr *from*
an nsCOMPtr_helper so that do_XXX works--those functions are within the system.
  Thus perhaps we could publicize the nsCOMPtrResult(nsCOMPtr_helper&); constructor.

In all reality, to work like nsCOMPtr you have to be able to construct from a
raw pointer, but that would make it really easy to declare on the stack.  A
slightly more restrictive idea that might be even better is to have a
nsCOMPtrResult<T>(nsCOMPtr<T>&) constructor and not have a T* constructor.

Ideas on how to solve the Constructor Quandary?  Are the docs Good Enough?  The
consequence of declaring an nsCOMPtrResult<T> on the stack become obvious when
you do this:

nsCOMPtrResult<T> ptr(Foo());
nsCOMPtr<T> ptr2 = ptr;
ptr->Blah(); // CRASH because we set ptr to null

Then again, it will become really obvious really quick that one should not do
this because ptr gets nulled out :)
OK, this patch creates an nsCOMPtrResult and the necessary tests.  I have
followed the following idiom for nsCOMPtrResult, which totally eliminated the
need for any newly named functions except one (take_ownership_of, which can be
debated and changed if we so desire).  When you return from a function, it
AddRefs what you return, because that is what a COMPtr does.  If you don't want
to AddRef, you do dont_AddRef, which will return an nsCOMPtrResult (though it
doesn't yet).

1. It's just like nsCOMPtr, except for rules 2-4.  This means that it will
automatically release itself after the current statement is done.
2. You should NEVER write "nsCOMPtrResult" anywhere except as a return value to
a function.  nsCOMPtrResult tries to make that hard (though it is by no means
impossible, there are several constructors out of necessity).  This also means
there are no assignment operators or functions that work on it.
3. The first time it is assigned to a COMPtr, it is nulled out.  This also
happens when you do take_ownership_of() and when you copy one nsCOMPtrResult to
another (like when you have nsCOMPtrResult<T> bar() { return foo(); }  This
prevents over-releasing and crashing.
4. It does not allow you to do things to it that would allow you to implicitly
keep the pointer around after the current statement (no cast, no operator*,
though the latter is controversial).  This prevents crashes.

The only suboptimal thing is when you do:
nsCOMPtrResult<T> foo() {
  nsCOMPtr<T> blah;
  ...
  return blah;
}

This does an unnecessary addref / release because it converts nsCOMPtr<T> to
T*, AddRefs it into the nsCOMPtrResult, and then Releases the COMPtr.  Given
rule #2 above, I think we are safe making a nsCOMPtrResult<T>(nsCOMPtr<T>&)
constructor that handles this special case by nulling out the nsCOMPtr and
transferring the reference to the nsCOMPtrResult.

Thoughts?
sweet mother of mercy
Let me start off by saying I'm finding a lot of frustration in this bug :-)

Let me make sure everybody is starting on the same page.  The model for
|nsCOMPtr| is that is EXACTLY like a raw pointer, except that it embodies
ownership.  C++ makes some syntax around this impossible, but still, this is how
|nsCOMPtr| is written and what we want its clients to think.

  - An |nsCOMPtr| is the same as a raw pointer (except that it embodies ownership).
  - Anyplace I can use a raw pointer (and where it would be an `owning'
pointer), I can use an |nsCOMPtr|.
  - Any legal action I perform on or with a raw pointer, I can do with an
|nsCOMPtr| and expect the same result (modulo the fact that I can't separate an
|nsCOMPtr| from it's idea of ownership).

Why this model instead of something `smarter'?  Because this is helpful and
accessible.  Ordinary programmers, once they `get' ownership (assuming they
already understand pointers in general) can start using |nsCOMPtr|s and
reasonably predict the results of any expression using them.  No, the API
doesn't perfectly convey this model... there are syntax problems.  But by and
large, this was my intent, and we've been successful in it.  The implementation
of |nsCOMPtr| is very complex.  Its use is quite simple.  My intent is to keep
this model.  Whenever it becomes possible to fix the API to better convey this
model, we have in the past and will in the future take those steps.  The more
clearly we etch this model in the mind of |nsCOMPtr|s clients, and the more
closely we implement it in the API, the more likely its users will be to write
correct, working, safe, efficient, predictable code.  So far, this model has
proved very successful.  |nsCOMPtr| has been helpful in our codebase and to a
large degree, I argue, because of its simplicity of use.

To summarize: the goals for further modifications of the |nsCOMPtr|
implementation and related machinery are to ensure a coherent presentation of a
simple and useful model.

Along this path, there will be errors.  Clients will write code that doesn't
work.  For purposes of this discussion, there are three reasons we care about
that incite broken client code.  (1) we will have bugs where we don't faithfully
implement the model we advertise; (2) we will have API problems where we can't
actually even advertise the model we espouse; (3) the programmer in question
will have the wrong model in mind.  All three of these are problems we will need
to address.  And just as we need to fix bugs in our code, we'll need to fix bugs
in our docs so programmers have the right model, and failing that, directly fix
the engineers in question to have the right model.

This preface is not neccessarily pertinent to our entire discussion, but I think
 it's important, and it let's you understand the background for comments I've
made earlier and the comments I've yet to make.


Second, simplicity is central.  Always start with the simplest solution that can
reasonably work.  This is a general rule which I always encourage, and which I
will naturally apply to our work in this bug.  Additional machinery must be
weighed against its utility.  If it doesn't carry its own weight, you're not
going to want it.  That doesn't mean complex implementations are always wrong. 
But they are always more fragile.  |nsCOMPtr| is a very complex implementation.
 It evolved this way to answer many problems, and to as closely as reasonable
implement the model described above, while at the same time, getting around many
compiler weaknesses across the range of platforms we support.  It's a great
tool, but don't kid yourself---|nsCOMPtr|s complexity makes it _very_ fragile. 
No one but a seasoned C++ language lawyer with good understanding of COM's
ownership rules can reasonably modify |nsCOMPtr|s implementation without risking
catastrophe.  That's a _bad_ thing.  I wish the implementation were simpler. 
And every time we're given the chance to simplify and remove something, we try
to do so.  Is the cost of the |nsCOMPtr|s complexity worth it?  I think so, but
we strive to keep it as simple as we reasonably can; and that means removing
work-arounds as compilers get smarter, and carefully weighing all additions to
the machinery.

How does this apply to the discourse in this bug?  Well, you're proposing a lot
of new machinery.  You're close to duplicating |nsCOMPtr| in its entirety. 
Again and again in this bug and over the phone, we keep asking the same
question: "What problems are you aiming to fix."  I'm going to go over the
problems you have mentioned, if I can extract them.  I'll voice these problems
as though _you_ are speaking.

  PROBLEM 1.  |already_AddRefed<T>| might accidently be used as something other
than a return value, this is not appropriate

  PROBLEM 2.  I want a simpler way to say

    nsCOMPtr<U> u = do_QueryInterface( nsCOMPtr<T>(make_a_T()) );

These two problems were your impetus for filing this bug.

  PROBLEM 3.  Programmers will make coding errors because |nsCOMPtr<T>::get()|
means `get me a weak pointer', while |already_AddRefed<T>::get()| means `get me
a strong pointer'.  Generalizing, people think of |already_AddRefed<T>| as an
|nsCOMPtr|, and it isn't.  This makes them write broken code.

  PROBLEM 4.  Many XPCOM patterns require a function to return an |AddRef|ed
result, the current form for saying that is too hard/ugly:

    // for pointers returned as function results
    answer = member;
    NS_IF_ADDREF(answer);
    return answer;

    // for pointers returned through function parameters
    *answer = member;
    NS_IF_ADDREF(*answer);
    return rv;

You suggest other problems, but those are within some of the proposed solutions.
 I believe the list above represents the core issues you wish to address.  Your
patch above primarily reflects your ideas on solving problem 3 above. 
Programmers think |already_AddRefed| is |nsCOMPtr|.  It's not, so let's make it
be |nsCOMPtr|.  Your solution is very complicated.  To a large degree, your plan
is to duplicate |nsCOMPtr|, but also to make significant changes to its
underlying model (it no longer acts like a raw pointer).  In fact, your new
model is a huge departure from anything we have in our codebase today.  Its not
totally outrageous, as it echoes decisions in |std::auto_ptr|.  Though note that
|std::auto_ptr| implements single ownership in a non-refcounted world, so it
isn't exactly a natural next step in our multi-owner refcounted world.

Before we even get into the details of your implementation, let's ask if we need
to solve this problem.  How many people are currently making mistakes with
|already_AddRefed|.  How many people have in their mind a model for
|already_AddRefed| that says it _is_ |nsCOMPtr| or else somehow manages
ownership itself.  Plainly, you have this model.  Plainly you are motivated to
`fix' it, I can only assume, because it has caused you to write code that didn't
work as you expected.  I've talked about the model I have for |nsCOMPtr|, above.
 Let me tell you what I intend by |already_AddRefed|.  |already_AddRefed| is a
type annotation on a raw pointer to convey as part of an API that the caller is
indeed returning a pointer already |AddRef|ed on the callers behalf.  That
annotation is useful to people just reading the APIs, and it's useful to
|nsCOMPtr|, because it conveys during assignment that no additional |AddRef| is
required.  |already_AddRefed| is an API annotation.  It is not a smart pointer.
 It does not manage ownership.  Anything you would have done with a raw pointer,
you will still have to do with an |already_AddRefed|, except that it brings with
it additional type information telling you how you should use it.  In practice,
where people always use |nsCOMPtr| as the destination of an assignment from a
function returning such a value ... expressions get slightly simpler

    nsCOMPtr<T> t = dont_AddRef( create_a_T() ); // becomes...
    nsCOMPtr<T> t = create_a_T();

In the hopefully rare case that someone intends to manage this with a raw
pointer, the code gets slightly more complicated

    T* t = create_a_T(); // becomes...
    T* t = create_a_T().get();

Does |nsCOMPtr<T>::get()| mean `give me a weak pointer'?  Does
|already_AddRefed<T>::get()| mean `give me a owning pointer'?  No.  In the
|nsCOMPtr| case, remember, an |nsCOMPtr| is supposed to be EXACTLY like a raw
pointer, but embodying ownership.  So an |nsCOMPtr| appearing in an expression
does not automatically share or provide ownership to any other object or caller.
 Ownership must be taken.  |p.get()| here doesn't mean give me a weak pointer
any more than |p| by itself does.  It means `I'm using this pointer, and
coincidentally, I need to strip off type information.'  To that end,
|nsCOMPtr<T>::get()| should be used only seldomly, in particular for passing
through varargs channels, and a few similar situations.  |nsCOMPtr<T>::get()| is
syntax outside our user model to get around C++ clumsiness.  It's name
originates from |std::auto_ptr|, but perhaps it would be better named something
like |nsCOMPtr<T>::as_raw_ptr()|.  It's appearence conveys no notion of
ownershiop, because ownership is in the eye of the beholder (that is, it's in
the management of the assigment destination).  Similarly
|already_AddRefed<T>::get()| is not intended to mean `give me an owning
pointer'.  And here, |as_raw_ptr()| would be as appropriate.  Once again, this
is a hack to get around C++ syntax and ambiguity issues.  On many compilers, if
|already_AddRefed<T>| had an automatic user-defined conversion to a |T*|,
assignment into an |nsCOMPtr<T>| would be ambiguous.  Then our _primary_ use of
this construct would fail.

OK.  Now we see the real problem.  The model I intended, and the model you have
in mind are very different.  How do we solve this problem?  One way is to say my
model is wrong, or less intuitive, or less desirable, and build machinery to
satisfy your model.  The totally opposite tack is to say your model is not the
one I intended, and better explain what I _did_ intend.  Close to that end, but
in the middle is to say the current implementation doesn't perfectly implement
my model, and to propose changes to fix it.  Let's examine the problems in the
current implementation:

  |already_AddRefed| as a type annotation, is only appropriate as a return value

That's true, and currently no one uses it as anything else.  Should we add
machinery to prevent the condition that doesn't yet occur?  That certainly
wouldn't be the simplest way to start.

  T* t = create_a_T().get()

doesn't really say what it means.  This doesn't convey the notion to the reader
that |create_a_T()| is returning ownership.  Maybe

  T* t = accept_ownership_of( create_a_T() );

Yeah, it's long... but you're not supposed to be using raw pointers in ownership
situations anyway.  If you are, you already have a problem.  One more reason to
ensure that this process isn't implicit.  The common case is with |nsCOMPtr|

  nsCOMPtr<T> t = create_a_T();

Do we want to add machinery to this statement?  Probably not.  This is the
common case.  If |create_a_T()| is declared in such a way that it makes its
notions of ownership clear (using a raw pointer result if it doesn't |AddRef| on
the callers behalf, and |already_AddRefed<T>| if it does), then this expression
functions perfectly and predictably.  Is there something there that needs to be
fixed?

Then there's the internal side of this problem.  How does |create_a_T()|
actually produce the |AddRef|ed value?  This is in the domain of problem 4, and
I think we've come to some reasonable conclusions about making that easier.  Are
our solutions to problem 4 vital bug-stopping fixes?  No, they are all about
convenience, and convenience only.

So where do we go from here?  Is my model natural, useful, helpful?  Is your
proposed model better?  What model does the typical client have?  If what
they're thinking doesn't match the implementation, will it be better to fix the
implementation to match (that is, if _everyone_ has the same `wrong' model)?  Or
to fix the API to better preach the model we have?  Or to fix the docs (or have
some doc at all)?  I'd like to start with the simplest fix, not almost
completely duplicating |nsCOMPtr| and adding an entirely new and different model
to the mix.  If there's already mis-understanding, adding more complexity is
_not_ going to be the answer.

To put it bluntly: are you the only one who thinks this way?  And if so, can I
fix _you_, and make only small code changes in the |nsCOMPtr| machinery to
account for the remaining differences (e.g., adding the return machinery
discussed in other comments, and |accept_ownership_of|).  It would suck to add a
ton of code (no matter how nice the additional code might be) to fix potential
bugs that were never really going to happen anyway.  While I admire your
enthusiasm, |nsCOMPtr|s appear on almost every line of code in our app ...
rushing headlong into major changes scares me.  A long phone conversation with
you that leads to you providing a complete implementation of something
completely outside the scope of our discussion scares the hell out of me. 
_Because_ |nsCOMPtr| is already complex, and therefore, fragile.

This is why over and over again you hear me and dbaron asking, essentially, what
is _exactly_ the broken thing that needs to be fixed.  I get the idea that you
are anxious to apply C++ magic to the problem.  I get this from your many
mentions of privatizing dtors and ctors and providing appropriate conversions
and whatnot.  Yes, C++ magic is fun, but if you describe your problems in terms
of the magic you intend to fix it with, you obscure the possibility of
significantly simpler fixes.  Before we can continue, _you_ need to do a
thorough, honest, and unbiased (as much as possible) review of your own thoughts
and opinions.

  - is your model right?
  - which are the real problems that really cause bugs
  - what is the simplest approach to a solution for those problems

So far, in my mind, the simplest solution appears to be conveying to _you_
personally what _my_ model is, and a few small code changes to fix some of this
implementatioon's shortcomings.  I'm not trying to stonewall you here, nor (I
hope) am I trying to stroke my own ego.  I want the best solution for the sake
of the code and its clients.  If you're right, I'm ready to proceed.  But this
discussion has focused very much on magical solutions and very little on proving
that there are real problems.  I love magic, but I refuse to apply any until I'm
sure it's needed.  I hope I've conveyed my thoughts usefully, completely, and
professionally.  I hope you've received them in the same way :-)
My apologies, I did not post my entire reasoning when I created this class. 
There *is* reasoning, the major points being that it is easier to learn and can
be used in more (real world!) situations than already_AddRefed<T> can.  In
summary, the ownership model of this object is what we are really discussing
right now, and based on what I believe people will want to do with the result of
a function, I think a stronger ownership model is necessary.  However, to come
to that conclusion I will go through our existing situation, make sure we're on
the same page with the current model, and go through specific cases that ought
to be fixed, as you have wisely suggested.

I hear you on the simplicity issue, but disagree because simplicity is not the
only point of nsCOMPtr--if a little more complexity will make common cases
easier and make learning easier, we should do it.


1. Hardly anybody currently uses already_AddRefed<T>.  (Before bz's checkins in
the last few weeks, IIRC, there were only two places it was used in the tree.) 
Therefore I believe it is a bit early to determine who has what model in their
head for ownership models and capabilities of already_AddRefed<T>. 
Specifically, we need to determine the properties of a "pointer to COM object
that is returned from a function."


2. I have the same model in my head of COMPtr, I believe: "it is usable just
like a raw pointer in all situations" is the ideal.  We may fall short of the
ideal in some places but we strive for it as much as possible.  You say
already_AddRefed tries to follow this rule, and for consistency I agree (though
perhaps not for the same reasons?): it should follow the same rules so that once
you understand the COMPtr model you can use other constructs in the module with
as little relearning as possible.

already_AddRefed<T>, however, mostly breaks this rule by disallowing many common
operations.  nsCOMPtrResult<T> breaks it as well, but breaks it less (see end of
comment).

already_AddRefed<T> IS like a raw pointer in two ways:
(a) you can do return (raw pointer); from a function returning
already_AddRefed<T> (see ownership rules below)
(b) you can assign into a COMPtr

already_AddRefed<T> is NOT like a raw pointer in many ways:
(a) you cannot assign into a raw pointer.
(b) you cannot use it in a function expecting T*
(c) you cannot use operator -> on it
(d) you cannot use operator* on it (thus it cannot be passed to a function
requiring T&)
(e) you cannot return (nsCOMPtr) the way you could if it were T*
(f) you cannot call QI on it

These are the main ones I am thinking of; these could all be good rules
depending on how one might want to use the object / not want to use the object.
 We will visit that in item 4.

My belief: the *implied* rule is "these pointer objects are like raw pointers in
all situations where it is *safe* to use them as raw pointers."  Don't let me
put words into your mouth, though.


3. Since already_AddRefed<T> is a new class, any *new* design decisions involved
in the class should be shown explicitly.  already_AddRefed<T> is different from
COMPtr in its ownership management.  This is totally fine, but requires you to
do a different set of things when you interact with it.

Things you have to do differently when using already_AddRefed<T> instead of
nsCOMPtr<T>:
(a) you MUST manage an already_AddRefed<T> yourself.  You cannot ignore the
result of a function returning it, for example.  You have to take it and release
it.  (This is not crucial, just a point.  We could always include it in the Rule
Of Things Not To Do When You Use already_AddRefed<T>)
(b) an already_AddRefed<T> is unlike COMPtr in that you have to addref first.

Now, whether that is *good* depends on how the class needs to be used.  We will
visit that shortly.


4. Let's quickly go through the list of things you will probably want to do
*often* returning from a function.

(a) return and AddRef a raw pointer
(b) return and AddRef an nsCOMPtr
(c) return but don't AddRef a raw pointer (if it's already addrefed)
(d) QueryInterface before returning
(e) grab the result from QueryElementAt or QueryReferent

(d) and (e) can be easily handled within already_AddRefed so there's no clear
advantage or disadvantage--if we decide they are important we should just add them.

(a-c) are a bit harder because depending on which ownership model you choose,
one or the other is a bit harder.  Pros and cons:

PROS OF MANAGED OWNERSHIP FOR RETURNING FROM A FUNCTION
- return-and-AddRef is easier than non-managed ownership
- same syntax as our existing model for nsCOMPtr (I think this is a big plus)
- in nsCOMPtrResult<>, at least, one can optimize away the extra AddRef/Release
involved in returning an nsCOMPtr and addrefing it.  A detail, but still, it's
there.

PROS OF NON-MANAGED OWNERSHIP
- easier to return-but-not-AddRef

It is *possible* to do everything you want to do with both models and at least
make it a simple operation.  I have stated my preference here, but it can be
argued both ways I think.  return-and-AddRef nor return-but-no-AddRef are going
to be equally prevalent, because of COM getter functions, though if we really
encourage everyone to make COMPtrs to store everything then return-and-AddRef
will be more prevalent.

5. Now, let's get down to figuring out what people may want to do with things
after they are returned from functions.  Once we agree on that, we can decide
what design best suits us from there.

(a) nsCOMPtr<IFoo> o = foo();
Possible with both.

(b) IFoo* o = foo();
There are reasons to disallow this in both models, and indeed it is disallowed
in both.

(c) function(foo()) (this includes do_QueryInterface(foo));
- IMPOSSIBLE to do with non-managed ownership--you have to assign to temporary
- Possible to do with managed ownership because the temporary returned will
release itself

(d) foo()->member():
This is an idiom that *does* happen in Java, though less prevalent in C++ it
would still be nice.  If we move to more getter functions like this it would be
cumbersome to do something like foo()->bar()->baz()->member() in a non-managed
ownership model.
- IMPOSSIBLE to do with non-managed ownership--you have to assign to temporary
- Possible to do with managed ownership because the temporary returned will
release itself

I think the last two cases, the first of which is a generalization of one of the
original problems, make a strong case for the managed ownership model.

6. Again I point out a minor point, but one worth raising anyway: it seems like
a very possible mistake to do the following if an API changes to add a return
value (this syntax will still compile):
foo(); // throw away return value

With already_AddRefed, this will break.  A minor point, but another one in favor
of the managed ownership model.

7. nsCOMPtrResult<T> seems like a good solution because it allows function calls
and member function calling and makes returning from a function similar to
COMPtr.  The con is that it is not as simple to write ... but given COMPtr's
solid track record I think we have a good interface and good code to work from.

Phew.  There you go :)
1. Remember, currently |already_AddRefed| is needed machinery _inside_
|nsCOMPtr|.  It's the basis of the implementation for |dont_AddRef| and
|getter_AddRefs|.

2. For a function |foo| that returns an |AddRef|ed pointer, assuming that

  T* tp = foo();

is illegal, and possibly syntactically denied.  Isn't it better just to declare
|foo| as

  const nsCOMPtr<T> foo();

Doesn't that have all the properties needed to address the points you raise and
with none of the implementation (omitting the problem of the raw pointer
assignment I called out above)?  Doesn't everything you want to do become possible?

  From inside |foo|:

  return aRawPtr; // |nsCOMPtr| result constructed from a raw pointer |AddRef|s
  return aCOMPtr; // |nsCOMPtr| result copy-constructed |AddRef|s
  return dont_AddRef(an_already_AddRefed_ptr);
  return do_QueryInterface(anyPtr);
  return do_QueryReferent(aWeakPtr);

  From outside |foo|:

  nsCOMPtr<T> t = foo();
  // ignoring raw pointer case for a moment
  DoSomething( foo() );
  nsCOMPtr<U> u = do_QueryInterface( foo() );
  foo()->some_member_function();
  foo();

So, ignoring the case of assignment into a raw pointer (for a moment, anyway),
doesn't just plain old |nsCOMPtr| seem like a better answer than adding special
machinery to solve at least _these_ problems?
Yes, nsCOMPtrResult is just like nsCOMPtr with these exceptions:

1. Removes cast to nsDerivedSafe<T>*.  We will leave this aside since it sounds
like you might have a plan to just disable that--but it seems to me that
requires a new class.

2. It can optimize away an extra AddRef and Release on the left-hand-side that
occurs in this situation (imagine foo()'s implementation is in a separate .cpp /
.o):
nsCOMPtr<IFoo> foo() { return new IFoo(); }
...
nsCOMPtr<IFoo> o;
o = foo();

In this case, the sequence is: (a) return value is constructed.  AddRef.  (b)
operator=(foo, return value) is called.  AddRef.  (c) return value is
destructed.  Release.  I have tested this and it does do an extra AddRef/Release.

YES, optimizing compilers can remove this entirely when the function is inlined
or even when you say nsCOMPtr<IFoo> o = foo() directly it actually copies the
memory from the return value into foo and therefore does not actually call the
copy constructor.  The point is, those are just optimizations--the thing C++
does in its most verbose case (which is a case that *can* happen as outlined
above) an extra AddRef and Release is performed.

Is this a strong enough reason to use a subclass?  I don't know.  But if we are
doing a new class for #1 (even if it's a subclass) we may as well do this
optimization too, and fritter away the AddRef and Release.

3. It can optimize away an extra AddRef / Release when you return an nsCOMPtr
from a function, but that is generally going to be handled by optimizing
compilers anyway (though you *do* have to make sure and do the optimization if
you create an nsCOMPtrResult<IFoo>, because it will call the constructor in that
case instead of just copying bits in memory):
nsCOMPtr<IFoo> foo() { nsCOMPtr<IFoo> myFoo; ... return myFoo; }
...
nsCOMPtr<IFoo> o = foo();

The sequence for a non-optimizing compiler is: (a) "myFoo" is constructed. 
AddRef.  (b) Return value on the stack is constructed with
nsCOMPtr<IFoo>(myFoo).  AddRef.  (c) "myFoo" is destructed.  Release.

Not that that particularly matters since compilers optimize, but the same
sequence happens when you return nsCOMPtr into an nsCOMPtrResult, which is where
it *does* get significant.

4. (this doesn't matter) it doesn't have the operators that assume
nsCOMPtrResult is a LHS like getter_AddRefs and operator=.  This was done to
have less stuff in the interface.  It is not by any means a priority or even a
necessity because callers shouldn't need to do that anyway.


The rest of this is about how you can replace already_AddRefed<> with nsCOMPtr
or nsCOMPtrResult even for internal use.  I believe we can get rid of
already_AddRefed<T> with no modifications to APIs if we use nsCOMPtrResult, and
we can even get rid of it with just nsCOMPtr.  Doing this has the admirable side
effect of making dont_AddRef(...).

Regarding dont_AddRef: this can be replaced with nsCOMPtr<> in all instances as
it is--you just have to have a way to initialize the nsCOMPtr without addrefing,
which can be done with a private constructor or what have you (I think it is
worthwhile to go the private-constructor-and-friend route instead of a separate
type because then you get rid of an entire templatized class).  Since it *is*
inlined, optimizing compilers can get rid of any extra copy construction.  If we
create a new class, though, I suggest using it (nsCOMPtrResult<> would serve,
for example).

Regarding getter_AddRefs: two of getter_AddRefs's incarnations are identical to
dont_AddRef.  The third is not.  First off, you *can* make all this work with
nsCOMPtr<> itself as well (and obviate the need for already_AddRefed internally)
but there is something else afoot here:

The two forms of getter_AddRefs() perform fundamentally different operations,
even outside of the fact that they return different types--the nsCOMPtr<> one
nulls out the RHS pointer whereas the others do not.  This seems like a bad idea
from an API design standpoint.  It also gives us trouble if we want to return
nsCOMPtr<> from functions, because now you may want to use the first form of
getter_AddRefs on the return of a getter the way you do for getters that return
T*.  Another class (nsCOMPtrResult) would solve that problem, but it seems like
a fundamental problem anyway.  Problem for this bug or later?  Depends whether
we think it's possible to return nsCOMPtr<> itself from functions.


So if this is what you are getting at, we could make nsCOMPtrResult<T> : public
nsCOMPtr<T> except hide the T* cast and add optimization and we'd be OK.  I
initially did that but then decided less crud in the interface was good--but
inheriting probably will make a little less bloat (I can test this), definitely
a good idea.
Downsides to using |nsCOMPtr| directly for this purpose, as pointed out by
jkeiser, are:

  - we haven't dealt with the (mentioned) assignment into a raw pointer

  - when the return value optimization cannot be applied, there will be an extra
|AddRef|, |Release| (because of the temporary and hopefully anonymous |nsCOMPtr|
within the function)

  - we don't want the result to be assignable, that is

    foo() = someOtherPointer

    should probably fail to compile.  I think this case is covered with the
|const| return value idiom.  I may be missing something, but I think this is a
|const|-correct situation.  If |nsCOMPtr| isn't |const|-correct for this
situation, it ought to be, and that would be a separate bug.

I think the return value optimization will be applied very often.  This
downgrades the severity of the second point.  Third point covered.  This leaves
the issue of assignment into a raw pointer as, I believe, the most important
downside to solve.

Note that a solution using |nsCOMPtr| or one using |nsCOMPtrResult| is
orthogonal to several of the other fixes we have discussed in this bug; and
_none_ of the proposed solutions eliminates the need for |already_AddRefed| as
internal machinery for |nsCOMPtr|.
Why do you want to get rid of |already_AddRefed| for |nsCOMPtr|s internal
machinery when it is simple and works.  It is the type annotation that triggers
the non-|AddRef|ing assignments and constructions, which is exactly what you are
suggesting.  How is your suggestion an improvement over the current machinery?
Why on earth would we go to trouble to hide the |nsDerivedSafe| conversion
machinery, which explicitly exists to prevent wrong usage?
I see that one of your reasons for replacing the |already_AddRefed| machinery is
that it would `get rid of one templated class'.  If that's a good idea, then
isn't it just as good to not _add_ a new templated class?  This is all assuming
an actual solution exists with private-constructor-and-friends.
The fact that there are two very different meanings for |getter_AddRefs| is a
genuine annoyance.  History is no excuse, but I'll supply it anyway. 
Originally, there were a pair of `wrappers' for the parameter return use

  getter_AddRefs(myCOMPtr)
  getter_doesnt_AddRef(myCOMPtr)

You picked the right one depending on the implementation of the getter in
question.  |getter_doesnt_AddRef| was very costly.  And the fact that routines
existed in both forms made it difficult to do the right thing either with
|nsCOMPtr|s or with raw pointers.  We solved the situation by outlawing routines
that return non-|AddRef|ed pointers through function parameters.  Additionally,
there were several `wrappers' for assigning into an |nsCOMPtr| directly

  myCOMPtr = do_QueryInterface(anyPtr)
  myCOMPtr = dont_QueryInterface(anyPtr)
  myCOMPtr = do_AddRef(anyPtr)
  myCOMPtr = dont_AddRef(anyPtr)

Strangely, and consistently, I found people doing this... when the source
pointer existed in an already |AddRef|ed form, people would say

  myCOMPtr = dont_AddRef(p)

when using a pointer |AddRef|ed as a result, they would first try

  myCOMPtr = getter_AddRefs(p)

and were surprised when it didn't work.  However, this was way back when the
user base of |nsCOMPtr|s could be counted on one hand.  It was a mistake to make
this homonym serve in this position.  A mistake you are now pointing out, but
that has been plaguing me for some time.  Beware, some API changes that seem
great to you right this instant may be of this form.  That is, they make perfect
sense to you, but don't fit the real model, and no one else will get them. 
People will learn them, but they will never make sense to them, and will be a
part of their mental model of the implementation only as exceptions.  Then
history labels your changes `a mistake'.

I'd love to have this fixed.  I'd hate to fix it myself.  It would require a
huge change to the sourcebase, a massive re-education, and for change in code
speed, size, or functionality.  Yes, it would be better and more correct, and
more understandable, probably.  But who is willing to pay that cost?
I still don't picture a lighter weight version of |dont_AddRef| that elides
|already_AddRefed|; and I don't think I understand your comment

  Doing this has the admirable side
  effect of making dont_AddRef(...).
This is my response to the first 3-4 comments.  Responses to the last 2 will
follow since I haven't read them yet :)

For nsCOMPtr, removing the cast to nsDerivedSafe<T>* is the same as saying
"removes raw pointer cast," and that is all I am saying with that point.  I do
not propose to remove the machinery altogether.  It is good machinery.

Moving to an nsCOMPtr<> derived type may be orthogonal to other fixes in this
bug (we covered a lot of ground) but I believe we can live without private
constructors (Don't Do That should be sufficient in this case since it's a very
well-defined rule: "only declare nsCOMPtrResult as a result of a function"). 
One thing we have not yet discussed (which we should) is that get() will become
more necessary in order to do things like function(foo().get());


Adding another templatized class is good in this case because it gives us
something we didn't have before: the ability to safely handle return values and
still do many things with them.  You add classes and functions when existing
functionality is insufficient.

Removing a templatized class without loss of functionality seems like a good
thing to me, if only because a developer of nsCOMPtr has one fewer idiom
(however trivial) to learn.  You can also get rid of some extra constructors and
global functions that deal with it.  Admittedly you're replacing some of these
with nsCOMPtrResult<> stuff, but as mentioned, nsCOMPtrResult does in fact give
us extra functionality and *if* the functionality is justified, so is the class.

I think we agree that if we move forward on the functionality in comment 31 we
can't just use nsCOMPtr and would have to use a different class--be it a
souped-up already_AddRefed<> or a dumbed-down nsCOMPtr type.
Ok, scc made a point in IRC that makes this model not work.  Namely, one of
rules (a very justified one) for the COM module is that you should not hide
expensive operations.  This means specifically that no implicit Release() should
happen unless there is a pretty high chance the reader of the code will know the
COMPtr is there.  For callers of a function like this, it is not necessarily
obvious that the result of the function is a COMPtr:

create_foo()->DoSomething(); // may destruct when the comptr goes away

I will have to think about it more, but I think a reasonable way to avoid
creating an explicit temporary variable or an ugly construction would be to have
a temp_COMPtr (probably there is a better name) function that acts like
dont_AddRef, making it easier to type something.

Possible syntax 1 (way too much extra stuff to do, and creates more thinking for
programmers since there are variables that have explicit names that could
possibly be used throughout the function):

nsCOMPtr<nsIMyFooInterface> foo = get_foo();
nsCOMPtr<nsIFoo> foo2 = do_QueryInterface(foo);

(The above situation is a tiny bit exaggerated in that I am trying to simulate
the scoping rules of the other statements and the rule that one should keep
COMPtrs alive as short a time as possible.)

Possible syntax 2 (a little ugly but solves the above problems):
nsCOMPtr<IFoo> foo = do_QueryInterface(nsCOMPtr<nsIMyFooInterface>(get_foo()));
// syntax gets old real quick

Possible syntax 3 (proposed):
nsCOMPtr<IFoo> foo = do_QueryInterface(temp_COMPtr(get_foo()));

It should be really explicit from this that a release will happen.

RETURN VALUES

I haven't thought out what to do yet for return values from the function in the
aftermath of the Great nsCOMPtrResult Heresy.  Probably something like scc
proposed earlier--do_AddRefResult or ADDREF_AND_RETURN.  I am pretty tired right
now so I'll leave it for tomorrow night.

I believe the list in comment 31 is a good list of things that one wants to be
able to do easily--even if you can't do them 100% automatically they should not
require a lot of red tape.  Not having an automatic release on the other end
pretty much makes a non-managed ownership model necessary, unfortunately.
This isn't a patch and the code is untested, but it would be something like this:

  /*
    Sometimes one wants to say something like

      nsCOMPtr<T> t = do_QueryInterface( f() );
 
    where |f()| returns a pointer that has already been |AddRef|ed,
    but is not the type you want.  In general this is a leak.  Where
    possible (and where |f()| is declared appropriately), the
    expression above turns into a syntax error to help protect you
    from this leak.  An alternative is to say

      nsCOMPtr<T> t = do_QueryInterface( nsCOMPtr<U>(f()) );

    which looks fine with unrealisticly short interface types like
    |T| and |U|, but could be annoying in a real situation

      nsCOMPtr<nsIInterfaceTypeA> =
        do_QueryInterface( nsCOMPtr<nsIInterfaceTypeB>( f() ) );
 
    The |temporary_nsCOMPtr| functions deduce the correct type for the
    temporary without explicitly re-stating it, and without hiding the
    existence of the temporary

      nsCOMPtr<T> t = do_QueryInterface( temporary_nsCOMPtr(f()) );

    These functions are all |inline| so hopefully the return value
    optimization will prevent this form from introducing an additional
    temporary.
   */
    
template <class T>
inline
const nsCOMPtr<T>
temporary_nsCOMPtr( const nsCOMPtr<T>& aSmartPtr )
  {
    return nsCOMPtr<T>(aSmartPtr);
  }

template <class T>
inline
const nsCOMPtr<T>
temporary_nsCOMPtr( T* aRawPtr )
  {
    return nsCOMPtr<T>(aRawPtr);
  }

template <class T>
inline
const nsCOMPtr<T>
temporary_nsCOMPtr( const already_AddRefed<T>& aSmartPtr )
  {
    return nsCOMPtr<T>(aSmartPtr);
  }

  /*
    While I could also define

      template <class T>
      const nsCOMPtr<T>
      temporary_nsCOMPtr( const nsCOMPtr_helper& );
    
    ...and similarly for |nsQueryInterface|, there are two reasons
    not to do this
    
    (1) since the return type can't be deduced from the parameter,
        the call-site would have to use the template specifying function
        call syntax, which is not supported on all of our compilers; and

    (2) even if you did use that syntax, the result is longer than
        and no improvement on the syntax this construct is designed
        to reduce, that is (compare)
        
          temporary_nsCOMPtr<nsISomeReallyLongTypeName>( do_QueryReferent(w) )
          
          nsCOMPtr<nsISomeReallyLongTypeName( do_QueryReferent(w) )

    For the same reason, I can't just declare _one_ version of this
    construct, e.g.,

      template <class T, class U>
      inline
      const nsCOMPtr<T>
      temporary_nsCOMPtr( const U& )
        // ...

    This form is great in that compile time conversion from |U| to
    |nsCOMPtr<T>| guarantees compatibility and expandability, but it
    suffers the same problems since the return type can't be deduced
    fromt the parameter.
   */


I'm going to take this bug, I'm also going to split it into multiple parts, one
for each major point.  I'm going to fix the summaries to better reflect the
evolution of thought in them; and finally, I'm going to collect them all
together (with other |nsCOMPtr| RFE's and bugs) under tracking bug # 178174.
Assignee: jkeiser → scc
Status: ASSIGNED → NEW
OS: Windows XP → All
Hardware: PC → All
Summary: Make already_AddRefed<T> less dangerous and more useful → |nsCOMPtr|: add simplified syntax for temporary |nsCOMPtr|s in the middle complex expressions
split off discussion of a convenience function to |AddRef| and return into bug
#178182
split off discussion of adding more descriptive syntax for assigning from an
|already_AddRefed| into a raw pointer into bug #178187
So, were these the three things that we all agreed on looking into at least?

  (1) adding machinery to make temporary |nsCOMPtr|s in the middle of complex
      expressions easier (discussion remains in this bug, #172030)

  (2) making uniform and prettier machinery for |AddRef|ing a result and
      returning it through an |already_AddRefed<T>| function result
      (discussion moved to bug #178182)

  (3) improving the interaction between |already_AddRefed<T>| and raw pointer
      assignment (discussion moved to bug #178187)

And here are the things we either have not yet come to any agreed course of
action on, or else abandoned

  (4) eliminate |already_AddRefed| in favor of some significantly smarter
      class that actually provides refcount management, not just type
      annotation

Have I missed points that we still need to split into new bugs?
  
Status: NEW → ASSIGNED
Severity: normal → enhancement
One more issue that needs to be covered is the issue of using an
already_AddRefed<T> in void context.  The issue being bad use like this:

{
  functionReturningAlreadyAddRefed();
}

This will leak the object pointed to since it throws out the already_AddRefed. 
nsCOMPtr functions like getter_AddRefs() and dont_AddRef() in void context will
have this same problem, though it is unlikely they will be called in such a way.

Potential Solution:

I don't believe there is anything we can do with already_AddRefed to *disallow*
this usage or make it work properly without changing the reference management
model.  However, it *is* possible to catch this condition in a debug build and
assert, and that would at least catch incorrect uses earlier (leaks are really
really hard to find).  Specifically we can have take_ownership_of() and
assignment into nsCOMPtr (and any other operations where the programmer
explicitly manages the reference) null out the already_AddRefed.  Then on its
destruction we do NS_ASSERTION(!mRawPtr, "Leak detected!  Must explicitly take
ownership of an already_AddRefed.");.

Scott, do you think this is worth doing?  We can file a bug if so.
Except for comment #47, I think you have covered the complete list of issues to
be discussed.  Also, I agree that a smarter class is not necessary for
now--let's see if improvements to the existing model can fix 80% of the problems
(probably it can).


Regarding temporary_nsCOMPtr, the implementation seems correct to me, though I
think the name unnecessarily verbose.  "temp" should be just as understandable
and memorable as "temporary", though nsCOMPtr seems more memorable than COMPtr
since people are used to typing that.  It is, however, better than
nsCOMPtr<nsICSSFrameConstructor>(...).  Alternatively it could be named
something more like other functions in the nsCOMPtr world:

- release_Later()
- delayed_Release()

These suggestions might come closer to a functional description, like
dont_AddRef() for example (which does not describe its return type but does
describe what the function is doing with ownership).  Maybe do_ or dont_ in
front of them would be good?
I like your suggestion to detect leaking through already_AddRefed.

As a potential other name for temporary_nsCOMPtr, how about "and_release"?

  nsCOMPtr<nsIInterfaceTypeA> = do_QueryInterface( and_release( f() ) );
I like it.  For experts like us, |and_Release| (note fixed capitalization) reads
very well in situ:

  |do_QueryInterface|, |and_/*then*/Release|, |/*the result of*/ f()|

On the other hand, |and_Release| does not adequately tell non-experts when, how,
or why to use it.  To the casual observer, how does |and_Release(x)| differ from
|Release(x)|, if such a thing existed, or from |NS_RELEASE(x)|, or
|NS_IF_RELEASE(x)|.  |temporary_nsCOMPtr| conveys precisely how it should be
used and why.  An excess of abstraction is not appropriate down here at the
molecular level.

|temporary_nsCOMPtr| fits perfectly with the style of names throughout the
|nsCOMPtr| machinery.  jkeiser: the appropriate capitalization for the names you
supplied would be

  Release_later
  delayed_Release

Capitalization in |nsCOMPtr| is used only to recapitulate existing type and
function names.  Consistent capitalization and consistent abbreviation (in the
case of the |nsCOMPtr| machinery, no abbreviations) leads to fewer typing
mistakes for clients.  Additionally, these two names don't convey the actual
intent (although they _don't_ suffer the same confusion with other |Release|
variations, as jag's |and_Release| does).  Closer to the mark would be something
like |ensure_Release|, or jag's |and_Release| (or |and_then_Release|, or
|and_afterwards_Release| ... ok, so _maybe_ |Release_later| is on the money :-).
 After all, early |Release| is not the problem we are correcting, but the total
failure to |Release| at all.

Comparing name lengths

  temporary_nsCOMPtr
  and_Release
  delayed_Release
  Release_later

I'd say that |temporary_nsCOMPtr| may be the longest, but it's certainly in the
same ball park.

In summary, I think |temporary_nsCOMPtr| has capitalization, abbreviation, and
form consistent with the current naming scheme of |nsCOMPtr| machinery; is not
of debilitating length; and of the choices examined to date, best conveys to the
client and casual reader exactly the right amount of detail about what machinery
will come into play, and why it might be needed.  A better name may come along
before we checkin a patch; but until and unless that happens, I'm ready to
proceed with this choice.
The short name I like the best so far is a variation on jag's suggestion:
|then_Release|.  It still doesn't convey the rules the way |temporary_nsCOMPtr|
does; and may still suffer the confusion problem.  It has the `good reading for
experts' property of jag's original suggestion

  |do_QueryInterface|, |then_Release|, |/*the result of*/ f()|
Filed bug #179275 in response to comment #47 above
I like one of your suggestions in comment #50 best: ensure_Release().  I like it
better than temporary_nsCOMPtr, in fact, because it is a functional description
that captures *why* the statement is there.

then_Release and and_Release are pretty good too, but I think they are going to
be strange to the casual new reader of the code.
Assignee: scc → nobody
Status: ASSIGNED → NEW
QA Contact: scc → xpcom
I don't think this is a useful bug or that we're going to change the comptr/already_AddRefed API at this point.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: