Closed Bug 223255 Opened 21 years ago Closed 14 years ago

deCOMtamination naming nit: no "Get" in infallible/non-null accessor names

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: brendan, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-complete)

See bug 222134 comment 24.  If you're deCOMtaminating an XPCOM interface pointer
or string getter, which returns a strong ref or owned copy, drop the Get from
the front of the method name:

-  nsCOMPtr<nsIDocument> parentDoc;
-  aDocument->GetParentDocument(getter_AddRefs(parentDoc));
+  nsIDocument *parentDoc = aDocument->ParentDocument();

Let "Get" connote "add a share in ownership" or "transfer ownership", and the
lack of "Get" connote "weak reference to named member".

It'd be good to get this meme spread about quickly, unless someone objects.

/be
ok, I'll use this convention in bug 221866 :)
Some things that we get are unambiguously nouns, but some aren't.  (Consider
GetStyleContent() vs. StyleContent().  Perhaps it should have been named
ContentStyle(), but |nsStyleContent* c = frame->GetContentStyle()| is a little
strange.)

We certainly haven't been doing this so far, and it might keep things more
consistent if we don't start, although I was originally leaning this way (but
roc wasn't, IIRC).
If your weak link accessor computes something and can fail (return null), then
it should be called something other than the member-like noun-name.  GetWeakFoo()?

Past naming mistakes are no excuse for other mistakes! ;-)

/be
We don't need to use method naming conventions to indicate whether a strong or
weak reference is returned. already_AddRefed does that for us.

We might want to classify methods returning pointers into the following three
classes by reading their signatures or call sites:
A) Methods that never fail and never return null.
B) Methods that never fail but may return a null result.
C) Methods that can fail and return null to indicate failure.

I don't believe that it is very important to distinguish B from C. The caller
has to handle null in either case, and how null should be handled depends not
just on whether the callee failed, but on whether the caller wants to recover,
whether the caller can handle a successful null result or it treats a successful
null result a failure anyway, etc. This can just be documented in the method's
header file.

I do agree that it would be helpful to identify methods of type A, methods that
cannot return null and are therefore safe in chains of
GetFoo()->GetBar()->GetBaz()->SayHelloKitty(). I'm not sure what the best way
would be to do that. I guess I could swallow brendan's suggestion and have
non-Get methods like PresContext(), StyleContext(), StyleContent() etc ... I
could live with noun-ambiguous names. For one thing, these methods would usually
be taking no parameters.

Such methods should assert that their result is non-null.

I fear getting from here to there, though. A search and replace script would
work for some names, but not for all.
> A) Methods that never fail and never return null.
> B) Methods that never fail but may return a null result.
> C) Methods that can fail and return null to indicate failure.


What I had in mind was distinguishing A from B and C via Foo() instead of GetFoo().

I agree on B and C collapsing into the same state that requires care in calling.

Given already_AddRefed (which is far from universally used), I agree that we
don't care about strong vs. weak -- except again, it is important to any reader
of doc->Foo()->Bar()->...() to know that Foo and Bar return weak refs.  And for
the reader, it helps to drop the Get.

Now too much of this can lead to Hungarian notation, which serves no purpose
with a good compiler other than to ugly-up names with type and mode cybercrud
prefixes.  But here, I'm not asking that unnecessary information be encoded in
the name -- I'm arguing for removal of a misleading verb prefix, Get.

It's true that Get is overloaded in OOP in general, but in Mozilla, we can make
it mean "fallibly compute or acquire some kind of strong ref" if we choose.  At
least, I think we can -- if we can keep pushing already_AddRefed, a latecomer to
the Mozilla party, we can promulgate this Foo() over GetFoo() naming convention.

As for who will do what, I volunteer to go over bryner's changes and make the
necessary patch, in a little while (after 1.5a is out).

/be
so should we differentiate between functions returning an already_AddRefed
(strong) and ones returning a rawpointer (weak)?
Jonas, there is no need to differentiate between them... name a reasonable use
case where such differentiation would help?  ;)
sicking: I think roc's point is that if people use already_AddRefed<T> for
strong ref return type, then the method naming convention can continue to be Get
(I agree: Get implies strong), and more to the point, people can't go wrong by
calling

  rv = doc->GetFoo()->GetBar(getter_AddRefs(bar))

if GetFoo returns an already_AddRefed, because the compiler will catch the mistake.

Readability doesn't weigh heavily here.  Some might prefer yet another naming
convention to distinguish GetFoo from GetBar, but I don't see the point, and I
think another convention would be (a) uglier; (b) not adopted widely or at least
consistently.

So, I think no new naming convention is needed for strong ref r.v. (via
already_AddRefed) vs. strong ref return via out param (GetBar in the example). 
Both are valid forms of getters; both have their uses, even though the GetFoo
type is not usable in a pure XPCOM interface.

But, if you mean should we distinguish between Foo(), which falls into roc's
class (A) and returns a weak ref (and never null) from a GetFoo() that returns
an already_AddRefed, then of course I agree!  I'm all in favor of this Foo()
convention. ;-)

/be
Fixing summary.

/be
Summary: deCOMtamination naming nit: no Get in weak-ref accessor names → deCOMtamination naming nit: no "Get" in infallible/non-null accessor names
I'd like to hear dbaron and bz voice an opinion before we seal this in blood.
I do feel that it would be good to differentiate A from (B or C).  I'm tired of
inconsistencies in whether things are null-checked... ;)

The actual convention we decide on for differentiating does not matter to me so
much, really.

One thing with the Foo() converntion is that we have existing classes in the
tree that use Foo() in somewhat (but maybe not quite) that way (eg nsXULElement,
some stuff in http, etc).  We would want to look over such uses, as possible,
and see whether they correspond to the new meaning invested in the convention.
There's also

D. Methods that may fail and may also return null as a non-failure condition.

I encountered one of these with my nsIDocument deCOMtamination.
Methods of type D should always return nsresult.

I like the short names for methods of type A since they'll often be used in a
chain. I say we go ahead with this. I will start the ball rolling by converting
some obvious methods to the new syntax.

I don't see what bz was referring to in nsXULElement...
Hmm ... now that I look at methods like "nsStyleStruct*
GetStyleData(nsStyleStructID aSID);" I start to wonder whether there are going
to be ambiguities. If a method happens to return a pointer and its name does not
start with "Get", one might just assume that it never returns null ... what what
if it's not actually a getter method? I guess it's most confusing when the name
is not obviously a verb, a la dbaron's comment #2.

So how about other naming conventions? Can we name methods starting with, say,
"The"?
nsIPresContext* ThePresContext()
{ return TheStyleContext()->TheRuleNode()->ThePresContext(); }

nsIPresContext* APresContext()
{ return AStyleContext()->ARuleNode()->APresContext(); }

nsIPresContext* _PresContext()
{ return _StyleContext()->_RuleNode()->_PresContext(); }

nsIPresContext* MPresContext()
{ return MStyleContext()->MRuleNode()->MPresContext(); }

Hmm, I dunno.
My gut feeling is that the problem already exists if functions have so similar
names but don't do similar things, independant of this newly proposed naming
convension. In such cases the function should simply get a different name.

I'm not sure which exact functions you are talking about so I could definitly be
wrong.
I have one more crazy idea: use GET for emphasis:

nsIPresContext* GETPresContext()
{ return GETStyleContext()->GETRuleNode()->GETPresContext(); }

"Go and fetch me a prescontext, son, and if you can't find one, DON'T BOTHER
COMING HOME"
Ah, yes, those would have to be changed.
GETMeOutOfThisUglyNamingConvention();

Please, can't we just use nouns and noun phrases?

/be
Except methods, denoted with (), are actions, which are verbs.
Naming them pseudo-imperatively as nouns is common enough abuse,
I suppose.  However, the rule:

"A method expressed as a noun, without prefix Get, acts to
 retrieve a strongly referenced object" (or the reverse)

while good for code brevity, and easy for mavens to understand,
is very hard for learners to absorb. This is because they are asked
to detect the absence of a meaningless qualifier as a trigger
to deducing the semantics.  That chore is not easy, concrete thinking
for them.

Slightly clearer is to use "Weak" instead of "Get" for weak references.
Both method kinds are getters, implicitly, so why use Get on only half of
them? That is inconsistent.  The business of verb abuse remains, but
at least with a "Weak" method prefix, the status of the object so
retrieved is blindingly clear. It is also a warning to the user that
invoking such a method results in an object that is not as formal as it
might otherwise be. "Weak" thus acts as a very meaningful qualifier.

I argue that the benefits of marking when an object must be "fully
referenced" are far outweighed by the benefits of making code more
accessible so that learners' eyes can gaze with confidence upon it.
Both masters may be served with "Weak", but not with "Get".
(Verbosity for learners?  Not for me; I'm a Unix/C hacker.  Weed out those who
can't keep up without crutch-words, Hungarian notation, and other such noise ;-)

A couple of counterpoints:

In general, learners are not edified for long before they become adept, but
everyone suffers from the verbosity, forever.  Is the Unix cp command better
named copy?  I think not.  That's a general point; on to specifics.

I challenge the assertion that "[the noun-phrase for weak-ref fetchers rule] is
very hard for learners to absorb."  What evidence is there for "hard", never
mind the otiose "very"?  On the contrary,

  rv = doc->BaseURI()->GetSpec(spec);

is less ambiguous, more concise, and arguably easier to learn than

  rv = doc->GetBaseURI()->GetSpec(spec);

where Get is used in two different, incompatible ways.

(And as all of us originally on this bug agreed, we don't want GetBaseURI, in
the above a method *not* returning an nsresult and not adding a reference, to be
confused with GetSpec, which is a pure XPCOM method (albeit with string result).
 GetFoo or a similar naming convention should mean XPCOM/strong; WeakFoo or Foo
should mean non-XPCOM/weak.)

So, on to "Weak" vs. "".

"Weak" is not a verb, so I'm not sure what happened to the first paragraph's
argument.  More, the benefit of Weak littering the code, when with naked data
member references (as in private/protected class/subclass code) we would have had:

  rv = doc->mBaseURI->GetSpec(spec);

which is more like the noun-phrase idea than any other proposal, seems to both
belabor the obvious, and make a naming distinction without a difference.  And
again, it taxes everyone outside the class implementation with the burden of
"Weak" noise substrings.

I think we still have consensus among the deCOMtaminators.  If so, I'll document
this under http://www.mozilla.org/projects/xpcom/.

/be
> Both method kinds are getters, implicitly

Not really.  The methods we are talking about here are what would just be member
accesses if it were not for possibly having to call cross-module and for this
pesky data protection thing people like to have.  ;)

Consider how the following code matches with the concept of methods as "verbs":

  foo->Bar() = "test";

Where Bar() is declared as:

  char*& Bar();

Think about what the code does and how easy it is to describe if the method is
considered to be a verb vs how easy the it is to describe if the method is
considered to be a noun.

In my opinion, blinkering yourself in the way you describe is highly detrimental
to understanding the code you are looking at; catering to such blinkered
thinking should not be an API design criterion.
Brendan, mind summarizing what we've agreed on in terms of the A,B,C,D
classification of comment 4 and comment 12?
A) Methods that never fail and never return null.
B) Methods that never fail but may return a null result.
C) Methods that can fail and return null to indicate failure.
D. Methods that may fail and may also return null as a non-failure condition.

Summary rule:
For A, use Foo().
For B, C, and D, use GetFoo().

As roc points out, the issue of weak vs. strong reference result is independent
of fallibility (Nigel, you wouldn't want us saying "InfalliblyWeakFoo()" or
"InfalliblyStrongFoo()", I trust!) and is type-checked by the compiler.  If you
have a case A where the result should be a strong ref, return an
already_AddRefed<T>.

Distinguishing B-D by naming convention seems fruitless to me, based on our
experience with XPCOM and deCOMtamination.

/be
What brendan said, but I have one further question. For methods of type D, which
should we prefer?

nsresult GetFoo(nsIFoo** aResult); // returns strong (addrefed) ref
nsIFoo* GetFoo(nsresult* aResult); // returns weak ref

? I think I prefer the latter for efficiency, although it departs further from
XPCOM.
Oh, yeah.  We do have agreement on the issue of methods of type A.  Make it so!

As for comment 26, I think I favor the second syntax.  As far as that goes,
would it make sense to have the methods handle a null nsresult* and to default
it to null, a la do_QueryInterface and company?  That will make the lives of
callers who don't care about why the pointer is null maximally simple.
> I challenge the assertion that "[the noun-phrase for weak-ref fetchers rule] is
> very hard for learners to absorb."  What evidence is there for "hard", never
> mind the otiose "very"?

Viewing others' code is an obvious scenario for example-driven learning.
Two fundamentals of example-driven learning are to be concrete and explicit.
The absence of "Get" implying extra semantics does not help either
fundamental. Say it as plain as a tree on a hill if you want it understood.

> Unix brevity.

With 23 years allegiance I agree. But brevity will not correct an
absence of concrete information. It just codes (in the symbolic sense)
information already present into less space. Nothing wrong with that; it
is no argument in favour of an reverse-absent indicator, though.

> What happened to the first para's argument.

I merely point out that there is a bigger picture, but declined arguing
it here since only the smaller picture "Get" / "Weak" is on-topic.

> Weak is not a noun

I proposed it as an occassional prefix, just as Get is proposed to
become an occassional prefix. prefixes are word-parts, not words
or nouns.

All this: ' "Weak" is not a verb ... "Weak" noise substrings.'

seems a little blurry. You propose to litter the code with Get and
yet equal littering with Weak (or some other term) is apparently worse
on some yet-to-be-defined basis.

On Boris' remarks: I accept the world is wide; and accessors may do
more than just access local members. And yet the beauty of COM/XPCOM is
in its implicit simplicity for *ordinary* programmers. At the C++
level there is plenty of punctuation soup to be had, but that shouldn't
be used to deconstruct the very common major cases that are to
suffer naming conventions under this bug. And yes, of course, constructors
and function/pointer references can have noun names. My remarks were
restricted to explicit stating of method names in a typical way.

Now [expects howls of anguish from everyone], it is natural that
XPIDL definitions will follow the lead of C++ uses of XPCOM at
some point - the trickle down effect. Yes I know there are
implementation classes that are separate in some cases.  The users
of scripted XPIDL are certainly not all UNIX lovers, they do not
embrace that ethos, and they may even be in majority Microserfs.
Regardless of their Lin/Win preference, many are less than code
wizards. If you want to promote Mozilla as a robust platform and
draw a developer crowd, this audience is important, because amongst
the horde are a few gems that will be very helpful folk.

Your decision here is not just a club decision about deCOMtaminating;
it has real impact on the future digestability of the platform.
There are more average programmers available to be enthusiastic about
Mozilla than there are highly competant ones; that is the market reality.
What goes for HTML and JavaScript goes for XPIDL, albeit to
a lesser degree.

Surely you do not want a large impedence mismatch
between a set of XPIDLs suitable for broad consumption and an
underlying implementation? That is foolish. Or do you propose
a formally maintained translation of C++ XPCOM semantics into
a semantically separate set of XPIDL names? I doubt that would
ever work except in the most restricted case; and it would be a
huge maintenance and performance worry.  I do not mean the
simple translation done be xpidlgen. And yes,I know that
weak references are handled extensively when XPconnect is
involved.

In all this I have yet to see one good argument why Get is
as good as Weak, evenfor C++ coders, except that 4 letters are
involved rather than 3, and that brevity suits a few ports of
Mozilla. At no time have I suggested long names are better. If I were to
substitute a proposal for "W" or "Wk" over "Get", then by all
logic here it should get up? I merely point out the fallaciousness
of debating a single character. It is the reverse-absence of
"Get" that is the most worrying, though. It raises the barrier
to access. It forces more menial beginner tasks onto your expert
time, because there will be fewer motivated learners helping out.
regarding comment 27: I don't like having the nsresult as an optional argument
for the very reason that it is easy to ignore. We already have way too much code
that doesn't do proper errorhandling so we should IMHO not encourage that.

IMHO If a function can fail but can also return null as success-value it should
always use:

nsresult GetFoo(nsIFoo** aResult);

as calling convention. I realize that it's going to be more refcounting, but i
doubt that that is going to make a big difference. It's much more important to
have code that functions well rather then to save those extra few percent in
performance.
> And yet the beauty of COM/XPCOM is in its implicit simplicity for *ordinary*
> programmers.

We're very explicitly NOT talking about COM/XPCOM here.  We're talking internal,
probably-non-virtual, probably-inlined, certainly-not-XPCOM methods.

> it is natural that XPIDL definitions will follow the lead of C++ uses of XPCOM
> at some point - the trickle down effect. 

1)  We're not talking about XPCOM
2)  The proposed syntax is (on purpose) fundamentally not compatible with XPIDL
    or the way XPCOM works, thus could not trickle out to XPIDL.

> And yes,I know that weak references are handled extensively when XPconnect is
> involved.

XPConnect is not involved in any way in any of the methods under discussion.

Please do your research on the background to this discussion before bringing in
a whole slew of irrelevancies, ok?
> For methods of type D, which should we prefer?
> 
> nsresult GetFoo(nsIFoo** aResult); // returns strong (addrefed) ref
> nsIFoo* GetFoo(nsresult* aResult); // returns weak ref

D) Methods that may fail and may also return null as a non-failure condition.

In light of bryner's definition of D, how is the nsIFoo* (nsresult* aResult)
form any more efficient?  The caller has to check aResult no matter the return
value (null does not mean failure).  I say use XPCOM for all (D) variants.  So I
agree with Sicking's comment 29.

/be
nigel: deCOMtamination means elimination of gratuitous XPCOM uses.  No one said
"Weak" loses to "Get" because it has one more character.  People use Get all
over, it's part of XPCOM (XPIDL => C++ prepends Get and Set to generate methods
for an attribute), we are used to it.  "Weak", we are not used to, and no one
likes it.

/be
No problems all. I realise I sound like some kind of space alien
when I occassionally contribute to these bugs, but I have
a goal of sorts, which (in part) is to stretch thinking out towards
learning consequences. If I ever manage to launch my rocket one
day I want you all to be used to me by then and not panic ;-).
My daily work practices are somewhat different to yours.

I have spent many years in the roles you occupy (but rarely in open
source), and I appreciate the frustration you have with me (in
the role I take here). You'll just have to assume my remarks aren't
in English for now.  Boris & Brendan, I do understand that
XPCOM is not "related" to the subject at hand in a simple API sense.
But my interest in relatedness, and in exposure of symbolic
information, is not always cast in terms of connected bits and bytes.

There's lots to argue here, especially many assumptions sprayed around,
but you've had a full dose of me, so I'll shush up for now.
>nsresult GetFoo(nsIFoo** aResult);

doesn't that make it equally easy to ignore the return value?
> In light of bryner's definition of D, how is the nsIFoo* (nsresult* aResult)
> form any more efficient?

Because you can return a weak pointer and avoid the cost of reference counting.

>>nsresult GetFoo(nsIFoo** aResult);
> doesn't that make it equally easy to ignore the return value?

Amen. In fact, I claim that if you use the "nsIFoo* GetFoo(nsresult*)" version,
it's harder to ignore the return value because (assuming we don't accept null
there) you have to provide a variable to put it in.
roc: you're right that the return value is more efficiently propagated than out
params, on many compiler/target-architectures (and never less efficiently).  The
issue of weak vs. strong reference seems irrelevant, again: one could return a
weak ref directly, or via the out param.  The only issue is efficiency, or so I
think you mean.

So, in the non-null weak pointer case, the caller does not need to load the
nsresult out param that it passed by address.  Only in the null return value
case must that result code be checked to disambiguate null-ok from null-failure.
This could make for a noticeable cycle savings over against the proper XPCOM
signature, depending on call frequency.

I'm not sure nsIFoo* (nsresult*) makes it harder to ignore the nsresult, though.
 If my interpretation of your efficiency argument is correct, people will ignore
the nsresult for non-null returns, and the only hazard is that they'll treat
null as null-ok always and thereby hide a failure.  That seems about as likely
to me as people "pulling a hyatt" ;-) and ignoring the nsresult r.v. of a proper
XPCOM signature.

Do we really want to document and promote this signature as yet another special
form?  What do others think?

/be
> The issue of weak vs. strong reference seems irrelevant, again: one could return
> a weak ref directly, or via the out param.

Except that we don't allow weak pointers to be returned in out params, currently.
IMHO we should not add that as a new signature. First of all i seriously doubt
that the extra addref/release will make a big differance. So far we havn't
really seen huge improvements speedwise from the deCOMtamination, at least of
content, and we've been changing a lot of functions, not just reducing
refcoutning but also converting virtual functions into inline ones.
I'm not saying it's a waste of effort, the code looks a lot nicer and there has
been some speedincrease. But I doubt that the few functions that fall into D
will make a difference, feel free to prove me wrong and i'll reconsider.

Second a function returning an nsresult should be indication enough for people
that errors need to be checked for. People seems to have fallen out of this
habit, but hopefully the deCOMtamination process will help this since we'll have
fewer functions that needlessly return an nsresult.
I basically agree, Jonas, I don't want to introduce a new style of signature, I
just want to make sure we make the decision for the right reasons.

Converting nsIFrame::GetFrameType to return a weak rather than a strong pointer
--- but leaving it virtual, since it is overridden in many places --- reduced
btek Tp by 1-2%. Those addrefs and releases do matter. However I agree with you
that functions of type D are not so common. (In fact it might be better to try
to avoid creating functions of type D if we can, because they're the most
complex for users to deal with.)
Re: comment 38: deCOMtamination can win significant time perf improvement, but
it always wins code size reductions.

Re: comment 39: I agree we should avoid type D, and not over-optimize it with
yet another special form in deCOMtaminated code.  Sounds like we're in agreement
again.  Should I write all this up and put it someplace on mozilla.org?  Perhaps
under projects/xpcom, but perhaps not, since it's deCOMtamination.  In case we
put the deCOMtamination doc elsewhere, a link from somewhere under
projects/xpcom would be good.

Rob, don't you have a deCOMtamination doc on your site?

/be
I wrote:

> [deCOMtamination] always wins code size reductions.

I meant object code, but source code becomes simpler too, substantially (the two
are related).  Just wanted to make sure no one is looking for cycle savings
only, or ahead of code size savings.

/be
"Com addresses software design in a very pragmatic way", from Box's Essential COM.

Removing COM from where it shouldn't have been can be and should be doc'ed under
"what not to do, what we are doing about it, and how you can help". :-)
Assignee: layout.misc-code → nobody
QA Contact: layout.misc-code
Blocks: deCOM
Should this bug still be open?
I don't think it needs to no.

Not sure what resolution to do here since this was more of a policy change than an actual specific code change. Marking FIXED since we've followed this policy fairly well since.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Could someone update

https://developer.mozilla.org/en/Mozilla_Coding_Style_Guide

to talk about noun-phrase-named infallible getters vs. GetFoo-named fallible ones?

/be
Keywords: dev-doc-needed
Product: Core → Core Graveyard
Component: Layout: Misc Code → Layout
Product: Core Graveyard → Core
You need to log in before you can comment on or make changes to this bug.