Open Bug 508163 Opened 15 years ago Updated 2 years ago

outparamdel nsBaseHashtable and subclasses

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

People

(Reporter: zwol, Unassigned)

References

Details

nsBaseHashtable::Get() is asking to be outparamdel'ed:

  PRBool Get(KeyType aKey, UserDataType* pData NS_OUTPARAM) const
  {
    EntryType* ent = GetEntry(aKey);

    if (!ent)
      return PR_FALSE;

    if (pData)
      *pData = ent->mData;

    return PR_TRUE;
  }

->

  UserDataType* Get(KeyType aKey) const
  {
    EntryType* ent = GetEntry(aKey);
    if (ent)
      return ent->mData;
    return NULL;
  }

Similarly for subclasses (nsDataHashtable, nsClassHashtable, nsInterfaceHashtable, nsRefPtrHashtable).  I'm not sure how big the caller rewrite would wind up being.
That signature isn't right, though: returning a UserDataType* would give you a pointer to the internal class (which may not always exist... the stored class is often different, e.g. UserDataType = nsIContent* but the stored type is nsCOMPtr<nsIContent>). And the instant you do anything with the hashtable (removing or adding) the pointer becomes invalid.
How is the existing signature not wrong, then?
The existing signature is PRBool Get(KeyType key, nsIContent** aResult);

And assigns to the users nsIContent* location.
If what you're saying is the rewritten function should return "UserDataType" instead of "UserDataType*", then yes, that was a silly mistake.

If you're saying something else, then I still don't get it; the implicit conversion to UserDataType done by the return statement should have *exactly* the same semantics as the implicit conversion to UserDataType done by "*pData = ent->mData".
If UserDataType is a pointer type and you can use NULL as a signal that the key is not present, that would work.... or you could pass in a default value.

In some cases the user data type is a large struct and using it as the return value is annoying... we'd probably want to keep both signatures.
The last sentence of comment 1 seems to be talking about either a pointer to unstable hashtable storage (but why is that the issue here?), or about failure to hold a strong ref on the return value (bug in caller). ???

/be
It only now dawns on me that UserDataType doesn't have to be a pointer type.  The API seems so pointer-centric that I thought anything else wouldn't work.  (In particular, how exactly do you Put() a large structure?)

The original suggestion is a lot less compelling if UserDataType can be something that doesn't have a convenient sentinel value, and as you say, for large structs.
OH. :-) Yeah, lack of null-like "no value" inhabiting UserDataType would break the assumption wanted here.

Is it worthwhile template-specializing for the pointer-type case?

/be
On reflection, what I really *want* (for things like bug 497495) is an interface analogous to STL (unordered_)map<K,V>::get() -- this returns an "iterator", which is either a pointer to V or a class that acts like a pointer to V.  This can be NULL (so NULL is available for the "no value" result); it avoids copying a large structure for return-by-value; it can be specialized to handle the difference between nsIContent* and nsCOMPtr<nsIContent>; and it would permit a GetOrAdd() that default-initializes and returns a new V if there wasn't one for that K.  All without out-parameters.
wrt comment 1, as long as the interface contract was "this function returns an iterator which will be invalidated after the next hashtable operation", then returning a pointer follows a generally established lookup pattern.  To catch misuses, the concrete UserDataType* iterator can be replaced by an abstract Iterator type that is equivalent to a pointer in release builds and asserts on use after invalidation in debug builds.  This has the advantage that it avoids the copying of large struct types, as is currently done.

But, if one is morbidly opposed to the potential for error with reference semantics, it is also simple and relatively cheap to do:

template <class T> struct maybe {
  bool has_value;
  T value;
  maybe() : has_value(false) {}
  maybe(const T &v) : has_value(true), value(v) {}
  operator ConvertibleToBool();
  T &operator*();
};

maybe<UserDataType> Get(KeyType aKey) const {
  if (!found)
    return maybe<UserDataType>();
  return maybe<UserDataType>(...);
}

which allows the user write:

if (maybe<nsIContent*> p = table.Get(key)) {
  ... p->GetDocument();
}
I'm certainly not opposed to the iterator interface, but I'm pretty sure it can't be trivially implemented with the current code: you'd have to update the iterators as the hashtable was modified. Bug 506410 has something along those lines, I think.

I am perfectly fine with a RawGet function which returns a DataType*, with appropriately scary warnings about misuse. I think that guarding with a wrapper class and assertions is probably too much overhead.
But we won't automatically rewrite to that signature, so moving this bug back to XPCOM.
Component: Rewriting and Analysis → XPCOM
QA Contact: rewriting-and-analysis → xpcom
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.