Open Bug 803439 Opened 7 years ago Updated 7 years ago

Various potential LinkedList interface enhancements/changes

Categories

(Core :: MFBT, defect)

defect
Not set

Tracking

()

mozilla19

People

(Reporter: Waldo, Unassigned)

References

Details

(Whiteboard: [Leave open])

Attachments

(2 files)

Some issues I noticed while converting the preflight cache to use LinkedList:

The preflight cache uses LinkedList to implement an LRU cache, so at one point there's code that removes a list element, then reinserts it at the front of the list -- entry->remove(), then list.insertFront(entry).  Would a moveToFront(entry) method make sense?

In one place the prclist code did grabbed the tail of the list, then removed it from the list, then briefly used the disconnected element.  The obvious translation would be list.getLast(), then remove(), then use.  That can be condensed in this particular case into list.popLast() then use.  In doing so I noticed that we're assuming the list isn't empty here.  If the list is empty, popLast() returns null.  Should there be a popLast() that asserts the list isn't empty?  Should popLast() just be changed, maybe?  Certainly nsAString::Last() asserts non-emptiness, and people live with that.  (Of necessity as there's no possible sentinel, to be sure, but still.)

I think the preflight cache is a singleton, so in this case it probably doesn't matter.  But in general one might imagine some concern about a list element, believed to be in one list, actually residing in another -- and removing the element removes it from that other list.  It might be nice if there were a remove method that got passed the believed-containing list, and asserted that that list actually contained the element.

Which kind of leads into another issue.  Should there maybe be a method that says whether an element exists in a particular list?  Since this would be a linear-time operation, if this were thought useful, we might call it linearContains to make clear its complexity.

Anyway.  Thoughts on all this appreciated, and please tell me if we already considered any of these.  :-)  (I'm pretty sure we never considered the first, but my memory's hazy on the others.)
> Would a moveToFront(entry) method make sense?

Maybe, if this is a really common operation.  OTOH I believe in parsimonious APIs, and remove(); insertFront(); isn't awful.

I might like it more if you could do entry->moveToFront(), but we don't have the requisite pointers to do that.  :)

> Should there be a popLast() that asserts the list isn't empty?

Again, in the name of parsimony, I don't see why you can't just MOZ_ASSERT it yourself.  Note that if you ever dereference this pointer, you're asserting that it's non-null.  I kind of doubt that it's so common to popLast() and not dereference that we need a whole new function for that.

> It might be nice if there were a remove method that got passed the believed-containing 
> list, and asserted that that list actually contained the element.

I'd be OK adding a method like debugOnlyAssertIsInList(list) (rather than baking this into remove()).  It has to be debug-only because supporting that method either requires an extra pointer in the linked list or requires us to walk the whole list.

> Should there maybe be a method that says whether an element exists in a particular 
> list?

I don't think this is a common use-case except for debugging.  For people who care, they can add a linked-list pointer to the class which inherits from LinkedListElement.
(In reply to Justin Lebar [:jlebar] from comment #1)
> > Would a moveToFront(entry) method make sense?
> 
> Maybe, if this is a really common operation. OTOH I believe in parsimonious
> APIs, and remove(); insertFront(); isn't awful.

Perhaps. Nobody else would benefit from this yet (maybe this'll change when all the prclist users have been removed), so probably this is the cautious route for now.

> > Should there be a popLast() that asserts the list isn't empty?
> 
> Again, in the name of parsimony, I don't see why you can't just MOZ_ASSERT
> it yourself.

This is true. But assertions work best when you don't have to remember to do them. Here, I expect a lot of people will implicitly assume the acted-upon list is non-empty but won't go out of their way to check/assert it. Developers are lazy; they're more likely to call a special method than to add an assert and "bloat" their code.

Looking at popFirst/popLast callers, it looks like almost an even split between ones that assume the list isn't empty and the ones that don't.

And, of course, parsimony isn't actually sufficient to say we shouldn't have small helpers. If it were, we wouldn't have popFirst or popLast, since those are just compositions of getFirst/getLast and remove. :-)

> Note that if you ever dereference this pointer, you're
> asserting that it's non-null. I kind of doubt that it's so common to
> popLast() and not dereference that we need a whole new function for that.

The value of an explicit assertion is it's more understandable than an implicit null-deref crash. It's also the earliest place to diagnose the problem. Consider also that the pointer won't necessarily be immediately dereferenced -- it might be some time later, when the list's emptiness is not so easily determined.

> > It might be nice if there were a remove method that got passed the believed-containing 
> > list, and asserted that that list actually contained the element.
> 
> I'd be OK adding a method like debugOnlyAssertIsInList(list) (rather than
> baking this into remove()). It has to be debug-only because supporting that
> method either requires an extra pointer in the linked list or requires us to
> walk the whole list.

Certainly, the remove method would test for existence in the list only in debug builds.

Patches for these tweaks to follow.
Some of the maybe* callers are just performing an action on every element of the list while clearing the list.  It's possible those cases would be better off with element types being classes with a destructor, that would perform that action (this wasn't possible with prclist, of course), but that'd be a separate change.
Attachment #676819 - Flags: review?(justin.lebar+bug)
The webgl places that have comments saying which list is being removed from seem like a clear argument for removeFrom(list).  (I had named it remove initially, but there seems little reason to overload if it's not necessary.  Saving four characters doesn't seem worth it.)

A single-platform try run for both of these patches:

https://tbpl.mozilla.org/?tree=Try&rev=2e69eff9b2f1
Attachment #676821 - Flags: review?(justin.lebar+bug)
Comment on attachment 676821 [details] [diff] [review]
Add a removeFrom(list) method that debug-asserts the element is in list

This is righteous, although I feel compelled to continue beating the 8 lines of context horse.  :)
Attachment #676821 - Flags: review?(justin.lebar+bug) → review+
Comment on attachment 676821 [details] [diff] [review]
Add a removeFrom(list) method that debug-asserts the element is in list

> +    /* Identical to remove(), but also asserts that this element is in list. */

Can you explicitly say "asserts in debug builds" so there's no confusion?
Comment on attachment 676819 [details] [diff] [review]
Make popFirst/popLast assert non-empty, add maybePopFirst/Last for places that don't mind empty lists

Sorry to keep you in suspense on this one all afternoon; I wanted to give this one some thought.

I think my main hang-up with this patch is over the fact that getFirst() and getNext() return null if the list is empty, but popFirst() asserts.  This change to popFirst() doesn't conform to the aesthetic I was trying for in this interface where methods returned null to indicate that there's no element there.

To be consistent, we'd need to rename getFirst() and getNext() to maybeGetFirst() and maybeGetNext().  But I hope you dislike that, because I sure do.

Looking at the callers, I don't disagree strongly that it would be helpful to have a method which does what this patch's popFirst() does.  But it's not clear to me how to do it in a way that's consistent with the interface, and I feel like consistency here is quite important.

I'll see if I feel the same way in the morning?
> +      list.assertContains(static_cast<const T*>(this));

Shouldn't this call asT()?
Comment on attachment 676819 [details] [diff] [review]
Make popFirst/popLast assert non-empty, add maybePopFirst/Last for places that don't mind empty lists

If you called it popFirstNotNull(), I guess that would be an OK compromise to me.  How would you feel about that?
Comment on attachment 676819 [details] [diff] [review]
Make popFirst/popLast assert non-empty, add maybePopFirst/Last for places that don't mind empty lists

Cancelling the r? while we discuss.
Attachment #676819 - Flags: review?(justin.lebar+bug)
No worries about suspense, I have enough stuff to do that I'm almost never in a rush.  :-)

Hmm.  I get the getFirst inconsistency bit.  Perhaps instead of getFirst and such, we should do it the way Gecko does it and have first() and getFirst(): the former never returning null (maybe asserting), the latter maybe returning null.  Or maybe the former returning a reference, and the latter returning a nullable pointer.

The fundamental issue, the part that smells to me, is the pop* methods are mutators that are loosey-goosey and might not mutate.  Aside from JavaScript's Array.prototype.pop (which is even worse because undefined isn't a sentinel), I can't think of languages or libraries that have a similar method for lists.  Java and C#, for example, both have methods to remove the first/last element from a linked list, but they throw on empty lists.

http://docs.oracle.com/javase/6/docs/api/java/util/LinkedList.html
http://msdn.microsoft.com/en-us/library/ms132181.aspx

Using null to indicate wasn't-there might be convenient, but in the absence of a way to require the user to handle null/non-null both, it seems a bit error-prone.

Perhaps the problem is we should be providing some sort of iterator object for removing and acting upon each element, rather than requiring people to open-code it with getFirst()/getNext().  Luke's been adding iterators left and right in JS to abstract over this stuff, and it seems to work pretty nicely.
> Perhaps instead of getFirst and such, we should do it the way Gecko does it and have first() and 
> getFirst(): the former never returning null (maybe asserting), the latter maybe returning null.  
> Or maybe the former returning a reference, and the latter returning a nullable pointer.

I might be convinced by this if we could change a fair number of callers from getFirst() to first().  But if most people want getFirst() (and I'd expect they do, because most of the time getFirst() is part of a loop iterating over the list), then first() just adds confusion and potential for bugs.

(Note also that getFirst() is parallel to getNext(), but we can't have next() because there's no method to guard on to tell whether you're the end of the list, and adding such a method wouldn't add any value except allowing us to have next().)

> Aside from JavaScript's Array.prototype.pop (which is even worse because undefined isn't a 
> sentinel), I can't think of languages or libraries that have a similar method for lists.

How about Java?  It's called "pollFirst()".  :)

It's true that their popFirst() throws an exception if the list is empty, but so does their getFirst().

> Perhaps the problem is we should be providing some sort of iterator object for removing and 
> acting upon each element, rather than requiring people to open-code it with 
> getFirst()/getNext().  Luke's been adding iterators left and right in JS to abstract over this 
> stuff, and it seems to work pretty nicely.

Could you link me some code?  But of course popFirst() is used for more than just iteration, so I don't think this is the whole problem.

> The fundamental issue, the part that smells to me, is the pop* methods are mutators that are 
> loosey-goosey and might not mutate.

I'm not convinced that this is a source of confusion; at least, it was not confusing in any of the code that you touched in this patch.

More persuasive to me is the argument that the method should include an explicit assertion that the value you're getting is not null.  But perhaps explicit is better than implicit [1].

[1] http://www.python.org/dev/peps/pep-0020/
> More persuasive to me is the argument that the method should include an explicit assertion that 
> the value you're getting is not null.

erm, that should be an /implicit/ assertion.
(In reply to Justin Lebar [:jlebar] from comment #12)
> then first() just adds confusion and potential for bugs.

Little more than two getFirst() overloads causes.  The getFoo/foo naming convention helps.  Their having different return types also helps; no one could expect a method returning a reference to do anything sane on an empty list.  People may look at docs to figure out the difference -- but only once or twice.

As for users only needing one or the other behavior, I think this is mostly because we have few users so far, those mostly deriving from prclist usage.

> getFirst() is parallel to getNext()

getFirst() is a method on the list; getNext() is a method on individual elements.  The two cases are quite distinct, and not parallel at all -- except that prclist forced them to be parallel.

> It's true that their popFirst() throws an exception if the list is empty,
> but so does their getFirst().

Similar to how first() would work.

> Could you link me some code?  

Most places that have a "done()" in them in js/src are examples:

http://mxr.mozilla.org/mozilla-central/search?string=done%28%29&find=js%2Fsrc%2F&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central

> popFirst() is used for more than just iteration, so I don't think this is
> the whole problem.

True.

> I'm not convinced that this is a source of confusion; at least, it was not
> confusing in any of the code that you touched in this patch.

The point is to help readers of client code, by making a call to popFirst() imply something: that the list was non-empty before, and is smaller after.  No such implication is possible if popFirst() silently does nothing on empty lists.  Which actually goes back to the initial point, about confusion: I will willingly trade very slight confusion in writing code, if it helps with reading that code (which of course happens much more often).
http://hg.mozilla.org/integration/mozilla-inbound/rev/5bd42ae5efd2
Whiteboard: [Leave open]
Target Milestone: --- → mozilla19
> As for users only needing one or the other behavior, I think this is mostly because we 
> have few users so far, those mostly deriving from prclist usage.

Would you be OK coming back to first() when we have a compelling use-case, then?

> The point is to help readers of client code, by making a call to popFirst() imply 
> something: that the list was non-empty before, and is smaller after.  No such 
> implication is possible if popFirst() silently does nothing on empty lists.

I understand, but I don't think this speaks to my point, which is that I have not seen code which was confusing (to write or read) in this way with the existing popFirst().  I find your earlier argument that we want the method name to imply not-null more compelling, but again maybe explicit is better than implicit.

In both cases here I feel like we're prematurely optimizing the API, and we don't know enough about how it will actually be used.
One useful function to add to the API would be transferring list elements between lists:

http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/nsHostResolver.cpp#77

I don't know if it's really worth it to convert the bits in netwerk/, though; all that code has some fairly strong ties to NSPR anyway...
> One useful function to add to the API would be transferring list elements between lists:

I don't object to adding that one.
You need to log in before you can comment on or make changes to this bug.