Closed Bug 715405 Opened 13 years ago Closed 13 years ago

Add a type-safe linked list class


(Core :: MFBT, defect)

Not set



Tracking Status
firefox12 --- fixed


(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)



(Keywords: dev-doc-complete, Whiteboard: [qa-])


(1 file, 2 obsolete files)

PRCList is a huge pain to use.  We could do a lot better with a templated class.

I have code written, but I need to test it out.
Blocks: 715308
OS: Linux → All
Hardware: x86_64 → All
Assignee: nobody → justin.lebar+bug
Assuming that this can live in MFBT
Component: General → MFBT
QA Contact: general → mfbt
Attached patch Patch v1 (obsolete) — Splinter Review
Comment on attachment 586555 [details] [diff] [review]
Patch v1

I feel like this is something khuey can review, but I'd like the blessing of a mfbt peer.  Waldo?
Attachment #586555 - Attachment description: Bug 715405 - Add a type-safe linked list class. → Patch v1
Attachment #586555 - Attachment filename: Bug-715405---Add-a-type-safe-linked-list-class.patch → patch
Attachment #586555 - Flags: review?(khuey)
Attachment #586555 - Flags: review?(jeff.walden)
Comment on attachment 586555 [details] [diff] [review]
Patch v1

>+        /* Check that the next/prev pointers match up. */
>+        LinkedListElement<T> *prev = this;
>+        LinkedListElement<T> *cur = this->mNext;
>+        do {
>+            MOZ_ASSERT(cur->mPrev = prev);
>+            MOZ_ASSERT(prev->mNext = cur);

Wait, should this be an assignment?
Ouch, no!
Whiteboard: [needs review]
Attachment #586555 - Flags: review?(jeff.walden) → review?(jwalden+bmo)
Blocks: 716523
Jeff pinged me on IRC about comparing this to the webkit LinkedList classes.

Jeff, I'm still happy to talk about this in realtime, but I wanted to address the concern that, because webkit has three linked list classes, we're probably going to need more than one ourselves, and therefore we shouldn't name this class "LinkedList" (and should perhaps design it differently?):

I guess I'm not particularly concerned about this.  For one thing, we currently have only one linked list in Gecko, PRClist, (at least, I'm not aware of any others), and prima facia we haven't desperately needed a different one.  (LinkedList<T> operates on the same principal as PRCList.)  For another, none of the three webkit classes enjoys widespread use according to my searches [1]:

 SinglyLinkedList: 1 use
 SentinelLinkedList: 3 uses
 DoublyLinkedList: 4 uses

(Most of these uses are in the JS heap management code.)

Also note that their singly linked list supports only push and pop; it's really more of a linked stack than a list.  :)

But more generally, I prefer to design iteratively, for the task at hand.  You Ain't Gonna Need It.  If we did need another linked list class, it wouldn't be so painful to rename the one we have here, right?

[1] (Google is taking down code search, so who knows how long this will be valid for.)
Comment on attachment 586555 [details] [diff] [review]
Patch v1

Review of attachment 586555 [details] [diff] [review]:

General comments for mfbt stuff:

When we've added stuff to mfbt, we've usually added at least one use of it.  That ensures the code builds, that the implementation works to some moderate extent, and that the spellings of things are reasonable in context.  No need to make everything that could use something, use it, but we want to at least know that it's reasonable for *someone*.  Please find one existing place that uses a circular linked list and make it use this.  The patch that depends on this kind of qualifies -- except that you can't see that the spellings are reasonable in context with a separate-bug patch.  :-\  Maybe it's not too big a problem.

We use camelCaps for method names, don't prefix member fields with "m" or arguments with "a", don't brace single-line if/loop bodies, and put * by the type name in |T* foo|.  We're not always consistent on all these points (I mean to clean it up at some point), but that's no reason to make things worse.  I'd like to hold currently-clean mfbt code to a standard as high as the JS standard, and given how we roll it's difficult to get consistency without enforcing it from the start.

::: mfbt/LinkedList.h
@@ +3,5 @@
> +
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at */
> +

mfbt files contain a summary comment describing what they implement, so that the source files can be the primary documentation for everything, and directory listings and to merely direct readers to the appropriate source file:

Please add one here.

Since we don't have much experience with this with the 2.0 header to know what'll cause MXR to still display those summary comments, you probably have to experiment with this across m-c merges and MXR source pickups until you figure out something that works.  :-\  Alternatively, just use the old header for now, and let the rewrite script figure it out.  (I would be so lazy if it were me.  :-) )

@@ +4,5 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at */
> +
> +#ifndef mozilla_LinkedList_h

It looks like the convention has been to have a trailing _ on such guards in this directory.

@@ +6,5 @@
> + * You can obtain one at */
> +
> +#ifndef mozilla_LinkedList_h
> +#define mozilla_LinkedList_h
> +

You should enclose this in #ifdef __cplusplus / #endif, in case a C/C++ dual-nature header ever wants to include this.  I wouldn't be surprised to see a public JS header using this eventually.

@@ +8,5 @@
> +#ifndef mozilla_LinkedList_h
> +#define mozilla_LinkedList_h
> +
> +/*
> + * == Usage Notes ==

The more common separation we've used for describing interfaces and implementations puts the interface description, uses, and so on before the class definition (as here), but it puts the implementation information within the definition.  And we haven't used == header == for separation in interface comments, usually.  I suspect they'd make it easier for descriptions to be less concise than possible for the level of informativeness we want, a bad tendency.

@@ +17,5 @@
> + * The class T which will be inserted into the linked list must inherit from
> + * LinkedListElement<T>.  A given object may be in only one linked list at a
> + * time.
> + *
> + * == Example ==

I think example code should generally implement some sensible concept, or at least make sense, in addition to just demonstrating the implemented API.  How about if you changed the example to be a list of observers, with addition, notification, and removal ops?  ElemType would become Observer, ElemContainer would become ObserverList, then Foo would become separate notify, add, and remove methods.  (add/remove could be simple one-liners that forward, which seems best for understandability.)

@@ +37,5 @@
> + *
> + *       aElem.Remove();
> + *     }
> + *
> + *     LinkedList<ElemType> mList;

Hyper-nitpick, but for readability, put the list at the top of the class (where it and its type will be seen first, at a scan), then put the methods in a public section following.  Although I guess it's not really nitpicky, because otherwise the class isn't usable at all...

@@ +45,5 @@
> + * == Implementation Notes ==
> + *
> + * Circular linked lists with a sentinel node are fast and easy to program, but
> + * they're not so easy to use; the sentinel node is of the same type as a list
> + * element, but you have to be sure never to cast the sentinel node to T*.

As a general rule, implementation notes like this should go inside the implementation, away from the interface description that serves as primary documentation for the class and its use.  The best place for this, it seems to me, is by the isSentinel member of LinkedListElement.

@@ +50,5 @@
> + *
> + * LinkedList and LinkedListElement provide a typesafe interface to a circular
> + * linked list.  Instead of returning the sentinel node and relying on the
> + * programmer not to cast it to T*, we return NULL to indicate that you've hit
> + * the end of the list.

This info should really go up in the first paragraph of the overall comment.  Although it should be rephrased to not use the "sentinel" jargon, simply saying that the ends of the lists are represented by NULL.  NULL's equally as clear as sentinels here, and it's more user-friendly.

@@ +66,5 @@
> +#include "mozilla/Assertions.h"
> +
> +namespace mozilla {
> +
> +template<class T>

I prefer using typename when a template type parameter won't necessarily be a class.  I know, it comes to the same thing, but might as well avoid the mental disconnect if possible...

@@ +84,5 @@
> +    /*
> +     * Get the next element in the list, or NULL if this is the last element in
> +     * the list.
> +     */
> +    T *GetNext()

Existing mfbt style would use camelCaps |T* getNext()| here.  Same for all the other methods.

@@ +99,5 @@
> +        return mPrev->Get();
> +    }
> +
> +    /*
> +     * Insert aElem before this element in the list.  |this| must be part of a

Hmm, so


isn't inserting elem before aElem, but rather aElem before elem?  Either way this could be interpreted seems easily confused with the other, and either way won't be obviously readable in the context of a use, far away from this description.

Gecko's frame class uses setNextSibling for this operation.  WebKit's DoublyLinkedList uses setNext and setPrev for these.  This naming style seems unambiguous; please use setNext and setPrevious (which I guess would just chain to setNext, or vice versa) instead.

@@ +122,5 @@
> +    }
> +
> +    /*
> +     * Remove this element from the list which contains it.  If this element is
> +     * not currently part of a linked list, this method does nothing.

Hmm, the latter behavior seems...complex.  What's your rationale for this over asserting that this is in a list?  I think we should assert in-listness in this method, and if a use case comes up later (that wouldn't be more clearly satisfied by a check in the caller), we can ease the requirement.

@@ +149,5 @@
> +     */
> +    friend class LinkedList<T>;
> +
> +    /* Only LinkedList objects should call this constructor. */
> +    LinkedListElement(bool aIsRoot)

bools as function arguments are difficult to read, so make this a protected enum { Sentinel }.  (There's only one caller with one constant value, so no need for an Element enum too.)  The member field can still be bool; the signature of the ctor called would indicate the right translation.

@@ +158,5 @@
> +    }
> +
> +    /*
> +     * Return |this| cast to T* if we're a LinkedListElement object, or return
> +     * NULL if we're a LinkedList object.

This isn't really the check you're making.  Rather, you're converting an element into either the T* it represents or the NULL a sentinel represents.  I was confused for a bit by the way you described this, until I'd worked through the implications of what I intuitively thought should happen in the body of this method (see the next comment).

Plausibly a better name for this would be toPointerOrNull?

@@ +164,5 @@
> +    T *Get()
> +    {
> +        if (!mIsRoot) {
> +            return static_cast<T*>(this);
> +        }

I think this would be more readable if the field were named for sentinel-ness than for root-ness.  And I would invert the check here to avoid a double-negative when reading:

  if (isSentinel)
    return NULL;
  return static_cast<T*>(this);

The intuition here should be that sentinel corresponds to null, and otherwise this corresponds to an element.  The extra negation makes that less clear to me.

@@ +189,5 @@
> +    /*
> +     * Insert aElem after this element, but don't check that this element is in
> +     * the list.  This is called by LinkedList::InsertFront().
> +     */
> +    void InsertAfterUnsafe(T *aElem)

There's really no difference between this->insertBeforeUnsafe(aElem) and aElem->insertAfterUnsafe(this), right?  It's just whether you want to talk about inserting after, or about inserting before.  There should definitely be only one implementation of inserting, not two, given it's a bit twisty to understand.

@@ +202,5 @@
> +    }
> +
> +    LinkedListElement *mNext;
> +    LinkedListElement *mPrev;
> +    const bool mIsRoot;

As it would be Bad for someone to copy an instance of this class because mutations to the copy/instance would totally screw up the instance/copy, please MOZ_DELETE the copy constructor and assignment operators in a private: section.

@@ +206,5 @@
> +    const bool mIsRoot;
> +};
> +
> +template<class T>
> +class LinkedList : private LinkedListElement<T>

Why private inheritance here, rather than composition?  Composition seems much preferable to me; otherwise the implementation has to keep in mind when it's calling a LinkedList method versus a method of an element, and the two method namespaces have to be careful not to trample on each other.

@@ +226,5 @@
> +    {
> +        /* 
> +         * InsertAfter asserts InList, which would fail if someone called
> +         * InsertFront on an empty list.  InsertAfterUnsafe skips the InList
> +         * assertion.

I'd just say /* Bypass the in-list assertion that InsertAfter would make. */,

Come to think of it, this is also going to skip the !inList() assert for aElem.  Or, hm, there's no actual such assert even in insertAfter.  Add it to the unsafe insertion methods?

@@ +259,5 @@
> +    /*
> +     * Get and remove the first element of the list.  If the list is empty,
> +     * return NULL.
> +     */
> +    T *PopFirst()

"removeFirst" seems a better name, same for Last.

@@ +293,5 @@
> +    /*
> +     * In a debug build, make sure that the list is sane (no cycles, consistent
> +     * next/prev pointers, only one root).  Has no effect in release builds.
> +     */
> +    void DebugAssertIsSane()

Hmm, and we encounter the problem of checking doubly-linked-list consistency.  Is it even possible to have inconsistency, if every insertion asserts that the element being inserted is asserted to not be in any other list?  I think it may not be...

@@ +308,5 @@
> +             slow = slow->mNext,
> +             fast1 = fast2->mNext,
> +             fast2 = fast1->mNext) {
> +
> +            MOZ_ASSERT(slow != fast1 && slow != fast2);

It's been our experience in the JS engine that having two assertion is better than just one, because should the assertion fail, you know more precisely which part was problematic.  (If it were just |slow != fast1| failing, it's possible the second half might also be failing, of course, but even still, two asserts provides strictly more information than one.)  Make this assertion, and the one in the loop below it, two.

@@ +335,5 @@
> +        LinkedListElement<T> *prev = this;
> +        LinkedListElement<T> *cur = this->mNext;
> +        do {
> +            MOZ_ASSERT(cur->mPrev = prev);
> +            MOZ_ASSERT(prev->mNext = cur);


@@ +342,5 @@
> +            cur = cur->mNext;
> +        } while (cur != this);
> +#endif
> +    }
> +};

Copy constructor and assignment operator should be deleted here too.

@@ +344,5 @@
> +#endif
> +    }
> +};
> +
> +} // namespace mozilla

/* namespace mozilla */ for any potential users that must be C-compatible and don't use -std=c99.
Attachment #586555 - Flags: review?(jwalden+bmo) → review-
> Gecko's frame class uses setNextSibling for this operation.  WebKit's DoublyLinkedList uses setNext 
> and setPrev for these.  This naming style seems unambiguous; please use setNext and setPrevious 
> (which I guess would just chain to setNext, or vice versa) instead.

WebKit::DoublyLinkedList::setNext() is not the same as mfbt::LinkedList::insertAfter at all!

template<typename T> inline void DoublyLinkedListNode<T>::setNext(T* next)
    static_cast<T*>(this)->m_next = next;

WebKit's DoublyLinkedList uses "append()", which I'd be fine with on the LinkedList class, but not on LinkedListElement.
> put * by the type name in |T* foo|

So mfbt is explicitly *not* SpiderMonkey style?  Or is the wiki wrong?
Ugh, I mis-skimmed on DoublyLinkedList::setNext.  I agree append doesn't make sense for individual elements.  setNext per nsIFrame's meaning of it, however, I think, does make sense and is unambiguous.

mfbt is its own special snowflake amalgam -- mostly but not entirely SpiderMonkey.  There hasn't been an iron fist of style applied to it quite yet, just what gets reviewed (not consistently)...
> "removeFirst" seems a better name [than popFirst()], same for Last.

I think

  T* foo = list.removeFirst()

isn't as clear as

  T* foo = list.popFirst()

since "pop" makes it explicit that you're going to get something and what you're going to get.  (It's conceivable that a [poorly-designed] list class might return the *new* list head from removeFirst().)  I have to imagine that the return value of (pop/remove)First will be used much more often than it's ignored.

I totally agree that insert{Before,After} aren't the best names, but I don't like "setNext", because it suggests that the function does what webkit's setNext function does, which insertAfter definitely *doesn't* do.  I'll try to think of a better name or see if I can find one.

The rest of the review comments look good to me; thanks for being so thorough, Jeff.
Whiteboard: [needs review]
Attachment #586555 - Flags: review?(khuey)
bz, Gecko's nsIFrame has setNextSibling for inserting one frame after another frame in a frame list.  Were any other names considered?

We're implementing a doubly linked list here, and we need names for inserting before/after an element in the list.  But insertBefore/insertAfter are a bit ambiguous about whether |a->insertAfter(b)| inserts a after b or b after a.

I tend to think setNext is at least not too bad -- which is hardly a ringing endorsement.  You have any better ideas left over from the frame list changes?
> Hmm, and we encounter the problem of checking doubly-linked-list consistency.  Is it even possible 
> to have inconsistency, if every insertion asserts that the element being inserted is asserted to not 
> be in any other list?  I think it may not be...

I was thinking DebugAssertIsSane() might be useful for checking that the list hasn't been corrupted, more than for checking that the LinkedList hasn't messed it up.
That's certainly reasonable.  I'm just wondering where we could actually use it, while not inducing overly-quadratic behaviors in any users.  Maybe only the client code gets to play with it, and the implementation itself is stuck not using it.  :-\
Upon reflection, I'm fine with setNext/setPrevious if we can't come up with anything better.
> There's really no difference between this->insertBeforeUnsafe(aElem) and 
> aElem->insertAfterUnsafe(this), right?

Except MOZ_ASSERT(!elem->isInList()), which is key.
I wonder if we want   LinkedList::setFirst() instead of LinkedList::insertFront(), so we're symmetrical to LinkedList::getFirst() in the way that LinkedList::setNext() is symmetrical to  LinkedList::getNext().

I admit that I prefer insertFront() in the absence of this analogy.
Attached patch Patch v2 (obsolete) — Splinter Review
Review: jwalden+bmo

I left popFirst and popLast, but I'm happy to change these to removeFirst and removeLast if you really think they're better.
Attachment #587925 - Flags: review?(jwalden+bmo)
Hm...git-bz usually works, and then sometimes, it does that!  Thanks, Josh.
Attachment #586555 - Attachment is obsolete: true
I forgot to take care of a few style nits; patch forthcoming.
Attached patch Patch v2.1Splinter Review
Nix a few braces.
Attachment #588040 - Flags: review?(jwalden+bmo)
Attachment #587925 - Attachment is obsolete: true
Attachment #587925 - Flags: review?(jwalden+bmo)
(In reply to Jeff Walden (remove +bmo to email) from comment #7)

> > +    T *GetNext()
> Existing mfbt style would use camelCaps |T* getNext()| here.  Same for all
> the other methods.

I think we should either

1) fix MFBT to conform to Gecko; or
2) fix Gecko to conform to MFBT.

The current mix is weird.
> Were any other names considered?

I was changing an existing singly-linked-list datastructure that already had a SetNextSibling.  I just left the name the same to avoid changing tons of existing callers.
I think that setFirst sounds like it would move the sentinel to between the element previous to the one calling setFirst and the one calling setFirst, making the old first and last be linked.
>FWIW, I hope review comments for the LinkedList class cause LinkedList's capitalization to be Gecko-style. :)
Comment on attachment 588040 [details] [diff] [review]
Patch v2.1

Review of attachment 588040 [details] [diff] [review]:

::: mfbt/LinkedList.h
@@ +41,5 @@
> +/* A type-safe doubly-linked list class. */
> +
> +#ifndef mozilla_LinkedList_h_
> +#define mozilla_LinkedList_h_
> +#ifdef __cplusplus

Nitpicky, but I'd prefer this ifdef start before the C++ stuff begins, so before |namespace mozilla|.

@@ +45,5 @@
> +#ifdef __cplusplus
> +
> +/*
> + * The classes LinkedList<T> and LinkedListElement<T> together form a
> + * convenient, type-safe doubly-linked list implementation.

Hmm.  Ordinarily we would have this comment by the class, and I think everything in mfbt does now.  But that would mean you'd have stuff like the forward-declares and stuff before it, here -- and moreover, an entire class as well.  That's a readability mess.  So this is a good deviation.  We should make the same change to all the other headers, for consistency.  (You don't need to do that here, but if you want to file the followup, that's cool.)

@@ +84,5 @@
> + *       for (Observer* o = list.getFirst();
> + *            o != NULL;
> + *            o = o->getNext()) {
> + *
> + *         o->Observe(topic);

Remove the blank line above this?

@@ +198,5 @@
> +    LinkedListElement(const LinkedList<T>& other) MOZ_DELETE;
> +
> +    friend class LinkedList<T>;
> +
> +    enum LinkedListNodeKind {

"LinkedList" is redundant with context here -- just "NodeKind".

@@ +214,5 @@
> +    /*
> +     * Return |this| cast to T* if we're a normal node, or return NULL if we're
> +     * a sentinel node.
> +     */
> +    T* AsTypeT()

asTypeT; I'd actually be inclined to just make it asT(), myself.

@@ +367,5 @@
> +             slow = slow->prev,
> +             fast1 = fast2->prev,
> +             fast2 = fast1->prev) {
> +
> +            MOZ_ASSERT(slow != fast1);

Remove this blank line too.
Attachment #588040 - Flags: review?(jwalden+bmo) → review+
> Nitpicky, but I'd prefer this ifdef start before the C++ stuff begins, so before |namespace mozilla|.

What do you mean?  Everything which appears before the ifdef __cplusplus is either a comment, a preprocesssor directive, or whitespace.
I'd leave the introductory comment and the #includes outside the #ifdef.  See, for example, what mfbt/GuardObjects.h does.
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
This should be mentioned on in the same style as the other mentions -- enough of a teaser to tell the reader what's there, then directing him to the actual file for full description.
Keywords: dev-doc-needed
Depends on: 722581
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.