Closed
Bug 1174625
Opened 9 years ago
Closed 9 years ago
Overhaul PLDHashTable's iterator
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla41
Tracking | Status | |
---|---|---|
firefox41 | --- | fixed |
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
Attachments
(1 file, 1 obsolete file)
27.01 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
Currently you have to use a while-loop with PLDHashTable::Iterator, because the operation for getting the current element and the operation that advances to the next element are fused. If they are split, as is typical, you can use a for-loop, which is usually nicer. In particular, in a for-loop version the scope of the Iterator is bound to the iteration loop, unlike a while-loop version where the scope extends further. This was causing problems for me in cases where the PLDHashTable was being modified or deleted just after the iteration loop.
Assignee | ||
Comment 1•9 years ago
|
||
This change splits PLDHashTable::Iterator::NextEntry() into two separate functions, which allow you to get the current element and advance the iterator separately, which means you can use a for-loop to iterate instead of a while-loop. As part of this change, the internals of PLDHashTable::Iterator were significantly changed and simplified (and modelled after js::HashTable's equivalent code). It's no longer duplicating code from PL_DHashTableEnumerator. The chaos mode code was a casualty of this, but given how unreliable that code has proven to be (see bug 1173212, bug 1174046) this is for the best. (We can reimplement chaos mode once PLDHashTable::Iterator is back on more solid footing again, if we think it's important.) All these changes will make it much easier to add an alternative Iterator that removes elements, which was turning out to be difficult with the prior code. In order to make the for-loop header usually fit on a single line, I deliberately renamed a bunch of things to have shorter names. In summary, you used to write this: PLDHashTable::Iterator iter(&table); while (iter.HasMoreEntries()) { auto entry = static_cast<FooEntry*>(iter.NextEntry()); // ... do stuff with |entry| ... } // iter's scope extends beyond here and now you write this: for (PLDHashTable::Iter iter(&table); !iter.Done(); iter.Next()) { auto entry = static_cast<FooEntry*>(iter.Get()); // ... do stuff with |entry| ... } // iter's scope doesn't reach here
Attachment #8622307 -
Flags: review?(nfroyd)
Comment 2•9 years ago
|
||
Comment on attachment 8622307 [details] [diff] [review] Overhaul PLDHashTable's iterator Review of attachment 8622307 [details] [diff] [review]: ----------------------------------------------------------------- Are you going to fix up the comm-central usages, too? ::: xpcom/glue/pldhash.cpp @@ +903,4 @@ > INCREMENT_RECURSION_LEVEL(mTable); > > + // Advance to the first live entry, or to the end if there are none. > + while (true) { It's too bad that this loop is ever-so-slightly different than the one in Next(); they ought to be unified into a private AdvanceToNextLiveEntry() function. That would make future support for chaos mode somewhat easier, I bet.
Attachment #8622307 -
Flags: review?(nfroyd) → review+
Comment 3•9 years ago
|
||
Alternate idea: instead of: PLDHashTable::Iter iter(&table) we could say that the canonical spelling is: auto iter = table.Iterator() which is slightly shorter and avoids the needs for nested types, which are kind of ugly. I'm not entirely sure we can avoid exposing the Iter type, but we might be able to avoid marking a lot of its members as public? And we could give it a non-shortened name, too. Sadly, something like: auto iter = table.RemovingIterator() is a little longer than the alternative: PLDHashTable::RIter iter(&table) but maybe that's not so bad? RIter's not really a great name. I'd expect RemovingIterator to be less common, so a little extra verbosity seems OK.
Flags: needinfo?(n.nethercote)
Assignee | ||
Comment 4•9 years ago
|
||
> Are you going to fix up the comm-central usages, too? Yes, I already have a patch, though I haven't yet filed a bug. > It's too bad that this loop is ever-so-slightly different than the one in > Next(); they ought to be unified into a private AdvanceToNextLiveEntry() > function. That would make future support for chaos mode somewhat easier, I > bet. Ok... I guess I can move this code: > if (mCurrent == mLimit) { > return; > } > PLDHashEntryHdr* entry = reinterpret_cast<PLDHashEntryHdr*>(mCurrent); > if (ENTRY_IS_LIVE(entry)) { > return; > } into an always-inline function called something like IsLimitOrLiveEntry(). (FWIW, in js::HashTable the loops look like this: > while (cur < end && !cur->isLive()) > ++cur; > > while (++cur < end && !cur->isLive()) > continue; It's more compact because js::HashTable is templated and so you know what the element type actually is.)
Flags: needinfo?(n.nethercote)
Assignee | ||
Comment 5•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #3) > Alternate idea: instead of: > > PLDHashTable::Iter iter(&table) > > we could say that the canonical spelling is: > > auto iter = table.Iterator() > > which is slightly shorter and avoids the needs for nested types, which are > kind of ugly. I'm not entirely sure we can avoid exposing the Iter type, > but we might be able to avoid marking a lot of its members as public? And > we could give it a non-shortened name, too. > > Sadly, something like: > > auto iter = table.RemovingIterator() > > is a little longer than the alternative: > > PLDHashTable::RIter iter(&table) > > but maybe that's not so bad? RIter's not really a great name. I'd expect > RemovingIterator to be less common, so a little extra verbosity seems OK. I chose the short names because it's much nicer if you can get the whole loop head on a single line; otherwise, the only other reasonable way (IMO) to format a for-loop is over three lines, one per piece. Something I've done in some follow-up patches is to use |iter| as the iterator name when possible, but shorten it to |i| if necessary to make it fit on a single line. Elsewhere in the codebase, our iterators have a mix of "Iter" and "Iterator" names. The problem with your suggestion is that it relies on a copy constructor and/or assignment operator. That's ok for |Iter|, but I really want to avoid it for |RIter| because it copying a removing iterator is potentially dangerous and it'll immediately trigger the checking code from bug 1131308. I can't see a good way around this. As for nested types, I considered using PLDHash{Iter,RIter} instead of PLDHashTable::{Iter,RIter} and I could still do that. So, my proposal: rename them to PLDHashIter and PLDHashRemovingIter, but otherwise keep the same initialization form?
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 6•9 years ago
|
||
> The problem with your suggestion is that it relies on a copy constructor and/or assignment operator.
> That's ok for |Iter|, but I really want to avoid it for |RIter| because it copying a removing iterator
> is potentially dangerous and it'll immediately trigger the checking code from bug 1131308. I can't
> see a good way around this.
To clarify this... one of the nice things about distinguishing the read-only iterator and the removing iterator is that I'll be able to plug the major hole in the Checker, which is that currently you don't know if a PL_DHashTableEnumerator() call is a read or a write operation until after the fact. With the iterators, a removing iterator can be tagged as a write operation as soon as it's created. So if we copy a removing iterator, we'll immediately have two concurrent write operations which will trigger Checker's assertions.
I wondered about using rvalue semantics to avoid this but it's not possible to return an rvalue from a function -- it'll cause a dangling reference.
Comment 7•9 years ago
|
||
We discussed this on IRC and decided that we can use the: auto iter = table.Iterator(); form so long as the iterator class has a move constructor. The copy constructor/assignment operators can be deleted. My understanding is that we're keeping Iter/RemovingIter for the class names, and keeping them as inner classes of PLDHashTable.
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 8•9 years ago
|
||
I'll put up a revised patch for you to re-review before landing.
Assignee | ||
Comment 9•9 years ago
|
||
Updated version. I ended up leaving the existing constructor public so you can still use the old style, because that's sometimes useful. (Specifically, in XPConnect there are a bunch of classes that basically wrap PLDHashTable and every one will need an iterator but that iterator is used only once or twice, so the new style ends up being more code than the older style.)
Attachment #8622954 -
Flags: review?(nfroyd)
Assignee | ||
Updated•9 years ago
|
Attachment #8622307 -
Attachment is obsolete: true
Comment 10•9 years ago
|
||
Comment on attachment 8622954 [details] [diff] [review] Overhaul PLDHashTable's iterator Review of attachment 8622954 [details] [diff] [review]: ----------------------------------------------------------------- r=me with something so we don't have the extra Move in release-mode builds. ::: xpcom/glue/pldhash.h @@ +336,5 @@ > > + // If you remove the Move() call here, the code will still work, but the > + // compiler is allowed to do "copy elision", avoiding the creation and > + // moving of the temporary object. This will mask any defects in the move > + // constructor, or code affected by action's of the move constructor. (A Nit: "code affected by actions" @@ +339,5 @@ > + // moving of the temporary object. This will mask any defects in the move > + // constructor, or code affected by action's of the move constructor. (A > + // specific example: this class's move constructor nulls |mTable|, so the > + // destructor must handle that, but with copy elision the nulling is > + // skipped.) So what we're trying to work around here is that if you invoke: auto iter = table.Iter(); we'll actually construct Iterator "in-place" and completely avoid the move constructor and subsequent destruction of the temporary object? And then we're not getting good test coverage? The compiler can clean up a lot of the junk this extra Move induces, but not all of it. And it seems a shame to slow things down just so our test coverage gets a little better. WDYT about: // Big hairy comment. #ifdef PLDHASH_TESTING Iterator Iter() const { return Iterator(this); } #else Iterator Iter() const { return mozilla::Move(Iterator(this)); } #endif And then #define PLDHASH_TESTING in TestPLDHash.cpp before anything else is included? Or maybe re-use DEBUG?
Attachment #8622954 -
Flags: review?(nfroyd) → review+
Comment 11•9 years ago
|
||
Alternatively, it may be possible to write test code that uses Iterator such that the move elision doesn't happen.
Assignee | ||
Comment 12•9 years ago
|
||
> Alternatively, it may be possible to write test code that uses Iterator such
> that the move elision doesn't happen.
Good idea. I'll do that.
Comment 15•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/d34e33ab92f1
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox41:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in
before you can comment on or make changes to this bug.
Description
•