Closed Bug 1325254 Opened 9 years ago Closed 8 years ago

optimize TimerThread data structures

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla55

Project Flags:

Performance Impact

high

Tracking Flags:

Tracking

Status

firefox55

---

fixed

People

(Reporter: bkelly, Assigned: bzbarsky)

References

(Blocks 1 open bug)

Details

Attachments

(8 files, 13 obsolete files)

P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 3.54 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.33 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P3 Sort TimerThread list as a binary heap. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.49 KB, patch		Details \| Diff \| Splinter Review
forget-wip 9 years ago Ben Kelly [:bkelly, not reviewing] 6.65 KB, patch		Details \| Diff \| Splinter Review
P4 Make nsITimer::Cancel() O(c). r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 9.79 KB, patch		Details \| Diff \| Splinter Review
P3 Sort TimerThread list as a binary heap. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.59 KB, patch	froydnj : feedback+	Details \| Diff \| Splinter Review
P4 Make nsITimer::Cancel() O(c). r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 10.92 KB, patch		Details \| Diff \| Splinter Review
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 4.15 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.96 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
3 Sort TimerThread list as a binary heap. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.31 KB, patch		Details \| Diff \| Splinter Review
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 4.13 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.95 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P3 Sort TimerThread list as a binary heap. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 6.31 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 7.14 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P5 Make nsITimer::Cancel() O(c). r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 10.09 KB, patch	froydnj : feedback+	Details \| Diff \| Splinter Review
P6 Improve nsTimerImpl locking and remove TimerThread::RemoveTimer(). r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 14.17 KB, patch		Details \| Diff \| Splinter Review
P6 Improve nsTimerImpl locking and remove TimerThread::RemoveTimer(). r=froydnj 9 years ago Ben Kelly [:bkelly, not reviewing] 9.18 KB, patch		Details \| Diff \| Splinter Review
wip 9 years ago Ben Kelly [:bkelly, not reviewing] 3.00 KB, patch		Details \| Diff \| Splinter Review
wip 9 years ago Ben Kelly [:bkelly, not reviewing] 11.62 KB, text/plain		Details
wip 9 years ago Ben Kelly [:bkelly, not reviewing] 11.63 KB, patch		Details \| Diff \| Splinter Review
Part 5 updated to review comments 9 years ago Boris Zbarsky [:bzbarsky] 10.60 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review

Ben Kelly [:bkelly, not reviewing]

Reporter

Description

•

9 years ago

As a side project I'd like to investigate optimizing TimerThread a bit. I believe its data structures are a bit sub-optimal at the moment. For example, the list of pending timers is held in an nsTArray<>. The next timer to run is at index 0. This means that executing and remove the next timer is guaranteed to be an O(n) operation. I did some measurements of timer removal on my 2013 MBP. Under relatively idle conditions the browser sees numbers like this: ### ### TimerThread::RemoveTimerInternal() 34 entries took 0.056 us ### ### TimerThread::RemoveTimerInternal() 34 entries took 0.059 us ### ### TimerThread::RemoveTimerInternal() 35 entries took 0.168 us ### ### TimerThread::RemoveTimerInternal() 34 entries took 0.075 us ### ### TimerThread::RemoveTimerInternal() 14 entries took 2.228 us ### ### TimerThread::RemoveTimerInternal() 14 entries took 2.231 us ### ### TimerThread::RemoveTimerInternal() 14 entries took 1.998 us Under heavy timer load, however, we can see values like this: ### ### TimerThread::RemoveTimerInternal() 53973 entries took 25.409 us ### ### TimerThread::RemoveTimerInternal() 53972 entries took 33.673 us ### ### TimerThread::RemoveTimerInternal() 53972 entries took 29.870 us ### ### TimerThread::RemoveTimerInternal() 53972 entries took 18.971 us ### ### TimerThread::RemoveTimerInternal() 53971 entries took 20.727 us ### ### TimerThread::RemoveTimerInternal() 53970 entries took 21.273 us This is somewhat significant since these removals are performed while holding a lock. This is a lock that the main thread also acquires to insert timers into the list. Also, timers have been shown in general to effect battery performance. Burning extra CPU cycles in timer overhead does not help us in battery showdowns with other browsers. It would be nice to be more efficient here.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 1

•

9 years ago

I believe the generally accepted way to build an efficient timer list is to use a binary heap. My initial plan is to try that. The easiest way to do this is to use something like std::vector with the standard c++ heap algorithms: std::make_heap - http://en.cppreference.com/w/cpp/algorithm/make_heap std::push_heap - http://en.cppreference.com/w/cpp/algorithm/push_heap std::pop_heap - http://en.cppreference.com/w/cpp/algorithm/pop_heap This makes insertion and removal O(log n). For arbitrary cancellations I plan to simply leave an entry in the list, but clear its timer reference or otherwise mark it as a no-op.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 2

•

9 years ago

Actually, it seems we should be able to use nsTArray iterators with the std::make_heap() and friends.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 3

•

9 years ago

Attached patch P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 4

•

9 years ago

Attached patch P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj (obsolete) — Details — Splinter Review

These are just some initial refactor patches. I ran out of steam before getting to the heap sorting tonight.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 5

•

9 years ago

Attached patch P3 Sort TimerThread list as a binary heap. r=froydnj (obsolete) — Details — Splinter Review

This makes the case of deleting the first entry in the list take ~1us even under heavy timer load. The case of canceling a timer is still a problem, though, because we have to iterate the list. I'm working on patches to fix that.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 6

•

9 years ago

Attached patch forget-wip (obsolete) — Details — Splinter Review

This is a work-in-progress patch to try to fix the issue with cancelations taking a long time. It tries to add a back channel pointer from nsTimerImpl to the Entry so we can just directly tell the Entry to drop its ref. There is some bug, however, as its currently crashing. I guess probably when the nsTArray is realloc'd and all the pointers move.

Ben Kelly [:bkelly, not reviewing]

Reporter

Updated

•

9 years ago

Attachment #8821019 - Attachment is patch: true

Benjamin Smedberg

Comment 7

•

9 years ago

Argh, let's just get rid of the timer thread.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 8

•

9 years ago

(In reply to Benjamin Smedberg [:bsmedberg] from comment #7) > Argh, let's just get rid of the timer thread. Using OS timer APIs would be even better yes, but implementing that for all platforms will take a larger effort. I expect there to be compat issues across OS versions, etc. (I thought perhaps cross-platform compat was a reason we had the timer thread in the first place.) Also we will always want to have a cross-platform fallback in the timer thread.

Benjamin Smedberg

Comment 9

•

9 years ago

I'm not convinced we need a platform-specific timer impl. The real effect of a timer is that it wakes up the event loop at a particular time. If when we waitForEvent we set a timeout at the next time a timer is scheduled, we'll just wake up and we can then notice that it's time to fire the timer. No extra runnables are necessary.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 10

•

9 years ago

That sounds reasonable, but also a rather large change. Even if we implemented that, though, it would not remove the need for the data structure I am modifying here. We still need a list of scheduled timers in a data structure that can efficiently tell us the next deadline. We also need this list to support arbitrary cancellation. So I still would like to implement the binary heap here. Happy to see a separate bug about removing Timer thread.

Nathan Froyd [:froydnj]

Comment 11

•

9 years ago

(In reply to Ben Kelly [Food Coma / PTO / back Jan 3 :bkelly] from comment #10) > So I still would like to implement the binary heap here. Happy to see a > separate bug about removing Timer thread. We have a separate bug about removing the timer thread (or rewriting timers entirely, I forget how it was phrased), but I can't find it atm...

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 12

•

9 years ago

I still think native timers might be worth it. Its not clear how well gecko-level timer solutions work with mobile devices that enter low-power modes. The native timers are probably more battery efficient and also more accurate when waking from low-power mode. That's just a theory, though.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 13

•

9 years ago

Attached patch P4 Make nsITimer::Cancel() O(c). r=froydnj (obsolete) — Details — Splinter Review

This stops the crashing and should make Cancel() efficient. I need to rebuild without DEBUG to evaluate the perf impact.

Attachment #8821019 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 14

•

9 years ago

So, I tried running these patches combined with bug 1336598 and bug 1336822 on this test: https://people-mozilla.org/~bkelly/timer-flood/index.html These patches let us execute about 5x more timers in the same amount of time. Without these patches we execute about 3000 timers per second. With these patches we execute 15k to 20k timers per second. Not that we need to optimize for throughput during a timer flood, but it demonstrates a significant reduction in overhead from managing the timer list.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 15

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c5bee010bdad3e7ead3c1598780c6aaba40db0a

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 16

•

9 years ago

Attached patch P3 Sort TimerThread list as a binary heap. r=froydnj (obsolete) — Details — Splinter Review

Updated patch to not return -1 in a bool returning method. Also, I made AppendElement() when inserting a timer fallible again. https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ade033a0b28c9dff28ce97505bdc58140700a65

Attachment #8821016 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 17

•

9 years ago

Attached patch P4 Make nsITimer::Cancel() O(c). r=froydnj (obsolete) — Details — Splinter Review

Add some comments to try to document the structure.

Attachment #8833677 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 18

•

9 years ago

Comment on attachment 8820998 [details] [diff] [review] P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj This is a refactor patch in preparation for later patches. It basically makes TimerThread use RefPtr<> in the mTimers list instead of manually calling AddRef()/Release().

Attachment #8820998 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 19

•

9 years ago

Comment on attachment 8820999 [details] [diff] [review] P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj This modifies the mTimers list to use an Entry struct instead of storing nsTimerImpl refs directly. The purpose here is to let us remove the nsTimerImpl ref on cancel without modifying the list itself. This is important for a couple reasons: 1) It gets us closer to our goal of O(c) operations because dropping the ref is constant time. 2) The heap data structure only allows accessing the item at the front of the list. Operations involve moving the front item to the back and calling std::pop_heap(). To append to the heap you add an item to the end of the list and then use std::push_heap(). It does not support removing an item from the middle of the heap. The main tradeoff here is we must burn through any empty canceled Entry structures during our TimerThread run loop. This is a minimal cost, though, compared to removing entries in the middle of a large sorted data structure.

Attachment #8820999 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 20

•

9 years ago

Comment on attachment 8833798 [details] [diff] [review] P3 Sort TimerThread list as a binary heap. r=froydnj This patch modifies the mTimers list to be a binary heap data structure as defined by std::push_heap() and std::pop_heap().

Attachment #8833798 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 21

•

9 years ago

Comment on attachment 8833800 [details] [diff] [review] P4 Make nsITimer::Cancel() O(c). r=froydnj This patch addresses the last O(n) operation. In order to drop the nsTimerImpl ref from the mTimers Entry struct we still need to iterate over the mTimers list to find it. This patch modifies nsTimerImpl to have a reference directly to a Holder object. It can then inform that Holder to forget the timer object. The mTimers Entry owns a Holder object instead of an nsTimerImpl. So in nsTimerImpl Cancel() can simply ask the Holder to forget it without any iteration of the list. Note, we must use a separate Holder object instead of the Entry structure itself. Since the Entry are stored in the nsTArray by value they can be moved by realloc. Therefore we store a separate allocated Holder that lives at a stable place in memory.

Attachment #8833800 - Flags: review?(nfroyd)

Nathan Froyd [:froydnj]

Comment 22

•

9 years ago

Comment on attachment 8820998 [details] [diff] [review] P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj Review of attachment 8820998 [details] [diff] [review]: ----------------------------------------------------------------- Thank you. Less NS_ADDREF/RELEASE cannot be a bad thing.

Attachment #8820998 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Comment 23

•

9 years ago

Comment on attachment 8820999 [details] [diff] [review] P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj Review of attachment 8820999 [details] [diff] [review]: ----------------------------------------------------------------- I think this makes sense. ::: xpcom/threads/TimerThread.cpp @@ +525,5 @@ > > + // skip any canceled timers > + while(!mTimers.IsEmpty() && !mTimers[0].mTimerImpl) { > + mTimers.RemoveElementAt(0); > + } I think it'd be better to write this loop as: // See how many timers at the front of the list have been canceled. size_t firstActive = 0; while (firstActive < mTimers.Length() && !mTimers[firstActive].mTimerImpl) { firstActive++; } if (firstActive != 0) { mTimers.RemoveElementsAt(0, firstActive); } I see that it's going to be replaced in later patches, but we should try to use the "biggest" possible calls into nsTArray.

Attachment #8820999 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Comment 24

•

9 years ago

Nathan Froyd [:froydnj]

Comment 25

•

9 years ago

Comment on attachment 8833798 [details] [diff] [review] P3 Sort TimerThread list as a binary heap. r=froydnj Review of attachment 8833798 [details] [diff] [review]: ----------------------------------------------------------------- Some discussion required, see below. I don't think the idea is bad, though. ::: xpcom/threads/TimerThread.cpp @@ +470,1 @@ > } Same comments on this loop apply; we should remove things from the heap and then delete a range from the array. @@ +530,1 @@ > } Here too. ::: xpcom/threads/TimerThread.h @@ +91,5 @@ > > + Entry(Entry&& aOther) > + : mTimeout(aOther.mTimeout) > + , mTimerImpl(mozilla::Move(aOther.mTimerImpl)) > + { } These are just the default behavior, right? If we really want these, can we just say |= default;|? @@ +104,5 @@ > bool operator<(const Entry& aRight) const > { > + // Reverse logic since we are inserting into a max heap > + // that sorts the "largest" value to index 0. > + return mTimeout > aRight.mTimeout; Can we just make this a function object that we pass in to any call to pop_heap or similar? That would make the behavior a little more obvious at the call sites.

Attachment #8833798 - Flags: review?(nfroyd) → feedback+

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 26

•

9 years ago

(In reply to Nathan Froyd [:froydnj] from comment #25) > ::: xpcom/threads/TimerThread.cpp > @@ +470,1 @@ > > } > > Same comments on this loop apply; we should remove things from the heap and > then delete a range from the array. I could make a separate RemoveCanceledTimers() method that does its own internal std::pop_heap and RemoveElementsAt(). I am hesitant to try to track a separate "last valid item" in mTimers in order to support a state where the array is not fully sorted. > @@ +104,5 @@ > > bool operator<(const Entry& aRight) const > > { > > + // Reverse logic since we are inserting into a max heap > > + // that sorts the "largest" value to index 0. > > + return mTimeout > aRight.mTimeout; > > Can we just make this a function object that we pass in to any call to > pop_heap or similar? That would make the behavior a little more obvious at > the call sites. I can do this. It would also allow me to make the Entry objects heap allocated instead of stored in the nsTArray by value. This would avoid the need for a separate Holder class in the P4 patch.

Ben Kelly [:bkelly, not reviewing]

Reporter

Updated

•

9 years ago

Attachment #8833800 - Flags: review?(nfroyd)

Nathan Froyd [:froydnj]

Comment 27

•

9 years ago

Comment on attachment 8833800 [details] [diff] [review] P4 Make nsITimer::Cancel() O(c). r=froydnj Review of attachment 8833800 [details] [diff] [review]: ----------------------------------------------------------------- Clearing review, since comments on previous patches will change this a bit. ::: xpcom/threads/TimerThread.h @@ +143,5 @@ > > Entry(const TimeStamp& aMinTimeout, const TimeStamp& aTimeout, > nsTimerImpl* aTimerImpl) > + : mTimeout(std::max(aMinTimeout, aTimeout)) > + , mHolder(new Holder(aTimerImpl)) mozilla::MakeUnique<Holder>(aTimerImpl), please. @@ +162,5 @@ > } > > + ~Entry() > + { > + } This could just be |= default;|? ::: xpcom/threads/nsTimerImpl.h @@ +43,5 @@ > +class nsTimerImplHolder > +{ > +public: > + virtual void > + Forget(nsTimerImpl* aTimerImpl) = 0; AFAICT, we only have one inheritor of this interface, so maybe it doesn't need to be virtual?

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 28

•

9 years ago

Attached patch P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj (obsolete) — Details — Splinter Review

Updated this patch to remove ReleaseTimerInternal() declaration from the header. I removed the implementation but forgot to nuke the declaration before.

Attachment #8820998 - Attachment is obsolete: true

Attachment #8834204 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 29

•

9 years ago

Attached patch P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj (obsolete) — Details — Splinter Review

Address review feedback adding a new method: TimerThread::RemoveLeadingCanceledTimersInternal() This does the bulk remove using RemoveElementsAt().

Attachment #8820999 - Attachment is obsolete: true

Attachment #8834205 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 30

•

9 years ago

Attached patch 3 Sort TimerThread list as a binary heap. r=froydnj (obsolete) — Details — Splinter Review

This adds the bulk removal of canceled timers to the heap sorting patch. I am going to do the separate comparator function in a separate patch.

Attachment #8833798 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 31

•

9 years ago

I need to rebase on bug 1328643.

Depends on: 1328643

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 32

•

9 years ago

Attached patch P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj — Details — Splinter Review

Rebase.

Attachment #8834204 - Attachment is obsolete: true

Attachment #8834245 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Reporter

Updated

•

9 years ago

Attachment #8834245 - Attachment description: bug1325254_p1_timerrefptrlist.patch → P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 33

•

9 years ago

Attached patch P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj — Details — Splinter Review

Rebase

Attachment #8834205 - Attachment is obsolete: true

Attachment #8834246 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 34

•

9 years ago

Attached patch P3 Sort TimerThread list as a binary heap. r=froydnj — Details — Splinter Review

Rebase.

Attachment #8834216 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 35

•

9 years ago

Attached patch P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj — Details — Splinter Review

This patch moves the Entry to its own dynamically allocated structure with a separate comparator func.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 36

•

9 years ago

Attached patch P5 Make nsITimer::Cancel() O(c). r=froydnj — Details — Splinter Review

Implement the Holder strategy using the dynamic Entry objects.

Attachment #8833800 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 37

•

9 years ago

The rebased patches seem to leak. I must have missed some change in bug 1328643 that I need to account for. I'm too tired to figure it out for now. Lets see what try says overnight: ttps://treeherder.mozilla.org/#/jobs?repo=try&revision=9f4bcd6eb09be329a4ecc3afe124fab8638c56ba

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 38

•

9 years ago

Actually, it seems its not leaking. The memory does go away when I stop the timer flood and force memory minimization.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 39

•

9 years ago

We do seem to have more lock contention after bug 1328643. I may need to tune the setTimeout() back pressure mechanism again.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 40

•

9 years ago

Attached patch P6 Improve nsTimerImpl locking and remove TimerThread::RemoveTimer(). r=froydnj (obsolete) — Details — Splinter Review

This patch removes the need to lock the TimerThread mutex when canceling an nsTimerImpl. This is no longer necessary now that we can just call mHolder->Forget(). This does remove a wakeup of TimerThread that was in RemoveTimer(), but I think thats ok. Our strategy is to leave canceled timers in the mTimers heap until they are drained by the normal timer thread operation. So no need to immediately wakeup here. Also, this patch fixes some locking around mHolder. This was probably unsafe in the P5 patch. Unfortunately this did require switching to a ReentrantMonitor in nsTimerImpl. We can call SetHolder() from a number of paths. Some already hold the lock and others don't. Using the ReentrantMonitor was the only reasonable way I could see to make this work.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 41

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a78673864d177cd82b96451a50ae448cca2bb3f

Nathan Froyd [:froydnj]

Comment 42

•

9 years ago

(In reply to Ben Kelly [:bkelly] from comment #40) > Also, this patch fixes some locking around mHolder. This was probably > unsafe in the P5 patch. Unfortunately this did require switching to a > ReentrantMonitor in nsTimerImpl. We can call SetHolder() from a number of > paths. Some already hold the lock and others don't. Using the > ReentrantMonitor was the only reasonable way I could see to make this work. I haven't looked at the patch, but: 1. Yay for locking fixes; 2. Boo for ReentrantMonitor. I really do not want to start using one of these in the timer code; it's less efficient compared to a plain Mutex and we really ought to be able to do better.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 43

•

9 years ago

Hmm, but the P6 removal of the TimerThread mMutex lock now allows us to cancel a timer in the middle of its processing. I'll have to think about this more.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 44

•

9 years ago

(In reply to Nathan Froyd [:froydnj] from comment #42) > 2. Boo for ReentrantMonitor. I really do not want to start using one of > these in the timer code; it's less efficient compared to a plain Mutex and > we really ought to be able to do better. Well, a lot of this is due to the incestuous nature between nsTimerImpl and TimerThread. They call back and forth between themselves in cycles so much its hard to be clear about what will re-enter or not. Anyway, I have to think about the P6 patch more anyways.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 45

•

9 years ago

I'll need to come back to this later tonight or this week. So I can remember, here is the stack for the comment 43 issue: * thread #10: tid = 0x188bed, 0x000000010245af1b XUL`TimerThread::Run() [inlined] RefPtr<nsTimerImpl>::get() const + 4 at RefPtr.h:283, name = 'Timer', stop reason = EXC_BAD_ACCESS (code=1, address=0x28) * frame #0: 0x000000010245af1b XUL`TimerThread::Run() [inlined] RefPtr<nsTimerImpl>::get() const + 4 at RefPtr.h:283 [opt] frame #1: 0x000000010245af17 XUL`TimerThread::Run() [inlined] RefPtr<nsTimerImpl>::operator nsTimerImpl*() const & at RefPtr.h:296 [opt] frame #2: 0x000000010245af17 XUL`TimerThread::Run() [inlined] TimerThread::Entry::Value(this=<unavailable>) const at TimerThread.h:118 [opt] frame #3: 0x000000010245af17 XUL`TimerThread::Run(this=<unavailable>) + 231 at TimerThread.cpp:465 [opt] frame #4: 0x0000000102465abb XUL`nsThread::ProcessNextEvent(this=<unavailable>, aMayWait=<unavailable>, aResult=0x00007000004abdb7) + 1067 at nsThread.cpp:1261 [opt] frame #5: 0x0000000102464ba3 XUL`NS_ProcessNextEvent(aThread=<unavailable>, aMayWait=true) + 51 at nsThreadUtils.cpp:389 [opt] frame #6: 0x00000001028afd3a XUL`mozilla::ipc::MessagePumpForNonMainThreads::Run(this=<unavailable>, aDelegate=<unavailable>) + 266 at MessagePump.cpp:368 [opt] frame #7: 0x0000000102883c19 XUL`MessageLoop::Run() [inlined] MessageLoop::RunInternal(this=<unavailable>) + 73 at message_loop.cc:238 [opt] frame #8: 0x0000000102883c0a XUL`MessageLoop::Run() [inlined] MessageLoop::RunHandler() at message_loop.cc:231 [opt] frame #9: 0x0000000102883c0a XUL`MessageLoop::Run(this=<unavailable>) + 58 at message_loop.cc:211 [opt] frame #10: 0x0000000102463cff XUL`nsThread::ThreadFunc(aArg=<unavailable>) + 351 at nsThread.cpp:494 [opt] frame #11: 0x000000010213b7ea libnss3.dylib`_pt_root(arg=0x0000000100735790) + 218 at ptthread.c:216 [opt] frame #12: 0x00007fff840c199d libsystem_pthread.dylib`_pthread_body + 131 frame #13: 0x00007fff840c191a libsystem_pthread.dylib`_pthread_start + 168 frame #14: 0x00007fff840bf351 libsystem_pthread.dylib`thread_start + 13

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 46

•

9 years ago

To fix the crash I think we will need to have some way of doing: "get the first valid timer out of mTimers and return it to me while holding the nsTimerImpl lock" We can then operate on the nsTimerImpl while holding its lock and know that it won't get canceled out from under us. My idea for doing this would look something like this: nsTimerImpl::LockedRef firstTimer; RemoveLeadingCanceledTimers(firstTimer); if (!firstTimer) { // no active timers } else { // There is an active timer, firstTimer has a strong ref to it, and we hold its lock. // When firstTimer goes out of scope it unlocks the timer and drops its strong ref to // the timer. }

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 47

•

9 years ago

Attached patch P6 Improve nsTimerImpl locking and remove TimerThread::RemoveTimer(). r=froydnj — Details — Splinter Review

This gets rid of the ReentrantMonitor from the previous version of the path. I fixed the issue with reentrant locking by exposing a separate SetHolderLocked() method for paths we know are called with the lock already held. I still need to sort out locking within the Entry object here, though. This patch will crash periodically in its current form.

Attachment #8834408 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 48

•

9 years ago

Attached patch wip (obsolete) — Details — Splinter Review

Work-in-progress on my LockedRef concept.

Ben Kelly [:bkelly, not reviewing]

Reporter

Updated

•

9 years ago

Attachment #8838900 - Attachment is patch: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 49

•

9 years ago

Attached file wip (obsolete) — Details

Another work-in-progress patch. This should solve the crashing issues, but probably needs some cleanup. https://treeherder.mozilla.org/#/jobs?repo=try&revision=4f153123c10ec52838321c724d10e4491dce2e35

Attachment #8838900 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 50

•

9 years ago

(In reply to Nathan Froyd [:froydnj] from comment #27) > > + ~Entry() > > + { > > + } > > This could just be |= default;|? The destructor is populated with code in later patches in this bug, so I did not bother to mark it `= default` in this patch. > > + virtual void > > + Forget(nsTimerImpl* aTimerImpl) = 0; > > AFAICT, we only have one inheritor of this interface, so maybe it doesn't > need to be virtual? So, I really dislike circular references between classes using the friend keyword. I think having a virtual method is a small price to pay for having a cleaner interface between TimerThread and nsTimerImpl. If I could separate them more, I would.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 51

•

9 years ago

Comment on attachment 8834247 [details] [diff] [review] P3 Sort TimerThread list as a binary heap. r=froydnj This patch has been relatively stable and offers an incremental improvement. It sorts the timer list in a binary heap so insertions and removals are O(log n). I tried to address previous review feedback by using bulk array removals, etc.

Attachment #8834247 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 52

•

9 years ago

Comment on attachment 8834248 [details] [diff] [review] P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj This is a refactoring patch that makes the timer list store UniquePtr<Entry> instead of Entry value objects. This is necessary for the follow-up P5 patch where we need a stable address location for the Entry object.

Attachment #8834248 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 53

•

9 years ago

Comment on attachment 8834249 [details] [diff] [review] P5 Make nsITimer::Cancel() O(c). r=froydnj This patch makes cancellation O(c). Instead of iterating through the timer list looking for the Entry to clear, we now can directly clear the Entry through a callback pointer from the nsTimerImpl itself. Note, this is still not ideal since we lock the entire list by calling through TimerThread::RemoveTimer(). This protects the Entry datastructure from multi-thread access, but its a very course grained lock. The P6 patch will attempt to improve this locked. Or perhaps we could improve the locking in a separate follow-up bug. For now, though, this patch does offer an improvement by reducing the algorithmic complexity involved.

Attachment #8834249 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 54

•

9 years ago

Try build for update to P5 patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=aca304e4a17a2ca63275a47fd1ff8b6e6e182ee2

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 55

•

9 years ago

Attached patch wip — Details — Splinter Review

Try build including the new locking fixes: https://treeherder.mozilla.org/#/jobs?repo=try&revision=92fb716d471c0fa0ed2cc1b88da782a50dbf05c2

Ben Kelly [:bkelly, not reviewing]

Reporter

Updated

•

9 years ago

Attachment #8839013 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 56

•

9 years ago

Yea, I think the P6 patch is a step to far. But the changes up to the P5 patch seem stable and are still an improvement.

Nathan Froyd [:froydnj]

Updated

•

9 years ago

Attachment #8834247 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Updated

•

9 years ago

Attachment #8834248 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Comment 57

•

9 years ago

Comment on attachment 8834249 [details] [diff] [review] P5 Make nsITimer::Cancel() O(c). r=froydnj Review of attachment 8834249 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for the delay. I think I thought this patch was harder to review that it actually was. I like where this is going, just one concern below. ::: xpcom/threads/nsTimerImpl.h @@ +44,5 @@ > +class nsTimerImplHolder > +{ > +public: > + virtual void > + Forget(nsTimerImpl* aTimerImpl) = 0; I'm not wild about having an abstract class that is essentially an implementation detail of a single client (TimerThread). What's wrong with simply forward-declaring nsTimerImplHolder here, and then leaving the implementation details up to TimerThread.{h,cpp}? You might want to rename it at that point, since it'd have to be the name of the concrete class used in TimerThread, but otherwise there shouldn't be any issues, since nsTimerImpl doesn't actually manipulate nsTimerImplHolder objects in any interesting way. Ah, I guess P6 does some manipulation of the holder from within nsTimerImpl. Well, OK, we could be gross and say: class nsTimerImplHolder { public: void Forget(nsTimerImpl*); }; // In some .cpp: void nsTimerImplHolder::Forget(nsTimerImpl* aTimerImpl) { static_cast<SUBCLASS*>(this)->Forget(aTimerImpl); } Which is a little gross, but at least we're not paying the overhead of the vtable, virtual method calls, etc. etc.

Attachment #8834249 - Flags: review?(nfroyd) → feedback+

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

9 years ago

Blocks: 1354616

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

9 years ago

Flags: needinfo?(bzbarsky)

Whiteboard: [qf:p1]

Nathan Froyd [:froydnj]

Comment 58

•

9 years ago

Discovered today that the patch set requires massive rebasing. :(

Boris Zbarsky [:bzbarsky]

Assignee

Comment 59

•

9 years ago

Attached patch Part 5 updated to review comments — Details — Splinter Review

Attachment #8859749 - Flags: review?(nfroyd)

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

9 years ago

Assignee: bkelly → bzbarsky

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

9 years ago

Flags: needinfo?(bzbarsky)

Nathan Froyd [:froydnj]

Comment 60

•

9 years ago

Comment on attachment 8859749 [details] [diff] [review] Part 5 updated to review comments Review of attachment 8859749 [details] [diff] [review]: ----------------------------------------------------------------- Thanks!

Attachment #8859749 - Flags: review?(nfroyd) → review+

Pulsebot

Comment 61

•

9 years ago

Pushed by bzbarsky@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/4d5f549491e7 P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/29679296c946 P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/a501885ead35 P3 Sort TimerThread list as a binary heap. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/4b6b6aad7688 P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/030533bed090 P5 Make nsITimer::Cancel() O(c). r=froydnj

Carsten Book [:Tomcat]

Comment 62

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/4d5f549491e7 https://hg.mozilla.org/mozilla-central/rev/29679296c946 https://hg.mozilla.org/mozilla-central/rev/a501885ead35 https://hg.mozilla.org/mozilla-central/rev/4b6b6aad7688 https://hg.mozilla.org/mozilla-central/rev/030533bed090

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox55: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla55

Florian Bender

Updated

•

8 years ago

Blocks: 1393744

Daniel Veditz [:dveditz]

Updated

•

8 years ago

Depends on: 1383019

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

4 years ago

Performance Impact: --- → P1

Whiteboard: [qf:p1]

Gerald Squelart (he/him) (not at Mozilla since 2022-09-15)

Updated

•

3 years ago

Blocks: 1787974

You need to log in before you can comment on or make changes to this bug.