Note: There are a few cases of duplicates in user autocompletion which are being worked on.

optimize TimerThread data structures

RESOLVED FIXED in Firefox 55

Status

()

Core
XPCOM
RESOLVED FIXED
7 months ago
3 months ago

People

(Reporter: bkelly, Assigned: bz)

Tracking

unspecified
mozilla55
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [qf:p1])

Attachments

(8 attachments, 13 obsolete attachments)

4.13 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
6.95 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
6.31 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
7.14 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
10.09 KB, patch
froydnj
: feedback+
Details | Diff | Splinter Review
9.18 KB, patch
Details | Diff | Splinter Review
wip
11.63 KB, patch
Details | Diff | Splinter Review
10.60 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
As a side project I'd like to investigate optimizing TimerThread a bit.  I believe its data structures are a bit sub-optimal at the moment.

For example, the list of pending timers is held in an nsTArray<>.  The next timer to run is at index 0.  This means that executing and remove the next timer is guaranteed to be an O(n) operation.

I did some measurements of timer removal on my 2013 MBP.  Under relatively idle conditions the browser sees numbers like this:

### ### TimerThread::RemoveTimerInternal() 34 entries took 0.056 us
### ### TimerThread::RemoveTimerInternal() 34 entries took 0.059 us
### ### TimerThread::RemoveTimerInternal() 35 entries took 0.168 us
### ### TimerThread::RemoveTimerInternal() 34 entries took 0.075 us
### ### TimerThread::RemoveTimerInternal() 14 entries took 2.228 us
### ### TimerThread::RemoveTimerInternal() 14 entries took 2.231 us
### ### TimerThread::RemoveTimerInternal() 14 entries took 1.998 us

Under heavy timer load, however, we can see values like this:

### ### TimerThread::RemoveTimerInternal() 53973 entries took 25.409 us
### ### TimerThread::RemoveTimerInternal() 53972 entries took 33.673 us
### ### TimerThread::RemoveTimerInternal() 53972 entries took 29.870 us
### ### TimerThread::RemoveTimerInternal() 53972 entries took 18.971 us
### ### TimerThread::RemoveTimerInternal() 53971 entries took 20.727 us
### ### TimerThread::RemoveTimerInternal() 53970 entries took 21.273 us

This is somewhat significant since these removals are performed while holding a lock.  This is a lock that the main thread also acquires to insert timers into the list.

Also, timers have been shown in general to effect battery performance.  Burning extra CPU cycles in timer overhead does not help us in battery showdowns with other browsers.  It would be nice to be more efficient here.
(Reporter)

Comment 1

7 months ago
I believe the generally accepted way to build an efficient timer list is to use a binary heap.  My initial plan is to try that.

The easiest way to do this is to use something like std::vector with the standard c++ heap algorithms:

  std::make_heap - http://en.cppreference.com/w/cpp/algorithm/make_heap
  std::push_heap - http://en.cppreference.com/w/cpp/algorithm/push_heap
  std::pop_heap - http://en.cppreference.com/w/cpp/algorithm/pop_heap

This makes insertion and removal O(log n).

For arbitrary cancellations I plan to simply leave an entry in the list, but clear its timer reference or otherwise mark it as a no-op.
(Reporter)

Comment 2

7 months ago
Actually, it seems we should be able to use nsTArray iterators with the std::make_heap() and friends.
(Reporter)

Comment 3

7 months ago
Created attachment 8820998 [details] [diff] [review]
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj
(Reporter)

Comment 4

7 months ago
Created attachment 8820999 [details] [diff] [review]
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj

These are just some initial refactor patches.  I ran out of steam before getting to the heap sorting tonight.
(Reporter)

Comment 5

7 months ago
Created attachment 8821016 [details] [diff] [review]
P3 Sort TimerThread list as a binary heap. r=froydnj

This makes the case of deleting the first entry in the list take ~1us even under heavy timer load.

The case of canceling a timer is still a problem, though, because we have to iterate the list.  I'm working on patches to fix that.
(Reporter)

Comment 6

7 months ago
Created attachment 8821019 [details] [diff] [review]
forget-wip

This is a work-in-progress patch to try to fix the issue with cancelations taking a long time.  It tries to add a back channel pointer from nsTimerImpl to the Entry so we can just directly tell the Entry to drop its ref.

There is some bug, however, as its currently crashing.  I guess probably when the nsTArray is realloc'd and all the pointers move.
(Reporter)

Updated

7 months ago
Attachment #8821019 - Attachment is patch: true
Argh, let's just get rid of the timer thread.
(Reporter)

Comment 8

7 months ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #7)
> Argh, let's just get rid of the timer thread.

Using OS timer APIs would be even better yes, but implementing that for all platforms will take a larger effort.  I expect there to be compat issues across OS versions, etc.  (I thought perhaps cross-platform compat was a reason we had the timer thread in the first place.)

Also we will always want to have a cross-platform fallback in the timer thread.
I'm not convinced we need a platform-specific timer impl. The real effect of a timer is that it wakes up the event loop at a particular time. If when we waitForEvent we set a timeout at the next time a timer is scheduled, we'll just wake up and we can then notice that it's time to fire the timer. No extra runnables are necessary.
That sounds reasonable, but also a rather large change.

Even if we implemented that, though, it would not remove the need for the data structure I am modifying here.  We still need a list of scheduled timers in a data structure that can efficiently tell us the next deadline.  We also need this list to support arbitrary cancellation.

So I still would like to implement the binary heap here.  Happy to see a separate bug about removing Timer thread.
(In reply to Ben Kelly [Food Coma / PTO / back Jan 3 :bkelly] from comment #10)
> So I still would like to implement the binary heap here.  Happy to see a
> separate bug about removing Timer thread.

We have a separate bug about removing the timer thread (or rewriting timers entirely, I forget how it was phrased), but I can't find it atm...
I still think native timers might be worth it.  Its not clear how well gecko-level timer solutions work with mobile devices that enter low-power modes.  The native timers are probably more battery efficient and also more accurate when waking from low-power mode.  That's just a theory, though.
Created attachment 8833677 [details] [diff] [review]
P4 Make nsITimer::Cancel() O(c). r=froydnj

This stops the crashing and should make Cancel() efficient.  I need to rebuild without DEBUG to evaluate the perf impact.
Attachment #8821019 - Attachment is obsolete: true
So, I tried running these patches combined with bug 1336598 and bug 1336822 on this test:

https://people-mozilla.org/~bkelly/timer-flood/index.html

These patches let us execute about 5x more timers in the same amount of time.  Without these patches we execute about 3000 timers per second.  With these patches we execute 15k to 20k timers per second.

Not that we need to optimize for throughput during a timer flood, but it demonstrates a significant reduction in overhead from managing the timer list.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c5bee010bdad3e7ead3c1598780c6aaba40db0a
Created attachment 8833798 [details] [diff] [review]
P3 Sort TimerThread list as a binary heap. r=froydnj

Updated patch to not return -1 in a bool returning method.  Also, I made AppendElement() when inserting a timer fallible again.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ade033a0b28c9dff28ce97505bdc58140700a65
Attachment #8821016 - Attachment is obsolete: true
Created attachment 8833800 [details] [diff] [review]
P4 Make nsITimer::Cancel() O(c). r=froydnj

Add some comments to try to document the structure.
Attachment #8833677 - Attachment is obsolete: true
Comment on attachment 8820998 [details] [diff] [review]
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj

This is a refactor patch in preparation for later patches.  It basically makes TimerThread use RefPtr<> in the mTimers list instead of manually calling AddRef()/Release().
Attachment #8820998 - Flags: review?(nfroyd)
Comment on attachment 8820999 [details] [diff] [review]
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj

This modifies the mTimers list to use an Entry struct instead of storing nsTimerImpl refs directly.  The purpose here is to let us remove the nsTimerImpl ref on cancel without modifying the list itself.  This is important for a couple reasons:

1) It gets us closer to our goal of O(c) operations because dropping the ref is constant time.
2) The heap data structure only allows accessing the item at the front of the list.  Operations involve moving the front item to the back and calling std::pop_heap().  To append to the heap you add an item to the end of the list and then use std::push_heap().  It does not support removing an item from the middle of the heap.

The main tradeoff here is we must burn through any empty canceled Entry structures during our TimerThread run loop.  This is a minimal cost, though, compared to removing entries in the middle of a large sorted data structure.
Attachment #8820999 - Flags: review?(nfroyd)
Comment on attachment 8833798 [details] [diff] [review]
P3 Sort TimerThread list as a binary heap. r=froydnj

This patch modifies the mTimers list to be a binary heap data structure as defined by std::push_heap() and std::pop_heap().
Attachment #8833798 - Flags: review?(nfroyd)
Comment on attachment 8833800 [details] [diff] [review]
P4 Make nsITimer::Cancel() O(c). r=froydnj

This patch addresses the last O(n) operation.  In order to drop the nsTimerImpl ref from the mTimers Entry struct we still need to iterate over the mTimers list to find it.

This patch modifies nsTimerImpl to have a reference directly to a Holder object.  It can then inform that Holder to forget the timer object.  The mTimers Entry owns a Holder object instead of an nsTimerImpl.  So in nsTimerImpl Cancel() can simply ask the Holder to forget it without any iteration of the list.

Note, we must use a separate Holder object instead of the Entry structure itself.  Since the Entry are stored in the nsTArray by value they can be moved by realloc.  Therefore we store a separate allocated Holder that lives at a stable place in memory.
Attachment #8833800 - Flags: review?(nfroyd)
Comment on attachment 8820998 [details] [diff] [review]
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj

Review of attachment 8820998 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you.  Less NS_ADDREF/RELEASE cannot be a bad thing.
Attachment #8820998 - Flags: review?(nfroyd) → review+
Comment on attachment 8820999 [details] [diff] [review]
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj

Review of attachment 8820999 [details] [diff] [review]:
-----------------------------------------------------------------

I think this makes sense.

::: xpcom/threads/TimerThread.cpp
@@ +525,5 @@
>  
> +      // skip any canceled timers
> +      while(!mTimers.IsEmpty() && !mTimers[0].mTimerImpl) {
> +        mTimers.RemoveElementAt(0);
> +      }

I think it'd be better to write this loop as:

// See how many timers at the front of the list have been canceled.
size_t firstActive = 0;
while (firstActive < mTimers.Length() && !mTimers[firstActive].mTimerImpl) {
  firstActive++;
}

if (firstActive != 0) {
  mTimers.RemoveElementsAt(0, firstActive);
}

I see that it's going to be replaced in later patches, but we should try to use the "biggest" possible calls into nsTArray.
Attachment #8820999 - Flags: review?(nfroyd) → review+
Comment on attachment 8820999 [details] [diff] [review]
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj

Review of attachment 8820999 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/TimerThread.cpp
@@ +465,5 @@
>  
> +      // skip any canceled timers
> +      while(!mTimers.IsEmpty() && !mTimers[0].mTimerImpl) {
> +        mTimers.RemoveElementAt(0);
> +      }

Same comment from the previous review goes for this loop as well.
Comment on attachment 8833798 [details] [diff] [review]
P3 Sort TimerThread list as a binary heap. r=froydnj

Review of attachment 8833798 [details] [diff] [review]:
-----------------------------------------------------------------

Some discussion required, see below.  I don't think the idea is bad, though.

::: xpcom/threads/TimerThread.cpp
@@ +470,1 @@
>        }

Same comments on this loop apply; we should remove things from the heap and then delete a range from the array.

@@ +530,1 @@
>        }

Here too.

::: xpcom/threads/TimerThread.h
@@ +91,5 @@
>  
> +    Entry(Entry&& aOther)
> +      : mTimeout(aOther.mTimeout)
> +      , mTimerImpl(mozilla::Move(aOther.mTimerImpl))
> +    { }

These are just the default behavior, right?  If we really want these, can we just say |= default;|?

@@ +104,5 @@
>      bool operator<(const Entry& aRight) const
>      {
> +      // Reverse logic since we are inserting into a max heap
> +      // that sorts the "largest" value to index 0.
> +      return mTimeout > aRight.mTimeout;

Can we just make this a function object that we pass in to any call to pop_heap or similar?  That would make the behavior a little more obvious at the call sites.
Attachment #8833798 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd [:froydnj] from comment #25)
> ::: xpcom/threads/TimerThread.cpp
> @@ +470,1 @@
> >        }
> 
> Same comments on this loop apply; we should remove things from the heap and
> then delete a range from the array.

I could make a separate RemoveCanceledTimers() method that does its own internal std::pop_heap and RemoveElementsAt().  I am hesitant to try to track a separate "last valid item" in mTimers in order to support a state where the array is not fully sorted.

> @@ +104,5 @@
> >      bool operator<(const Entry& aRight) const
> >      {
> > +      // Reverse logic since we are inserting into a max heap
> > +      // that sorts the "largest" value to index 0.
> > +      return mTimeout > aRight.mTimeout;
> 
> Can we just make this a function object that we pass in to any call to
> pop_heap or similar?  That would make the behavior a little more obvious at
> the call sites.

I can do this.  It would also allow me to make the Entry objects heap allocated instead of stored in the nsTArray by value.  This would avoid the need for a separate Holder class in the P4 patch.
(Reporter)

Updated

5 months ago
Attachment #8833800 - Flags: review?(nfroyd)
Comment on attachment 8833800 [details] [diff] [review]
P4 Make nsITimer::Cancel() O(c). r=froydnj

Review of attachment 8833800 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing review, since comments on previous patches will change this a bit.

::: xpcom/threads/TimerThread.h
@@ +143,5 @@
>  
>      Entry(const TimeStamp& aMinTimeout, const TimeStamp& aTimeout,
>            nsTimerImpl* aTimerImpl)
> +      : mTimeout(std::max(aMinTimeout, aTimeout))
> +      , mHolder(new Holder(aTimerImpl))

mozilla::MakeUnique<Holder>(aTimerImpl), please.

@@ +162,5 @@
>      }
>  
> +    ~Entry()
> +    {
> +    }

This could just be |= default;|?

::: xpcom/threads/nsTimerImpl.h
@@ +43,5 @@
> +class nsTimerImplHolder
> +{
> +public:
> +  virtual void
> +  Forget(nsTimerImpl* aTimerImpl) = 0;

AFAICT, we only have one inheritor of this interface, so maybe it doesn't need to be virtual?
Created attachment 8834204 [details] [diff] [review]
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj

Updated this patch to remove ReleaseTimerInternal() declaration from the header.  I removed the implementation but forgot to nuke the declaration before.
Attachment #8820998 - Attachment is obsolete: true
Attachment #8834204 - Flags: review+
Created attachment 8834205 [details] [diff] [review]
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj

Address review feedback adding a new method:

  TimerThread::RemoveLeadingCanceledTimersInternal()

This does the bulk remove using RemoveElementsAt().
Attachment #8820999 - Attachment is obsolete: true
Attachment #8834205 - Flags: review+
Created attachment 8834216 [details] [diff] [review]
3 Sort TimerThread list as a binary heap. r=froydnj

This adds the bulk removal of canceled timers to the heap sorting patch.  I am going to do the separate comparator function in a separate patch.
Attachment #8833798 - Attachment is obsolete: true
I need to rebase on bug 1328643.
Depends on: 1328643
Created attachment 8834245 [details] [diff] [review]
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj

Rebase.
Attachment #8834204 - Attachment is obsolete: true
Attachment #8834245 - Flags: review+
(Reporter)

Updated

5 months ago
Attachment #8834245 - Attachment description: bug1325254_p1_timerrefptrlist.patch → P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj
Created attachment 8834246 [details] [diff] [review]
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj

Rebase
Attachment #8834205 - Attachment is obsolete: true
Attachment #8834246 - Flags: review+
Created attachment 8834247 [details] [diff] [review]
P3 Sort TimerThread list as a binary heap. r=froydnj

Rebase.
Attachment #8834216 - Attachment is obsolete: true
Created attachment 8834248 [details] [diff] [review]
P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj

This patch moves the Entry to its own dynamically allocated structure with a separate comparator func.
Created attachment 8834249 [details] [diff] [review]
P5 Make nsITimer::Cancel() O(c). r=froydnj

Implement the Holder strategy using the dynamic Entry objects.
Attachment #8833800 - Attachment is obsolete: true
The rebased patches seem to leak.  I must have missed some change in bug 1328643 that I need to account for.  I'm too tired to figure it out for now.  Lets see what try says overnight:

ttps://treeherder.mozilla.org/#/jobs?repo=try&revision=9f4bcd6eb09be329a4ecc3afe124fab8638c56ba
Actually, it seems its not leaking.  The memory does go away when I stop the timer flood and force memory minimization.
We do seem to have more lock contention after bug 1328643.  I may need to tune the setTimeout() back pressure mechanism again.
Created attachment 8834408 [details] [diff] [review]
P6 Improve nsTimerImpl locking and remove TimerThread::RemoveTimer(). r=froydnj

This patch removes the need to lock the TimerThread mutex when canceling an nsTimerImpl.  This is no longer necessary now that we can just call mHolder->Forget().

This does remove a wakeup of TimerThread that was in RemoveTimer(), but I think thats ok.  Our strategy is to leave canceled timers in the mTimers heap until they are drained by the normal timer thread operation.  So no need to immediately wakeup here.

Also, this patch fixes some locking around mHolder.  This was probably unsafe in the P5 patch.  Unfortunately this did require switching to a ReentrantMonitor in nsTimerImpl.  We can call SetHolder() from a number of paths.  Some already hold the lock and others don't.  Using the ReentrantMonitor was the only reasonable way I could see to make this work.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a78673864d177cd82b96451a50ae448cca2bb3f
(In reply to Ben Kelly [:bkelly] from comment #40)
> Also, this patch fixes some locking around mHolder.  This was probably
> unsafe in the P5 patch.  Unfortunately this did require switching to a
> ReentrantMonitor in nsTimerImpl.  We can call SetHolder() from a number of
> paths.  Some already hold the lock and others don't.  Using the
> ReentrantMonitor was the only reasonable way I could see to make this work.

I haven't looked at the patch, but:

1. Yay for locking fixes;
2. Boo for ReentrantMonitor.  I really do not want to start using one of these in the timer code; it's less efficient compared to a plain Mutex and we really ought to be able to do better.
Hmm, but the P6 removal of the TimerThread mMutex lock now allows us to cancel a timer in the middle of its processing.  I'll have to think about this more.
(In reply to Nathan Froyd [:froydnj] from comment #42)
> 2. Boo for ReentrantMonitor.  I really do not want to start using one of
> these in the timer code; it's less efficient compared to a plain Mutex and
> we really ought to be able to do better.

Well, a lot of this is due to the incestuous nature between nsTimerImpl and TimerThread.  They call back and forth between themselves in cycles so much its hard to be clear about what will re-enter or not.  Anyway, I have to think about the P6 patch more anyways.
I'll need to come back to this later tonight or this week.  So I can remember, here is the stack for the comment 43 issue:

* thread #10: tid = 0x188bed, 0x000000010245af1b XUL`TimerThread::Run() [inlined] RefPtr<nsTimerImpl>::get() const + 4 at RefPtr.h:283, name = 'Timer', stop reason = EXC_BAD_ACCESS (code=1, address=0x28)
  * frame #0: 0x000000010245af1b XUL`TimerThread::Run() [inlined] RefPtr<nsTimerImpl>::get() const + 4 at RefPtr.h:283 [opt]
    frame #1: 0x000000010245af17 XUL`TimerThread::Run() [inlined] RefPtr<nsTimerImpl>::operator nsTimerImpl*() const & at RefPtr.h:296 [opt]
    frame #2: 0x000000010245af17 XUL`TimerThread::Run() [inlined] TimerThread::Entry::Value(this=<unavailable>) const at TimerThread.h:118 [opt]
    frame #3: 0x000000010245af17 XUL`TimerThread::Run(this=<unavailable>) + 231 at TimerThread.cpp:465 [opt]
    frame #4: 0x0000000102465abb XUL`nsThread::ProcessNextEvent(this=<unavailable>, aMayWait=<unavailable>, aResult=0x00007000004abdb7) + 1067 at nsThread.cpp:1261 [opt]
    frame #5: 0x0000000102464ba3 XUL`NS_ProcessNextEvent(aThread=<unavailable>, aMayWait=true) + 51 at nsThreadUtils.cpp:389 [opt]
    frame #6: 0x00000001028afd3a XUL`mozilla::ipc::MessagePumpForNonMainThreads::Run(this=<unavailable>, aDelegate=<unavailable>) + 266 at MessagePump.cpp:368 [opt]
    frame #7: 0x0000000102883c19 XUL`MessageLoop::Run() [inlined] MessageLoop::RunInternal(this=<unavailable>) + 73 at message_loop.cc:238 [opt]
    frame #8: 0x0000000102883c0a XUL`MessageLoop::Run() [inlined] MessageLoop::RunHandler() at message_loop.cc:231 [opt]
    frame #9: 0x0000000102883c0a XUL`MessageLoop::Run(this=<unavailable>) + 58 at message_loop.cc:211 [opt]
    frame #10: 0x0000000102463cff XUL`nsThread::ThreadFunc(aArg=<unavailable>) + 351 at nsThread.cpp:494 [opt]
    frame #11: 0x000000010213b7ea libnss3.dylib`_pt_root(arg=0x0000000100735790) + 218 at ptthread.c:216 [opt]
    frame #12: 0x00007fff840c199d libsystem_pthread.dylib`_pthread_body + 131
    frame #13: 0x00007fff840c191a libsystem_pthread.dylib`_pthread_start + 168
    frame #14: 0x00007fff840bf351 libsystem_pthread.dylib`thread_start + 13
To fix the crash I think we will need to have some way of doing:

 "get the first valid timer out of mTimers and return it to me while holding the nsTimerImpl lock"

We can then operate on the nsTimerImpl while holding its lock and know that it won't get canceled out from under us.

My idea for doing this would look something like this:

  nsTimerImpl::LockedRef firstTimer;

  RemoveLeadingCanceledTimers(firstTimer);

  if (!firstTimer) {
    // no active timers
  } else {
    // There is an active timer, firstTimer has a strong ref to it, and we hold its lock.

    // When firstTimer goes out of scope it unlocks the timer and drops its strong ref to
    // the timer.
  }
Created attachment 8838898 [details] [diff] [review]
P6 Improve nsTimerImpl locking and remove TimerThread::RemoveTimer(). r=froydnj

This gets rid of the ReentrantMonitor from the previous version of the path.  I fixed the issue with reentrant locking by exposing a separate SetHolderLocked() method for paths we know are called with the lock already held.

I still need to sort out locking within the Entry object here, though.  This patch will crash periodically in its current form.
Attachment #8834408 - Attachment is obsolete: true
Created attachment 8838900 [details] [diff] [review]
wip

Work-in-progress on my LockedRef concept.
(Reporter)

Updated

5 months ago
Attachment #8838900 - Attachment is patch: true
Created attachment 8839013 [details]
wip

Another work-in-progress patch.  This should solve the crashing issues, but probably needs some cleanup.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4f153123c10ec52838321c724d10e4491dce2e35
Attachment #8838900 - Attachment is obsolete: true
(In reply to Nathan Froyd [:froydnj] from comment #27)
> > +    ~Entry()
> > +    {
> > +    }
> 
> This could just be |= default;|?

The destructor is populated with code in later patches in this bug, so I did not bother to mark it `= default` in this patch.

> > +  virtual void
> > +  Forget(nsTimerImpl* aTimerImpl) = 0;
> 
> AFAICT, we only have one inheritor of this interface, so maybe it doesn't
> need to be virtual?

So, I really dislike circular references between classes using the friend keyword.  I think having a virtual method is a small price to pay for having a cleaner interface between TimerThread and nsTimerImpl.  If I could separate them more, I would.
Comment on attachment 8834247 [details] [diff] [review]
P3 Sort TimerThread list as a binary heap. r=froydnj

This patch has been relatively stable and offers an incremental improvement.  It sorts the timer list in a binary heap so insertions and removals are O(log n).  I tried to address previous review feedback by using bulk array removals, etc.
Attachment #8834247 - Flags: review?(nfroyd)
Comment on attachment 8834248 [details] [diff] [review]
P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj

This is a refactoring patch that makes the timer list store UniquePtr<Entry> instead of Entry value objects.  This is necessary for the follow-up P5 patch where we need a stable address location for the Entry object.
Attachment #8834248 - Flags: review?(nfroyd)
Comment on attachment 8834249 [details] [diff] [review]
P5 Make nsITimer::Cancel() O(c). r=froydnj

This patch makes cancellation O(c).  Instead of iterating through the timer list looking for the Entry to clear, we now can directly clear the Entry through a callback pointer from the nsTimerImpl itself.

Note, this is still not ideal since we lock the entire list by calling through TimerThread::RemoveTimer().  This protects the Entry datastructure from multi-thread access, but its a very course grained lock.  The P6 patch will attempt to improve this locked.  Or perhaps we could improve the locking in a separate follow-up bug.

For now, though, this patch does offer an improvement by reducing the algorithmic complexity involved.
Attachment #8834249 - Flags: review?(nfroyd)
Try build for update to P5 patch:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aca304e4a17a2ca63275a47fd1ff8b6e6e182ee2
Created attachment 8839016 [details] [diff] [review]
wip

Try build including the new locking fixes:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=92fb716d471c0fa0ed2cc1b88da782a50dbf05c2
(Reporter)

Updated

5 months ago
Attachment #8839013 - Attachment is obsolete: true
Yea, I think the P6 patch is a step to far.  But the changes up to the P5 patch seem stable and are still an improvement.
Attachment #8834247 - Flags: review?(nfroyd) → review+
Attachment #8834248 - Flags: review?(nfroyd) → review+
Comment on attachment 8834249 [details] [diff] [review]
P5 Make nsITimer::Cancel() O(c). r=froydnj

Review of attachment 8834249 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay.  I think I thought this patch was harder to review that it actually was.

I like where this is going, just one concern below.

::: xpcom/threads/nsTimerImpl.h
@@ +44,5 @@
> +class nsTimerImplHolder
> +{
> +public:
> +  virtual void
> +  Forget(nsTimerImpl* aTimerImpl) = 0;

I'm not wild about having an abstract class that is essentially an implementation detail of a single client (TimerThread).  What's wrong with simply forward-declaring nsTimerImplHolder here, and then leaving the implementation details up to TimerThread.{h,cpp}?  You might want to rename it at that point, since it'd have to be the name of the concrete class used in TimerThread, but otherwise there shouldn't be any issues, since nsTimerImpl doesn't actually manipulate nsTimerImplHolder objects in any interesting way.

Ah, I guess P6 does some manipulation of the holder from within nsTimerImpl.  Well, OK, we could be gross and say:

class nsTimerImplHolder
{
public:
  void Forget(nsTimerImpl*);
};

// In some .cpp:
void
nsTimerImplHolder::Forget(nsTimerImpl* aTimerImpl)
{
  static_cast<SUBCLASS*>(this)->Forget(aTimerImpl);
}

Which is a little gross, but at least we're not paying the overhead of the vtable, virtual method calls, etc. etc.
Attachment #8834249 - Flags: review?(nfroyd) → feedback+
(Assignee)

Updated

3 months ago
Blocks: 1354616
(Assignee)

Updated

3 months ago
Flags: needinfo?(bzbarsky)
Whiteboard: [qf:p1]
Discovered today that the patch set requires massive rebasing. :(
(Assignee)

Comment 59

3 months ago
Created attachment 8859749 [details] [diff] [review]
Part 5 updated to review comments
Attachment #8859749 - Flags: review?(nfroyd)
(Assignee)

Updated

3 months ago
Assignee: bkelly → bzbarsky
(Assignee)

Updated

3 months ago
Flags: needinfo?(bzbarsky)
Comment on attachment 8859749 [details] [diff] [review]
Part 5 updated to review comments

Review of attachment 8859749 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8859749 - Flags: review?(nfroyd) → review+

Comment 61

3 months ago
Pushed by bzbarsky@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d5f549491e7
P1 Make TimerThread::mTimers store RefPtr<nsTimerImpl> objects. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/29679296c946
P2 Make TimerThread list store an entry struct and just drop nsTimerImpl ref on cancel. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/a501885ead35
P3 Sort TimerThread list as a binary heap. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/4b6b6aad7688
P4 Dynamically allocate Entry structs stored in TimerThread::mTimers. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/030533bed090
P5 Make nsITimer::Cancel() O(c). r=froydnj
https://hg.mozilla.org/mozilla-central/rev/4d5f549491e7
https://hg.mozilla.org/mozilla-central/rev/29679296c946
https://hg.mozilla.org/mozilla-central/rev/a501885ead35
https://hg.mozilla.org/mozilla-central/rev/4b6b6aad7688
https://hg.mozilla.org/mozilla-central/rev/030533bed090
Status: ASSIGNED → RESOLVED
Last Resolved: 3 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.