Closed Bug 1350177 Opened 3 years ago Closed 3 years ago

Define ThreadSafeExpirationTracker for the use in SurfaceCache

Categories

(Core :: MFBT, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: bevis, Assigned: bevis)

References

Details

(Whiteboard: [QDL][TDC-MVP][DOM])

Attachments

(4 files, 2 obsolete files)

This is a follow-up of bug 1345464 comment 21.
Given the truth that ExpirationTracker is not a thread-safe class but is widely used by SurfaceCache on multiple threads.
To prevent things getting worse after the change in bug 1345464, we should make this tracker class thread-safe first.
Note: Changes in SurfaceCache would be expected to fix this thread-safety problem.

More concrete suggestion/idea is welcome! :)
One suggestion we have so far from bug 1297111 comment 15 is to use the mutex from the SurfaceCache in nsExpirationTracker by passing it to the constructor but we will have to replace this mutex with a ReentrantMonitor since reentry is possible according to the test result in bug 1297111 comment 14.
Hi Nathan, Timothy,

I hit a problem which prevents me to share the same lock between SurfaceCache & nsExpirationTracker in [1]:
- A leakage of ReentrantMonitor which was declared statically.

In patch part 1 of [1], an optional ReentrantMonitor is allowed to pass into nsExpirationTracker to ensure the thread-safety of the use cases between SurfaceCache & nsExiprationTracker.
(We need ReentrantMonitor instead of Mutex, because reentry is possible between SurfaceCache & nsExpirationTracker.)
However, in patch part 2 where the StaticMutex was replaced with ReentrantMonitor, we will caught by leakcheck because MOZ_COUNT_CTOR is used by the ctor of ReentrantMonitor.
IMO, a static lock is required to protect the access of the global variable of StaticRefPtr<SurfaceCacheImpl> sInstance.
a lazy initalized lock, i.g. StaticAutoPtr<ReentrantMonitor> is not helpful if sInstance is initiated later then the static methods of SurfaceCache is called.

I have 2 solutions so far to fix this problem:
1. Passing the StaticMutex reference carefully in each function call of SurfaceCache and nsExpirationTracker. 
   This is not a good option to me for the maintenance in the future.
2. Create an *OffTheBooks* version ReentrantMonitor for this.

Is there any better suggestion to resolve this problem?

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=9007507fc166aa78737d02ef07b25a82c92f6cda
Flags: needinfo?(tnikkel)
Flags: needinfo?(nfroyd)
If its guarantee that the all the access to SurfaceCache are only done after SurfaceCache::Initialize() and before SurfaceCache::Shutdown(), then we could define a global variable of Atomic<bool> at the beginning of each static SurfaceCache method then move the ReentrantMontor into SurfaceCacheImpl just for the thread-safety of its inner data.
However, the use of static mutex seems not this case.
(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #2)
> (We need ReentrantMonitor instead of Mutex, because reentry is possible
> between SurfaceCache & nsExpirationTracker.)

Is this only because of acquiring the mutex in NotifyExpired here:

https://dxr.mozilla.org/mozilla-central/rev/e03e0c60462c775c7558a1dc9d5cf2076c3cd1f9/image/SurfaceCache.cpp#901

That should be changed to an assert that the current thread holds the mutex (the expiration tracker should hold it automatically when calling NotifyExpired).

Or is there another place where we try to re-enter? Maybe we can remove that.
Flags: needinfo?(tnikkel)
(In reply to Timothy Nikkel (:tnikkel) from comment #4)
> (In reply to Bevis Tseng[:bevistseng][:btseng] from comment #2)
> > (We need ReentrantMonitor instead of Mutex, because reentry is possible
> > between SurfaceCache & nsExpirationTracker.)
> 
> Is this only because of acquiring the mutex in NotifyExpired here:
> 
> https://dxr.mozilla.org/mozilla-central/rev/
> e03e0c60462c775c7558a1dc9d5cf2076c3cd1f9/image/SurfaceCache.cpp#901
> 
> That should be changed to an assert that the current thread holds the mutex
> (the expiration tracker should hold it automatically when calling
> NotifyExpired).
> 
> Or is there another place where we try to re-enter? Maybe we can remove that.

The nsExiprationTracker itself is reentrant:
   - AddObject -> CheckStartTimer()
   - AgeOneGeneration -> NotifyExpired()
   - MarkUsed -> RemoveObject()/AddObject()
   - AgeAllGenerations -> AgeOneGeneration()
So, if we use mutex instead of ReentrantMonitor for SurfaceCache, nsExpirationTracker will be blocked. :(
NI for comment 5.
Flags: needinfo?(tnikkel)
(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #5)
> The nsExiprationTracker itself is reentrant:
>    - AddObject -> CheckStartTimer()
>    - AgeOneGeneration -> NotifyExpired()
>    - MarkUsed -> RemoveObject()/AddObject()
>    - AgeAllGenerations -> AgeOneGeneration()
> So, if we use mutex instead of ReentrantMonitor for SurfaceCache,
> nsExpirationTracker will be blocked. :(

This is not the usual definition of reentrancy that calls for ReentrantMonitor.  The usual problem is:

nsExpirationTracker::SomeFunction -> external code ... -> external code -> nsExpirationTracker::SomeOtherFunction

And you can't necessarily know in SomeOtherFunction that the mutex is already taken.

If we're just concerned about multiple entry points needing to be locked, and some of those entry points call other entry points, we can separate things into internal functions where the mutex is already locked (preferably with assertions and/or |const MutexAutoLock&| parameters) and external functions that grab the mutex.
Per discussion offline with Nathan,
it will be better to provide a thread-safe version nsExpirationTracker for SurfaceCache which requires a MutexAutoLock reference of all the nsExiprationTracker function and keep the original one as main-thread only tracker for other instances.
Flags: needinfo?(tnikkel)
Flags: needinfo?(nfroyd)
Summary: Make nsExpirationTracker thread-safe → Define ThreadSafeExpirationTracker for the use in SurfaceCache
1. Define nsExpirationTracker::EventHandler to allow ThreadSafeExpirationTracker
   to acquire the mutex lock before calling AgeAllGenerations() when timeout or
   memory-pressure is notified.
2. Define ThreadSafeExpirationTracker as a composition class of nsExpirationTracker
   to define a similiar API which always requires a AutoMutexLock be acquired
   before being called.
3. Make sure that the TimerCallback will be fired on the main thread.

Nathan, may I have your review for these change?

Thanks!
Attachment #8852000 - Flags: review?(nfroyd)
Always acquire StaticMutexAutoLock for the use of ThreadSafeExpirationTracker.

Hi Nathan, Timothy,

May I have your review for this change in SurfaceCache?

Thanks!
Attachment #8852002 - Flags: review?(tnikkel)
Attachment #8852002 - Flags: review?(nfroyd)
Comment on attachment 8852000 [details] [diff] [review]
(v1) Patch: Part 1: Define ThreadSafeExpirationTracker.

Review of attachment 8852000 [details] [diff] [review]:
-----------------------------------------------------------------

This looks like a good start.  New classes need documentation, though.  I've also added a few comments below.

I think it works slightly better to have things laid out like:

template<typename T, uint32_t K, typename Mutex, typename AutoLock>
class ExpirationTrackerImpl
{
public:
  nsresult AddObjectLocked(T*, const AutoLock&);
  nsresult RemoveObjectLocked(T*, const AutoLock&);
  ...and so forth.
  virtual void NotifyExpiredLocked(T*, const AutoLock&) = 0;
};

namespace detail {

class PlaceholderLock {
public:
  void Lock() {}
  void Unlock() {}
};

class PlaceholderAutoLock {
public:
  PlaceholderAutoLock(PlaceholderLock&) = default;
  ~PlaceholderAutoLock = default;

  
};

template<typename T, uint32_t K>
using SingleThreadedExpirationTracker = ExpirationTrackerImpl<T, K, PlaceholderLock, PlaceholderAutoLock>
} // namespace detail

template<typename T, uint32_t K>
class nsExpirationTracker : protected ::detail::SingleThreadedExpirationTracker<T, K>
{
  typedef ::detail::PlaceholderLock Lock;
  typedef ::detail::PlaceholderAutoLock AutoLock;

  AutoLock FakeLock() { Lock l; return AutoLock(l); }

public:
  nsExpirationTracker(uint32_t aTimerPeriod, const char* aName)
    : ::detail::SingleThreadedExpirationTracker<T, K>
  { ... }

  nsresult AddObject(T* aObject)
  {
    // We might not be able to add these assertions, depending on how well-behaved current clients
    // of nsExpirationTracker are.
    MOZ_ASSERT(NS_IsMainThread());
    return AddObjectLocked(aObject, FakeLock());
  }

  ...and so forth.

  // This setup is not perfect, because it incurs two virtual calls instead of one, and because
  // people can still call NotifyExpired() inside the inheriting class.  We could get rid of the
  // former problem with CRTP.  But the former problem is something we're just stuck with.  Maybe
  // we could call it NotifyExpiredMainThreadOnly to encourage people to do the right thing.
  virtual void NotifyExpired(T*) = 0;
  void NotifyExpiredLocked(T* aObject, const AutoLock&) override { NotifyExpired(aObject); }
};

and then SurfaceTracker would be defined thusly:

  class SurfaceTracker : public ExpirationTrackerImpl<..., ..., StaticMutex, StaticMutexAutoLock>

The overhead in the single-threaded implementation should be negligible, and we don't have extra overhead in the multi-threaded implementation.  WDYT?

::: xpcom/ds/nsExpirationTracker.h
@@ +80,5 @@
>    /**
> +   * A eventHandler class for ThreadSafeExpirationTracker to acquire lock before
> +   * handling these events.
> +   */
> +  class EventHandler

This class shouldn't be public.

@@ +95,5 @@
>     * period is zero, then we don't use a timer and rely on someone calling
>     * AgeOneGeneration explicitly.
>     */
> +  nsExpirationTracker(uint32_t aTimerPeriod, const char* aName,
> +                      EventHandler* aEventHandler)

This constructor shouldn't be public.

@@ +298,5 @@
> +  void HandleLowMemory() {
> +    AgeAllGenerations();
> +  }
> +
> +  void HandleTimeout() {

These functions shouldn't be public.

@@ +466,5 @@
> +  };
> +
> +  void OnTimeout() override
> +  {
> +    AutoLock lock(Lock());

This doesn't seem correct: Lock() returns a reference, and then you're constructing a new mutex from that reference via copy construction.  That's not correct in the general case.

I see that StaticMutex doesn't prohibit this the way our other mutex classes do.  It should. :(  This is bug 1351372.

@@ +528,5 @@
> +    AutoLockHolder holder(aAutoLock, this);
> +    return mTracker.IsEmpty();
> +  }
> +
> +  class MOZ_STACK_CLASS Iterator

What is this for?  It appears to be unused.
Attachment #8852000 - Flags: review?(nfroyd)
Comment on attachment 8852002 [details] [diff] [review]
(v1) Part 2: Refactor SurfaceCache with ThreadSafeExpirationTracker.

Review of attachment 8852002 [details] [diff] [review]:
-----------------------------------------------------------------

This is OK, but it may require some changes depending on what we decide for part 1.
Attachment #8852002 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] from comment #12)
> The overhead in the single-threaded implementation should be negligible, and
> we don't have extra overhead in the multi-threaded implementation.  WDYT?
Thanks for better suggestion.
I'll give it a trial in next update.
> @@ +466,5 @@
> > +  };
> > +
> > +  void OnTimeout() override
> > +  {
> > +    AutoLock lock(Lock());
> 
> This doesn't seem correct: Lock() returns a reference, and then you're
> constructing a new mutex from that reference via copy construction.  That's
> not correct in the general case.
> 
> I see that StaticMutex doesn't prohibit this the way our other mutex classes
> do.  It should. :(  This is bug 1351372.
>
Sorry, I didn't get your point because what I am doing here is to acquire a StaticMutexAutoLock instead of StaticMutexLock.
IMO, the OnTimeout()/onLowMemory() are new starting points of current thread to access
this tracker compared to other call paths from SurfaceCache::Xxx().

Did I misunderstand anything?

Set NI for this question.

> @@ +528,5 @@
> > +    AutoLockHolder holder(aAutoLock, this);
> > +    return mTracker.IsEmpty();
> > +  }
> > +
> > +  class MOZ_STACK_CLASS Iterator
> 
> What is this for?  It appears to be unused.
I'll remove it.
Flags: needinfo?(nfroyd)
(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #14)
> > > +  void OnTimeout() override
> > > +  {
> > > +    AutoLock lock(Lock());
> Sorry, I didn't get your point because what I am doing here is to acquire a
> StaticMutexAutoLock instead of StaticMutexLock.
> IMO, the OnTimeout()/onLowMemory() are new starting points of current thread
> to access
> this tracker compared to other call paths from SurfaceCache::Xxx().
> 
> Did I misunderstand anything?
> 
> Set NI for this question.

I guess its the variable name "lock" instead of "autoLock" that confused you.
Comment on attachment 8852002 [details] [diff] [review]
(v1) Part 2: Refactor SurfaceCache with ThreadSafeExpirationTracker.

This looks fine. I'll review in detail when it's in a more final form.
Attachment #8852002 - Flags: review?(tnikkel) → feedback+
(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #14)
> Sorry, I didn't get your point because what I am doing here is to acquire a
> StaticMutexAutoLock instead of StaticMutexLock.
> IMO, the OnTimeout()/onLowMemory() are new starting points of current thread
> to access
> this tracker compared to other call paths from SurfaceCache::Xxx().
> 
> Did I misunderstand anything?

No, I totally misunderstood the meaning of the code.  My mistake!
Flags: needinfo?(nfroyd)
1. Define ExpirationTrackerImpl which requires AutoLock before calling Tracker APIs.
2. Move expiration logic from nsExpirationTracker to ExpirationTrackerImpl.
3. Keep nsExprationTracker un-thread-safe with fake Lock acquired.
Attachment #8852000 - Attachment is obsolete: true
Attachment #8852740 - Flags: review?(nfroyd)
Making ExpirationTrackerImpl<nsSHEntryShared, 3, ::detail::PlaceholderLock, ::detail::PlaceholderAutoLock> as a friend class seems tedious.

I'd like to make nsSHEntryShared::GetExpirationState() a public method instead.
Attachment #8852002 - Attachment is obsolete: true
Attachment #8852741 - Flags: review?(bugs)
Revice according to the change in patch part1.
Attachment #8852742 - Flags: review?(tnikkel)
Attachment #8852742 - Flags: review?(nfroyd)
Attachment #8852742 - Attachment description: (v2) Patch Part 3: Part 3: Refactor SurfaceCache with ExpirationTrackerImpl. → (v2) Patch Part 3: Refactor SurfaceCache with ExpirationTrackerImpl.
Attachment #8852742 - Flags: review?(tnikkel) → review?(aosmond)
Comment on attachment 8852742 [details] [diff] [review]
(v2) Patch Part 3: Refactor SurfaceCache with ExpirationTrackerImpl.

Review of attachment 8852742 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/SurfaceCache.cpp
@@ +550,5 @@
>        MOZ_ASSERT(!mCosts.Contains(costEntry),
>                   "Shouldn't have a cost entry for a locked surface");
>      } else {
>        if (MOZ_LIKELY(aSurface->GetExpirationState()->IsTracked())) {
> +        mExpirationTracker.RemoveObjectLocked(aSurface, aAutoLock);

All you really want is to make sure we are holding the lock. I think it would be cleaner if the function prototypes were left unchanged, and instead we overload RemoveObject/AddObject/etc (no need for virtual, as SurfaceCache will always use the proper object type) in the multi-threaded nsExpirationTracker class. The overloaded methods will call GetMutex().AssertCurrentThreadOwns() and followed by the equivalent base class API.
Comment on attachment 8852742 [details] [diff] [review]
(v2) Patch Part 3: Refactor SurfaceCache with ExpirationTrackerImpl.

Review of attachment 8852742 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Andrew Osmond [:aosmond] from comment #22)
> All you really want is to make sure we are holding the lock. I think it
> would be cleaner if the function prototypes were left unchanged, and instead
> we overload RemoveObject/AddObject/etc (no need for virtual, as SurfaceCache
> will always use the proper object type) in the multi-threaded
> nsExpirationTracker class. The overloaded methods will call
> GetMutex().AssertCurrentThreadOwns() and followed by the equivalent base
> class API.
I see. I could make a change for this and request the review again.

Thanks!
Attachment #8852742 - Flags: review?(nfroyd)
Attachment #8852742 - Flags: review?(aosmond)
(In reply to Andrew Osmond [:aosmond] from comment #22)
> All you really want is to make sure we are holding the lock. I think it
> would be cleaner if the function prototypes were left unchanged, and instead
> we overload RemoveObject/AddObject/etc (no need for virtual, as SurfaceCache
> will always use the proper object type) in the multi-threaded
> nsExpirationTracker class. The overloaded methods will call
> GetMutex().AssertCurrentThreadOwns() and followed by the equivalent base
> class API.

Assuming that this comment implies that the thread-safe expiration tracker will have an identical API to the non-threaded one, I am strongly against it.  The whole point of providing |const AutoLock&| parameters is that the API is made virtually impossible to misuse.  I would much rather get compile-time errors that I forgot locks, rather than runtime errors on tests.  Having clarity on where locking is required has also helped catch bugs in areas of code where this style has been adopted, mostly of the "oh, we were calling this function that requires locks without actually holding one" variety.  You can argue that assertions should have been added, but adding function parameters ensures that you never have to worry about forgetting things.
(In reply to Nathan Froyd [:froydnj] from comment #24)
> Assuming that this comment implies that the thread-safe expiration tracker
> will have an identical API to the non-threaded one, I am strongly against
> it.  The whole point of providing |const AutoLock&| parameters is that the
> API is made virtually impossible to misuse.  I would much rather get
> compile-time errors that I forgot locks, rather than runtime errors on
> tests.  Having clarity on where locking is required has also helped catch
> bugs in areas of code where this style has been adopted, mostly of the "oh,
> we were calling this function that requires locks without actually holding
> one" variety.  You can argue that assertions should have been added, but
> adding function parameters ensures that you never have to worry about
> forgetting things.

Agree, the check in compiling time enforce the developer to use the thing right.
In addition, after double check,
the assertion is only available in DEBUG build that seem not enough to me as well.

Thanks for further explanation on this Locked version APIs.

I'll resume the review of this patch again.
Comment on attachment 8852742 [details] [diff] [review]
(v2) Patch Part 3: Refactor SurfaceCache with ExpirationTrackerImpl.

Review of attachment 8852742 [details] [diff] [review]:
-----------------------------------------------------------------

Resume the review per comment 24.
Attachment #8852742 - Flags: review?(nfroyd)
Attachment #8852742 - Flags: review?(aosmond)
(In reply to Nathan Froyd [:froydnj] from comment #24)
> Assuming that this comment implies that the thread-safe expiration tracker
> will have an identical API to the non-threaded one, I am strongly against
> it.  The whole point of providing |const AutoLock&| parameters is that the
> API is made virtually impossible to misuse.  I would much rather get
> compile-time errors that I forgot locks, rather than runtime errors on
> tests.  Having clarity on where locking is required has also helped catch
> bugs in areas of code where this style has been adopted, mostly of the "oh,
> we were calling this function that requires locks without actually holding
> one" variety.  You can argue that assertions should have been added, but
> adding function parameters ensures that you never have to worry about
> forgetting things.

I was unaware this was used anywhere else, as this is the first time I've seen it. While I saw the compile time enforcement, I would prefer from a style perspective to use what we do everywhere else in imagelib. However I don't have a strong opinion on it -- if this is the new-and-improved style, we need to start somewhere. For my own reference, is it the consensus that we should use this style going forward as the default over runtime ownership asserts for new code?
FYI, the AutoLock& parameter has been existed already in several modules like media, layout and security:
http://searchfox.org/mozilla-central/search?q=AutoLock%26&case=false&regexp=false&path=
Attachment #8852742 - Flags: review?(aosmond) → review+
(In reply to Andrew Osmond [:aosmond] from comment #27)
> I was unaware this was used anywhere else, as this is the first time I've
> seen it. While I saw the compile time enforcement, I would prefer from a
> style perspective to use what we do everywhere else in imagelib. However I
> don't have a strong opinion on it -- if this is the new-and-improved style,
> we need to start somewhere. For my own reference, is it the consensus that
> we should use this style going forward as the default over runtime ownership
> asserts for new code?

As much consensus as ever exists in Gecko code. ;)

The profiler has also been converting things to use this style: https://dxr.mozilla.org/mozilla-central/search?q=%2Btype-ref%3AProfilerState%3A%3ALockRef
Comment on attachment 8852740 [details] [diff] [review]
(v2) Patch Part 1: Define ExpirationTrackerImpl as a Thread-Safe Tracker.

Review of attachment 8852740 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8852740 - Flags: review?(nfroyd) → review+
Attachment #8852742 - Flags: review?(nfroyd) → review+
Comment on attachment 8852741 [details] [diff] [review]
(v2) Patch Part 2: Fix compiling error after ExpirationTrackerImpl is introduced.

To start the project of "stop Olli from having to spend such a big % of his days doing reviews", may I have your review for this fix in nsSHEEntryShared instead?

Thanks!
Attachment #8852741 - Flags: review?(bugs) → review?(michael)
(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #31)
> To start the project of "stop Olli from having to spend such a big % of his
> days doing reviews", may I have your review for this fix in nsSHEEntryShared
> instead?
BTW, this is to remove the friend class statement instead of making ExpirationTrackerImpl<nsSHEntryShared, 3, ::detail::PlaceholderLock, ::detail::PlaceholderAutoLock> as a friend class which is tedious after offline discussion with :smaug.
Comment on attachment 8852741 [details] [diff] [review]
(v2) Patch Part 2: Fix compiling error after ExpirationTrackerImpl is introduced.

Review of attachment 8852741 [details] [diff] [review]:
-----------------------------------------------------------------

::: docshell/shistory/nsSHEntryShared.h
@@ +41,5 @@
>    static void Shutdown();
>  
>    nsSHEntryShared();
>  
> +  nsExpirationState *GetExpirationState() { return &mExpirationState; }

Usually I would write this definition under the NS_DECL_* macros.
Attachment #8852741 - Flags: review?(michael) → review+
(In reply to Michael Layzell [:mystor] from comment #33)
> ::: docshell/shistory/nsSHEntryShared.h
> @@ +41,5 @@
> >    static void Shutdown();
> >  
> >    nsSHEntryShared();
> >  
> > +  nsExpirationState *GetExpirationState() { return &mExpirationState; }
> 
> Usually I would write this definition under the NS_DECL_* macros.

Will do! ;)
Pushed by btseng@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd56b45db4eb
Part 1: Define ExpirationTrackerImpl as a Thread-Safe Tracker. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f496e855383
Part 2: Fix compiling error after ExpirationTrackerImpl is introduced. r=mystor
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ee3add6393d
Part 3: Refactor SurfaceCache with ExpirationTrackerImpl. r=froydnj,aosmond
Whiteboard: [QDL][BACKLOG][GFX]
Whiteboard: [QDL][BACKLOG][GFX] → [QDL][TDC-MVP][DOM]
Can't build FF anymore. For some reason clang crashes.

1.	/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/nsExpirationTracker.h:120:7 <Spelling=/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/mozilla/Assertions.h:402:71>: current parser token '>'
2.	/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/nsExpirationTracker.h:92:1: parsing struct/union/class body 'ExpirationTrackerImpl'
3.	/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/nsExpirationTracker.h:114:3: parsing function body 'ExpirationTrackerImpl<T, K, Mutex, AutoLock>'
4.	/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/nsExpirationTracker.h:114:3: in compound statement ('{}')
5.	/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/nsExpirationTracker.h:118:23: in compound statement ('{}')
6.	/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/nsExpirationTracker.h:120:7 <Spelling=/home/smaug/mozilla/hg/vsync/ff_build_opt/dist/include/mozilla/Assertions.h:426:6>: in compound statement ('{}')
clang-3.8: error: unable to execute command: Segmentation fault (core dumped)
clang-3.8: error: clang frontend command failed due to signal (use -v to see invocation)
Looks like updating to clang 3.9 might have helped.
(Comment 36 (a clang 3.8 bug) is no longer an issue, as of bug 1353941.)

bevis, perhaps we should uplift these changes [along with followup bug 1353941] to Aurora 54, to get this fix out sooner? That might be easier to do sooner rather than later -- it'll be moving to Beta in a week and a half.
Flags: needinfo?(btseng)
(Er, I guess the clang crash that smaug mentioned was actually from changes on related bug 1345464, not from this bug. So followup bug 1353941 (fixing that clang crash) only needs an uplift if we're also uplifting bug 1345464, I guess.)
(In reply to Daniel Holbert [:dholbert] from comment #40)
> (Er, I guess the clang crash that smaug mentioned was actually from changes
> on related bug 1345464, not from this bug. So followup bug 1353941 (fixing
> that clang crash) only needs an uplift if we're also uplifting bug 1345464,
> I guess.)

Sorry about the unexpected crash in clang and thanks for the quick fix.
Flags: needinfo?(btseng)
I'll do more test on 54 before asking uplifting.
Set NI on me for uplifting.
Flags: needinfo?(btseng)
Approval Request Comment
[Feature/Bug causing the regression]:
Race condition found in bug 1297111 comment 10.
The use of nsExpirationTracker in SurfaceCache is wrong:
1. SurfaceCache itself can be accessed in multi-thread with mutex lock protected. 
2. However, its internally data member SurfaceTracker (an subclass of nsExpirationTracker) is an main-thread-access only class according to the implementation of nsExpirationTracker.
[User impact if declined]: Random crash in SurfaceCache as reported in bug in bug 1297111.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No. 
[List of other uplifts needed for the feature/fix]: No.
[Is the change risky?]: The risk is low.
[Why is the change risky/not risky?]:
We provide a thread-safe implementation of nsExpirationTracker to fix this race condition problem by sharing the same mutex lock that has been acquired by SurfaceCache in original implementation to prevent potential risk of deadlock by introducing a new mutex lock.
[String changes made/needed]: No.

Treeherder result for reference:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0854cadad534dd9cb483e0873c823518b8159e85
Flags: needinfo?(btseng)
Attachment #8856437 - Flags: review+
Attachment #8856437 - Flags: approval-mozilla-aurora?
I think it doesn't apply:
     
----


    Merge merged for commit cd56b45db4eb
    Merge merged for commit 8f496e855383
    Merge failed for commit 3ee3add6393d (parent 8f496e855383)

grafting 390750:3ee3add6393d "Bug 1350177 - Part 3: Refactor SurfaceCache with ExpirationTrackerImpl. r=froydnj,aosmond"
merging image/SurfaceCache.cpp
 warning: conflicts while merging image/SurfaceCache.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
----
Flags: needinfo?(btseng)
sorry, I missed that you created a patch to apply to aurora :/
Flags: needinfo?(btseng)
Comment on attachment 8856437 [details] [diff] [review]
(Aurora 54) Patch: Refactor a thread-safe ExpirationTracker for the use in SurfaceCache. r=froydnj,mystor,aosmond

Fix a potential crash. Aurora54+.
Attachment #8856437 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1367497
You need to log in before you can comment on or make changes to this bug.