Closed Bug 1110487 Opened 10 years ago Closed 9 years ago

implement QuotaManager nsIOfflineStorage interface for Service Worker Cache

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

Details

Attachments

(5 files, 6 obsolete files)

1.70 KB, patch
ehsan.akhgari
: review+
Details | Diff | Splinter Review
56.21 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
10.15 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
15.90 KB, patch
ehsan.akhgari
: review+
Details | Diff | Splinter Review
12.12 KB, patch
ehsan.akhgari
: review+
Details | Diff | Splinter Review
The current implementation of the Service Worker Cache on maple provides a very simple QuotaManager Client.  Before Cache can be enabled by default, we need to implement the nsIOfflineStorage interface.  (Or the new v2 interface if it lands in time.)
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Jan, please review the QuotaManager specific interactions here.  Ehsan, I'm flagging you for review on the restructuring bits of the Cache code.  I'm sorry these aren't better separated out.

Here is a summary of the changes:

First, the Context now provides the ability to get a ThreadsafeHandle.  This holds a strong ref to the Context and can be used to close or invalidate the Context from any thread.  The Context is inherently a background thread object and nsIOfflineStorage operates on the main thread.  The ThreadsafeHandle is the main mechanism for bridging this gap.

When the Context initializes it will create a ThreadsafeHandle.  This establishes a ref-cycle that holds the Context alive.  This is a change from before where the Context would self-destruct after all Action objects completed.  This ref-cycle now holds the Context alive even while idle.  This is safe now that the QM can tell us to shutdown through the invalidate callback.

The ThreadsafeHandle is handed off to a new OfflineStorage object after the Context completes initialization.  In addition, we allow the next QM synchronized op at this point as well.

If all of the DOM objects and their associated actors close the Manager will call Context::AllowToClose().  This breaks the ref-cycle and lets Context self-destruct when idle.

If QM calls Close() or Invalidate() on the OfflineStorage we also break the ref-cycle.  In the Invalidate() case we also cancel all running Actions and call Invalidate() on the Manager.

Once the Manager has been invalidated it will no longer be returned by the Manager to new actors.  New actors will instead get a new Manager and Context.  The new Context, however, will block in WaitUntilOpenAllowed() until the QM finishes clearing the origin.

The invalidated Manager also fails any new requests from the DOM objects.  This is the best threadsafe way I could come up with to avoid accidentally using an old CacheId with a newly created database.

In order to fully support cancelation through the Context I also had to move the StreamList cancel logic from the Manager to the Context.

Finally, there are some small unrelated fixes that I saw.  For example, the QM ID I was using was not created correctly with GetStorageId().  Also, the Manager::CancelForCacheId() did not properly remove items from the mPendingActions list.

To test this I stole the code from the IDB tests to forcefully delete the QM storage.  I call this between each step in the Cache driver.js framework.

Please let me know if there are any questions.  Thanks!
Attachment #8576918 - Flags: review?(ehsan)
Attachment #8576918 - Flags: review?(Jan.Varga)
Comment on attachment 8576906 [details] [diff] [review]
P1 Fix some non-unified bustage before adding new code.

Review of attachment 8576906 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/cache/ActorChild.h
@@ +34,5 @@
>    FeatureNotified() const;
>  
>  protected:
> +  ActorChild();
> +  virtual ~ActorChild();

I can't find anywhere that we delete an ActorChild using an ActorChild*, so this doesn't need to be virtual.
Attachment #8576906 - Flags: review?(ehsan) → review+
(In reply to :Ehsan Akhgari (not reading bugmail, needinfo? me!) from comment #5)
> I can't find anywhere that we delete an ActorChild using an ActorChild*, so
> this doesn't need to be virtual.

Ugh.  This keeps tripping me up.  I was thinking that no virtual means the base class destructor wouldn't get called.  That's not the case, though, of course.  I'll fix.
Comment on attachment 8576918 [details] [diff] [review]
P2 Implement the nsIOfflineStorage interface in Cache.P2 Implement the nsIOfflineStorage interface in Cache.

Review of attachment 8576918 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/cache/Context.cpp
@@ +158,5 @@
>    State mState;
>    nsresult mResult;
>    QuotaInfo mQuotaInfo;
> +  nsMainThreadPtrHandle<OfflineStorage> mOfflineStorage;
> +  bool mNeedsQuotaRelease;

State with bool would probably pack better if they were together and as last items.

@@ +319,5 @@
> +      if (mNeedsQuotaRelease) {
> +        // Unlock the quota dir if we locked it previously
> +        nsCOMPtr<nsIRunnable> runnable = new QuotaReleaseRunnable(mQuotaInfo);
> +        MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
> +          NS_DispatchToMainThread(runnable, nsIThread::DISPATCH_NORMAL)));

you don't need to specify nsIThread::DISPATCH_NORMAL explicitly for NS_DispatchToMainThread

::: dom/cache/Context.h
@@ +118,5 @@
>  
> +  void AddActivity(Activity* aActivity);
> +  void RemoveActivity(Activity* aActivity);
> +
> +  const QuotaInfo

can this return a reference ?
const QuotaInfo&

::: dom/cache/Manager.cpp
@@ +1434,5 @@
> +Manager::Invalidate()
> +{
> +  NS_ASSERT_OWNINGTHREAD(Manager);
> +  // QuotaManager can trigger this more than once.
> +  mValid = false;

maybe replace it with mInvalidated
I think |if (!mInvalid)| currently occurs more often than |if (mInvalid)|
I also think mInvalidated reads better when it's used along with mShuttingDown.

::: dom/cache/OfflineStorage.cpp
@@ +27,5 @@
> +                         const QuotaInfo& aQuotaInfo)
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  nsRefPtr<QuotaManager> qm = QuotaManager::GetOrCreate();

This can be Get()
A raw pointer should be fine.

@@ +56,5 @@
> +  , mQuotaInfo(aQuotaInfo)
> +{
> +  MOZ_ASSERT(mContext);
> +
> +  mPersistenceType = PERSISTENCE_TYPE_DEFAULT;

Move to the initialization list. Actually, I think all 3 could go there.
I don't know why DatabaseOfflineStorage::DatabaseOfflineStorage doesn't have it in the list.
And maybe also add:
MOZ_ASSERT(mClient);
GetOrCreateQuotaClient() should be just GetQuotaClient()

@@ +64,5 @@
> +
> +OfflineStorage::~OfflineStorage()
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  nsRefPtr<QuotaManager> qm = QuotaManager::GetOrCreate();

I don't think we should recreate QM here.
And a raw pointer should be sufficient.

@@ +66,5 @@
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  nsRefPtr<QuotaManager> qm = QuotaManager::GetOrCreate();
> +  if (!qm) {
> +    NS_WARNING("QuotaManager gone before Cache storage could unregister");

Hm, in theory, you could even assert it.

@@ +99,5 @@
> +  //
> +  // As far as I can tell this is used by QuotaManager to shutdown storages
> +  // when a particular process goes away.  We definitely don't want this
> +  // since we are shared.  Also, the Cache actor code already properly
> +  // handles asynchronous actor destruction when the child process dies.

So if the child process crashes unexpectedly, you have code elsewhere that aborts all running or pending operations which belong to the child process ?

Also, just FYI, if this method always return false, Close() won't be ever called.

::: dom/cache/QuotaClient.cpp
@@ +80,5 @@
> +  NS_IMETHOD Run() MOZ_OVERRIDE
> +  {
> +    MOZ_ASSERT(NS_IsMainThread());
> +    MOZ_ASSERT(mExpectedCalls);
> +    mExpectedCalls -= 1;

Hm, |mExpectedCalls--| ?
or maybe:

if (--mExpectedCalls) {
  return NS_OK;
}

mCallback->Run();

@@ +110,5 @@
> +      // cleared in ~CacheQuotaClient()
> +      sInstance = ref.get();
> +    }
> +    return ref.forget();
> +  }

Maybe replace GetOrCreate() with GetInstance()

@@ +213,5 @@
>  
>    virtual bool
>    IsFileServiceUtilized() MOZ_OVERRIDE
>    {
> +    return true;

I believe you don't need this change.

@@ +230,5 @@
> +    MOZ_ASSERT(NS_IsMainThread());
> +    MOZ_ASSERT(!aStorages.IsEmpty());
> +
> +    nsCOMPtr<nsIRunnable> callback =
> +      new StoragesDestroyedRunnable(aStorages.Length(),aCallback);

Nit: space after ','

@@ +235,5 @@
> +
> +    for (uint32_t i = 0; i < aStorages.Length(); ++i) {
> +      MOZ_ASSERT(aStorages[i]->GetClient());
> +      MOZ_ASSERT(aStorages[i]->GetClient()->GetType() == Client::DOMCACHE);
> +      nsRefPtr<OfflineStorage> storage =

Can this be a raw pointer ?

@@ +237,5 @@
> +      MOZ_ASSERT(aStorages[i]->GetClient());
> +      MOZ_ASSERT(aStorages[i]->GetClient()->GetType() == Client::DOMCACHE);
> +      nsRefPtr<OfflineStorage> storage =
> +        static_cast<OfflineStorage*>(aStorages[i]);
> +      storage->SetDestroyCallback(callback);

Hm, I think it would be cleaner to have AddDestroyCallback().
WaitForStoragesToComplete() could be in theory called multiple times for the same offline storage object. Anyway hopefully, nsIOfflineStorage will go away soon.

@@ +274,5 @@
>  namespace mozilla {
>  namespace dom {
>  namespace cache {
>  
> +already_AddRefed<quota::Client> GetOrCreateQuotaClient()

As I commented elsewhere, this change shouldn't be needed. Just add GetQuotaClient() which calls QuotaClient::GetInstance() I think.

::: dom/cache/moz.build
@@ +25,5 @@
>      'FileUtils.h',
>      'IPCUtils.h',
>      'Manager.h',
>      'ManagerId.h',
> +    'OfflineStorage.h',

Hm, why it this exported ?

::: dom/quota/QuotaManager.cpp
@@ +1430,5 @@
>  
>    // Register clients.
>    mClients.AppendElement(idbClient);
>    mClients.AppendElement(asmjscache::CreateClient());
> +  mClients.AppendElement(cache::GetOrCreateQuotaClient());

Hm, do you really need this change ?
You could just add GetQuotaClient() and call it from OfflineStorage constructor.
Also add an assertion for GetQuotaClient(), it shouldn't be null there.
Attachment #8576918 - Flags: review?(Jan.Varga) → review+
(In reply to Jan Varga [:janv] from comment #7)
> > +  nsRefPtr<QuotaManager> qm = QuotaManager::GetOrCreate();
> 
> A raw pointer should be fine.

I've had previous feedback to always use a strong pointer when calling into an object unless there is an explicit reason preventing it.

> @@ +99,5 @@
> > +  //
> > +  // As far as I can tell this is used by QuotaManager to shutdown storages
> > +  // when a particular process goes away.  We definitely don't want this
> > +  // since we are shared.  Also, the Cache actor code already properly
> > +  // handles asynchronous actor destruction when the child process dies.
> 
> So if the child process crashes unexpectedly, you have code elsewhere that
> aborts all running or pending operations which belong to the child process ?

Yes, because I was handling this condition before implementing this bug.

The actors get ActorDestroy() called in that case, which triggers Manager::RemoveListener().  This patch makes RemoveListener() call Context::AllowToClose() when all the actors are gone.  So effectively we handle the same condition here.

> Also, just FYI, if this method always return false, Close() won't be ever
> called.

Yea, I just figured it was safer to implement it anyway in case it got used somewhere else in QM.

> ::: dom/cache/QuotaClient.cpp
> > +    mExpectedCalls -= 1;
> 
> Hm, |mExpectedCalls--| ?

I have a personal dislike about -- and ++ operators outside of the for-loop idiom. :-)  I don't like relying on implied ordering within a single statement, etc.  It seems unnecessarily complex to save an extra line.

> @@ +213,5 @@
> >  
> >    virtual bool
> >    IsFileServiceUtilized() MOZ_OVERRIDE
> >    {
> > +    return true;
> 
> I believe you don't need this change.

Hmm, ok.  Happy to return false here, but can you explain?


> @@ +237,5 @@
> > +      MOZ_ASSERT(aStorages[i]->GetClient());
> > +      MOZ_ASSERT(aStorages[i]->GetClient()->GetType() == Client::DOMCACHE);
> > +      nsRefPtr<OfflineStorage> storage =
> > +        static_cast<OfflineStorage*>(aStorages[i]);
> > +      storage->SetDestroyCallback(callback);
> 
> Hm, I think it would be cleaner to have AddDestroyCallback().
> WaitForStoragesToComplete() could be in theory called multiple times for the
> same offline storage object. Anyway hopefully, nsIOfflineStorage will go
> away soon.

I actually had AddDestroyCallback() and simplified it to Set.  I'll put it back in.

> 
> @@ +274,5 @@
> >  namespace mozilla {
> >  namespace dom {
> >  namespace cache {
> >  
> > +already_AddRefed<quota::Client> GetOrCreateQuotaClient()
> 
> As I commented elsewhere, this change shouldn't be needed. Just add
> GetQuotaClient() which calls QuotaClient::GetInstance() I think.

So my old CreateQuotaClient() call used to always make a new one.  I switched to GetOrCreateQuotaClient() to indicate that its now a singleton, etc.  If all the ref's drop, though, you can get a new/different Client returned.

It seems similar to QuotaManager::GetOrCreate() naming. :-).

I could add a separate GetQuotaClient(), but I still need the sInstance member, etc.  It seems easier to just do this in one method.  I don't see any real benefit to having a fallible Get here.

> ::: dom/cache/moz.build
> @@ +25,5 @@
> >      'FileUtils.h',
> >      'IPCUtils.h',
> >      'Manager.h',
> >      'ManagerId.h',
> > +    'OfflineStorage.h',
> 
> Hm, why it this exported ?

I export all my headers for simplicity.  Changing Cache to only export a few headers is on my todo list, but low priority.
(In reply to Ben Kelly [:bkelly] from comment #8)
> (In reply to Jan Varga [:janv] from comment #7)
> > > +  nsRefPtr<QuotaManager> qm = QuotaManager::GetOrCreate();
> > 
> > A raw pointer should be fine.
> 
> I've had previous feedback to always use a strong pointer when calling into
> an object unless there is an explicit reason preventing it.
> 

I think QM is special enough, we handle it the same way in IDB and ASMJSCACHE.

> > @@ +213,5 @@
> > >  
> > >    virtual bool
> > >    IsFileServiceUtilized() MOZ_OVERRIDE
> > >    {
> > > +    return true;
> > 
> > I believe you don't need this change.
> 
> Hmm, ok.  Happy to return false here, but can you explain?

This is intended for clients that support FileHandle API.

> > @@ +274,5 @@
> > >  namespace mozilla {
> > >  namespace dom {
> > >  namespace cache {
> > >  
> > > +already_AddRefed<quota::Client> GetOrCreateQuotaClient()
> > 
> > As I commented elsewhere, this change shouldn't be needed. Just add
> > GetQuotaClient() which calls QuotaClient::GetInstance() I think.
> 
> So my old CreateQuotaClient() call used to always make a new one.  I
> switched to GetOrCreateQuotaClient() to indicate that its now a singleton,
> etc.  If all the ref's drop, though, you can get a new/different Client
> returned.
> 
> It seems similar to QuotaManager::GetOrCreate() naming. :-).
> 
> I could add a separate GetQuotaClient(), but I still need the sInstance
> member, etc.  It seems easier to just do this in one method.  I don't see
> any real benefit to having a fallible Get here.
> 

CreateQuotaClient() should be called only once by QM. You shouldn't need to recreate your quota client.
This way it's more visible who primarily owns the quota client object.
I would rather be consistent with other quota client implementations, QuotaManager::GetOrCreate is a different beast :)
Actually, GetQuotaClient() might not be needed if you are able to call QuotaClient::GetInstance() directly.
(In reply to Jan Varga [:janv] from comment #9)
> CreateQuotaClient() should be called only once by QM. You shouldn't need to
> recreate your quota client.
> This way it's more visible who primarily owns the quota client object.
> I would rather be consistent with other quota client implementations,
> QuotaManager::GetOrCreate is a different beast :)
> Actually, GetQuotaClient() might not be needed if you are able to call
> QuotaClient::GetInstance() directly.

How about I go back to just CreateQuotaClient() and then have my OfflineStorage call QuotaManager:GetClient(DOMCACHE)?
Comment on attachment 8576918 [details] [diff] [review]
P2 Implement the nsIOfflineStorage interface in Cache.P2 Implement the nsIOfflineStorage interface in Cache.

Review of attachment 8576918 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/cache/Context.cpp
@@ +562,5 @@
> +  // all Contexts have been destroyed.
> +  nsCOMPtr<nsIRunnable> runnable =
> +    NS_NewNonOwningRunnableMethod(mStrongRef.forget().take(), &Context::Release);
> +  MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
> +    mOwningThread->Dispatch(runnable, nsIThread::DISPATCH_NORMAL)));

Couldn't you use NS_ProxyRelease here?

@@ +691,3 @@
>      }
>    }
> +  mPendingActions = Move(remainingPendingActions);

You could do this in a more straightforward way by iterating backwards and just removing the matching PendingActions.

::: dom/cache/Context.h
@@ +30,5 @@
>  // The Context class is RAII-style class for managing IO operations within the
>  // Cache.
>  //
>  // When a Context is created it performs the complicated steps necessary to
>  // initialize the QuotaManager.  Action objects dispatched on the Context are

I think this paragraph requires a bit of adjustment to reflect how the lifetime of Context works after this patch.

@@ +57,5 @@
> +    void AllowToClose();
> +    void InvalidateAndAllowToClose();
> +  private:
> +    explicit ThreadsafeHandle(Context* aContext);
> +    ~ThreadsafeHandle();

It's best to declare a =delete private copy ctor and assignment operator to make sure the object doesn't accidentally get copied.

@@ +79,5 @@
> +    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(cache::Context::ThreadsafeHandle)
> +  };
> +
> +  // Must call RemoveActivity() in its destructor
> +  class Activity

Please add a short comment describing what this class is.

::: dom/cache/Manager.cpp
@@ +1405,2 @@
>  Manager::RemoveListener(Listener* aListener)
>  {

Should you check if the manager is valid?  (I think this is one, just double checking.)

@@ +1416,5 @@
>  }
>  
>  void
>  Manager::RemoveContext(Context* aContext)
>  {

Should you check mValid here?  Or at least assert it?

::: dom/cache/test/mochitest/driver.js
@@ +26,5 @@
>    }
>  
> +  function clearStorage() {
> +    return new Promise(function(resolve, reject) {
> +      var principal = SpecialPowers.wrap(document).nodePrincipal;

Please add a comment saying where this was copied from.

@@ +98,2 @@
>        .then(runFrameTest)
> +      .then(clearStorage)

You may need to rebase this on top of my changes...  Also for completeness, please modify the new "parallel" code path to run clearStorage once the Promise.all gets resolved.  That should take care of doing the right thing for "both" as well.
Attachment #8576918 - Flags: review?(ehsan) → review+
(In reply to :Ehsan Akhgari (not reading bugmail, needinfo? me!) from comment #11)
> ::: dom/cache/Context.cpp
> @@ +562,5 @@
> > +  // all Contexts have been destroyed.
> > +  nsCOMPtr<nsIRunnable> runnable =
> > +    NS_NewNonOwningRunnableMethod(mStrongRef.forget().take(), &Context::Release);
> > +  MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
> > +    mOwningThread->Dispatch(runnable, nsIThread::DISPATCH_NORMAL)));
> 
> Couldn't you use NS_ProxyRelease here?

NS_ProxyRelease requires an nsISupports.  I just have an AddRef/Release here.

> ::: dom/cache/Manager.cpp
> @@ +1405,2 @@
> >  Manager::RemoveListener(Listener* aListener)
> >  {
> Should you check if the manager is valid?  (I think this is one, just double
> checking.)

No.  The Manager can be either valid or invalid when RemoveListener() is closed.  One of the reasons we use invalid marking on Manager is because there may still be a Listener holding the Manager open.

> >  void
> >  Manager::RemoveContext(Context* aContext)
> >  {
> 
> Should you check mValid here?  Or at least assert it?

No.  The Context can close without the Manager becoming invalid if Manager::RemoveListener() calls Context::AllowToClose().

I'll try to clarify the two ways a Context can be destroyed in comments.
I will address comments and rebase on Monday after Ehsan's patches land in central.
Oops.  I snipped the wrong quote to reply to.

> @@ +56,5 @@
> > +  , mQuotaInfo(aQuotaInfo)
> > +{
> > +  MOZ_ASSERT(mContext);
> > +
> > +  mPersistenceType = PERSISTENCE_TYPE_DEFAULT;
> 
> Move to the initialization list. Actually, I think all 3 could go there.
> I don't know why DatabaseOfflineStorage::DatabaseOfflineStorage doesn't have
> it in the list.

This is not legal in C++.  You have to define a constructor on the base class that the derived class invokes in its initializer list.
Jan,

Please see this failure in mozilla-inbound:

  https://treeherder.mozilla.org/logviewer.html#?job_id=7657794&repo=mozilla-inbound

This suggests to me that QuotaManager is getting deleted out from under my OfflineStorage.  Is this possible?  Do I need to hold a ref to the QuotaManager in my OfflineStorage class?

Note, I now AllowNextSynchronizedOp() immediately after initialization.  The Context and Actions then can create mozStorageConnections using the quota file URL handler.  The crash is happening when that connection is cleaned up.  The OfflineStorage is guaranteed to survive past this point.
Flags: needinfo?(Jan.Varga)
Backed out part 2 for the crashes in comment 20.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef3127776675
Whiteboard: [leave open]
(In reply to Ben Kelly [:bkelly] from comment #20)
> Jan,
> 
> Please see this failure in mozilla-inbound:
> 
>  
> https://treeherder.mozilla.org/logviewer.html#?job_id=7657794&repo=mozilla-
> inbound
> 
> This suggests to me that QuotaManager is getting deleted out from under my
> OfflineStorage.  Is this possible?  Do I need to hold a ref to the
> QuotaManager in my OfflineStorage class?

You mean QuotaObject (not QuotaManager), right ?

> 
> Note, I now AllowNextSynchronizedOp() immediately after initialization.  The
> Context and Actions then can create mozStorageConnections using the quota
> file URL handler.  The crash is happening when that connection is cleaned
> up.  The OfflineStorage is guaranteed to survive past this point.

We've seen similar stack traces when quota manager deleted an origin but IDB still had open mozStorageConnections.
Are you 100% sure mozStorageConnections are closed before you call QuotaManager::UnregisterStorage ?
Flags: needinfo?(Jan.Varga)
(In reply to Jan Varga [:janv] from comment #22)
> You mean QuotaObject (not QuotaManager), right ?

Well, I thought maybe the QuotaObject and QuotaManager were related.

> We've seen similar stack traces when quota manager deleted an origin but IDB
> still had open mozStorageConnections.
> Are you 100% sure mozStorageConnections are closed before you call
> QuotaManager::UnregisterStorage ?

I *think* this is guaranteed through these mechanisms:

1) UnregsterStorage occurs when ~OfflineStorage() runs.
2) Context holds a strong ref to OfflineStorage
3) ActionRunnable holds a strong ref to Context
4) ActionRunnable holds a strong ref to DBAction
5) DBAction is destroying the connection when DBAction::RunOnTarget() completes.
6) ActionRunnable is held alive by the IO thread while RunOnTarget() is executed.

I'll try to verify this is working as expected.
This fix is looking good in try so far.  I would like to do a few more pushes before landing, but test infrastructure is having problems.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e1ba8547097a
Sadly, no.  Still a problem.
This is looking good now after fixing the ReadStream UAF bug.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a51de8b6bd26
Actually, still a problem when I trigger e10s M1 with parallel tests in place.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f43e910e08ae

I'll investigate.
I think I may see the problem.

In DBAction we create an nsCOMPtr<mozIStorageConnection> on the stack.  The SyncDBAction then calls Resolve() with that connection smart pointer still on the stack.  At that point its a race to see if the underlying Action is destroyed by the resolve process before the stack is unwound on the IO thread.

I may need a kungFuGrip here, but I'm going to see if there is a better way.
Actually, a kungFuGrip will not help.  Even if I hold the Action alive on the target thread, resolving the ActionRunnable will still let the Context be destroyed before the stack connection ref is dropped.  Also, the Action must be destroyed on the owning thread.  A self-ref on the stack would break that guarantee.
Overall I think its just incredibly dangerous to all an Action to "complete" while we're still synchronously in its RunOnTarget() method.  The Action can be deleted out from under the method, the Context can shutdown allowing QuotaManager to clean up, etc.

Lets just kill this entire class of bugs by having Context bounce through the current thread again before completing on the initiating thread.  This is the easiest and safest way to ensure the current method has returned and still allow Resolve() to be used asynchronously.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6dc67c3a4ff2
Blocks: 1142852
Comment on attachment 8581311 [details] [diff] [review]
Cache should ensure Actions are finished before completing. r=ehsan

Review of attachment 8581311 [details] [diff] [review]:
-----------------------------------------------------------------

Please see comment 31 for the explanation.  Short story is the Context can be destroyed while an Action still has its RunOnTarget() on the stack.
Attachment #8581311 - Flags: review?(ehsan)
I caught a try build where an Action was canceled twice.  See the windows M1 failure here:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=386deeb5ee7c

After thinking about it, though, double cancelation is totally legitimate.  These things can all trigger cancels and are independent of one another:

  1) Browser shutdown
  2) QM origin invalidation
  3) Individual Cache Actions gets canceled because the Cache was deleted and its actors closed.

We should just loosen the assertion here to match reality.

I'm including this here because the addition of QM invalidation makes it more likely to trigger now.
Attachment #8581354 - Flags: review?(ehsan)
Comment on attachment 8581311 [details] [diff] [review]
Cache should ensure Actions are finished before completing. r=ehsan

Review of attachment 8581311 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/cache/Context.cpp
@@ +120,5 @@
> +
> +    // Unfortunately we cannot dispatch to the QM IO thread from anywhere
> +    // except the main thread.  So bounce through the main thread and then
> +    // back through the IO thread to ensure all methods have returned
> +    // before completing.

This is unfortunate, especially if the main thread is busy.  Can you please file a follow-up bug to see if we can lift this restriction?

@@ +351,5 @@
> +      // initialization process.
> +      mState = STATE_COMPLETING;
> +      MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
> +        mInitiatingThread->Dispatch(this, nsIThread::DISPATCH_NORMAL)));
> +      break;

Wouldn't it be cleaner to rewrite this as:

if (qm && !NS_IsMainThread()) {
  // dispatch to the IO thread
} else {
  mState = STATE_COMPLETING;
  // dispatch to initiating trhead
}
Attachment #8581311 - Flags: review?(ehsan) → review+
Attachment #8581354 - Flags: review?(ehsan) → review+
(In reply to :Ehsan Akhgari (not reading bugmail, needinfo? me!) from comment #34)
> Comment on attachment 8581311 [details] [diff] [review]
> Cache should ensure Actions are finished before completing. r=ehsan
> 
> Review of attachment 8581311 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/cache/Context.cpp
> @@ +120,5 @@
> > +
> > +    // Unfortunately we cannot dispatch to the QM IO thread from anywhere
> > +    // except the main thread.  So bounce through the main thread and then
> > +    // back through the IO thread to ensure all methods have returned
> > +    // before completing.
> 
> This is unfortunate, especially if the main thread is busy.  Can you please
> file a follow-up bug to see if we can lift this restriction?

I'll file a bug, but I have a feeling it might be by design.  The LazyIdlThread probably needs to update some meta data in its Dispatch() to determine if its gone idle.

Also, I have an idea to optimize the synchronous Action case.  I'm going to send you an interdiff for this.
I could also move the SetupAction to the main target thread instead of the QM IOThread.  That would avoid needing to redispatch to the LazyIdleThread here.
(In reply to Ben Kelly [:bkelly] from comment #35)
> I'll file a bug, but I have a feeling it might be by design.  The
> LazyIdlThread probably needs to update some meta data in its Dispatch() to
> determine if its gone idle.

Yeah, this is a limitation of LazyIdleThread.
Attachment #8576906 - Flags: checkin+
Rebased for final/override changes in central.  Also folded interdiffs.
Attachment #8576906 - Attachment is obsolete: true
Attachment #8576918 - Attachment is obsolete: true
Attachment #8577837 - Attachment is obsolete: true
Attachment #8577846 - Attachment is obsolete: true
Attachment #8578641 - Attachment is obsolete: true
Attachment #8581678 - Flags: review+
Rebased for final/override.
Attachment #8581311 - Attachment is obsolete: true
Attachment #8581679 - Flags: review+
Another try to make sure I didn't break things with my rebase:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb760d9b67f1
(In reply to Jan Varga [:janv] from comment #37)
> (In reply to Ben Kelly [:bkelly] from comment #35)
> > I'll file a bug, but I have a feeling it might be by design.  The
> > LazyIdlThread probably needs to update some meta data in its Dispatch() to
> > determine if its gone idle.
> 
> Yeah, this is a limitation of LazyIdleThread.

Actually, a nice side effect here is that it means the SetupAction *must* be synchronous.  It can't dispatch any async Runnables to the IO thread anyway.  So I'll just make this a requirement on the setup action and avoid the resolve thread bounce completely in QuotaInitRunnable.
Whiteboard: [leave open]
Optimize synchronous Resolve() calls.  This avoids the thread bounce completely in the QuotaInitRunnable case and most Actions.  Right now the only async Action is CachePutAllAction which will trigger this extra thread bounce.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=64503c587e9e
Attachment #8581720 - Flags: review?(ehsan)
Comment on attachment 8581720 [details] [diff] [review]
P3 interdiff 001 optimize synchronous resolve

Review of attachment 8581720 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!
Attachment #8581720 - Flags: review?(ehsan) → review+
Yea, the last interdiff was not completely threadsafe in QuotaInitRunnable.  I was using AutoRestore with mExecutingRun, but we dispatch to other threads which re-enter Run().  So we could have multiple AutoRestore values on the stack at the same time.  Badness.

This patch solves all that by just making the Resolver a separate object that is created on each entry into Run().  I would have liked to make it a stack object, but the Resolver must be ref'able to satisfy other use cases.

While doing this I fixed Action::Resolver not to implement nsISupports just to get AddRef/Release.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4a2cbb0a1522
Attachment #8581845 - Flags: review?(ehsan)
Comment on attachment 8581845 [details] [diff] [review]
P3 interdiff 002 fix QuotaInitRunnable to handle sync resolve with threadsafety

Review of attachment 8581845 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/cache/Context.cpp
@@ +128,5 @@
>  
> +    bool mResolved;
> +    nsresult mResult;
> +
> +    NS_INLINE_DECL_REFCOUNTING(Context::QuotaInitRunnable::SyncResolver)

Nit: please pass override as the second argument of NS_INLINE_DECL_REFCOUNTING.
Attachment #8581845 - Flags: review?(ehsan) → review+
https://hg.mozilla.org/mozilla-central/rev/a13277e46eb9
https://hg.mozilla.org/mozilla-central/rev/1a468067441b
https://hg.mozilla.org/mozilla-central/rev/fa8fa3aa9ea6
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: