Closed Bug 1246828 Opened 4 years ago Closed 3 years ago

Push shutdown back to profile-before-change2

Categories

(Core :: DOM: IndexedDB, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox50 --- fixed

People

(Reporter: khuey, Assigned: janv)

References

Details

(Keywords: dev-doc-needed)

Attachments

(4 files, 1 obsolete file)

Attached patch Patch (obsolete) — Splinter Review
Basically, there are things in profile-before-change that want to do IDB, and if we shut down IDB during profile-before-change then depending on the ordering of the observers things can get very confused.  roc reported a shutdown hang to me over email that was caused by this.

Thankfully somebody added a profile-before-change2, so lets just use that :P
Attachment #8717241 - Flags: review?(jvarga)
This will affect Cache API as well.  I think that's OK, but just mentioning.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #0)
> Created attachment 8717241 [details] [diff] [review]
> Patch
> 
> Basically, there are things in profile-before-change that want to do IDB,
> and if we shut down IDB during profile-before-change then depending on the
> ordering of the observers things can get very confused.  roc reported a
> shutdown hang to me over email that was caused by this.
> 
> Thankfully somebody added a profile-before-change2, so lets just use that :P

Thanks for figuring this out!
I think we should also update documentation at https://developer.mozilla.org/en/docs/Observer_Notifications
Basically profile-before-change2 comes right after profile-before-change, the difference is that profile-before-change2 shouldn't use any storage API that is controlled by quota manager.
Keywords: dev-doc-needed
Comment on attachment 8717241 [details] [diff] [review]
Patch

Review of attachment 8717241 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8717241 - Flags: review?(jvarga) → review+
I'm going to take a look at this.
Flags: needinfo?(khuey)
So, the problem is that profile-before-change2 is already used by telemetry to save data after all other services finished writing to the profile. See http://hg.mozilla.org/mozilla-central/file/d9f50aa0a1aa/toolkit/components/asyncshutdown/AsyncShutdown.jsm#l10

The attached patch added profile-before-change2 to head.js, so xpcshell tests get the additional notification, quota manager is correctly shut down, etc. but there's also a bunch of telemetry async shutdown blockers (triggered by progile-before-change2) which have never been executed in xpcshell tests.
That's fixable, but if I understand it correctly, telemetry wants to get the notification after all other services, including quota manager, so I think we need to introduce another profile-before-change.

profile-before-change - standard services
profile-before-change2 - quota manager
profile-before-change3 - telemetry

A "pre-patch" coming ...
Currently tested on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5fdc6bc51e4b&selectedJob=19022350
It seems try was not clean...
New try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ba90e518c5e
(In reply to Jan Varga [:janv] from comment #7)
> So, the problem is that profile-before-change2 is already used by telemetry
> to save data after all other services finished writing to the profile. See
> http://hg.mozilla.org/mozilla-central/file/d9f50aa0a1aa/toolkit/components/
> asyncshutdown/AsyncShutdown.jsm#l10
> 
> The attached patch added profile-before-change2 to head.js, so xpcshell
> tests get the additional notification, quota manager is correctly shut down,
> etc. but there's also a bunch of telemetry async shutdown blockers
> (triggered by progile-before-change2) which have never been executed in
> xpcshell tests.
> That's fixable, but if I understand it correctly, telemetry wants to get the
> notification after all other services, including quota manager, so I think
> we need to introduce another profile-before-change.
> 
> profile-before-change - standard services
> profile-before-change2 - quota manager
> profile-before-change3 - telemetry

We did a terrible job with naming things when we introduced profile-before-change2; can we call this one something more descriptive?  profile-before-change/quota-manager?  Whatever color we paint the bikeshed, let's not use more numbers.
Flags: needinfo?(jvarga)
(In reply to Nathan Froyd [:froydnj] from comment #9)
> We did a terrible job with naming things when we introduced
> profile-before-change2; can we call this one something more descriptive? 
> profile-before-change/quota-manager?  Whatever color we paint the bikeshed,
> let's not use more numbers.

Ok.
Flags: needinfo?(jvarga)
Blocks: 1246241
(In reply to Jan Varga [:janv] from comment #11)
> There's apparently another problem:
> https://treeherder.mozilla.org/logviewer.html#?job_id=19026887&repo=try

"Assertion failure: gFactoryOps->IsEmpty(), at dom/indexedDB/ActorsParent.cpp:13104"
The assertion is tracked as bug 1228937, noting that it's triggered there under different circumstances but the assertion it's still not a reasonable invariant in that case either.

The problem in a nutshell is:

- FactoryOps:
  - add themselves to gFactoryOps when their state hits DatabaseOpenPending and only remove themselves when they hit State::SendingResults.
  - basically only use the base DatabaseOperationBase::ActorDestroy implementation that sets mActorDestroyed and clears mOperationMayProceed.  The good news is all of the methods invoked by FactoryOp::Run check either IsActorDestroyed() or !OperationMayProceed and bail with an error that will jump the state directly to SendingResults and invoke SendResults().  The bad news is that since ActorDestroy does not invoke their Run() method, the async events they are waiting on need to dispatch them and the event loop has to be spun before they'll die off.

- Factory::ActorDestroy
  - gets invoked in the same turn of the event loop as the FactoryOp PBackgroundIDBFactoryRequests since it, PBackgroundIDBFactory manages them.  So there's zero chance of gFactoryOps emptying itself out.
  - decrements sFactoryInstanceCount and if it his zero asserts gFactoryOps->IsEmpty().  Note that this contradicts the logic in FactoryOp::DirectoryOpen that explicitly checks for !gFactoryOps.

- The child connection is getting closed which results in PBackgroundParent::DestroySubtree happening while there are still outstanding factory ops happening (while IDB is trying to shutdown its connection pool).  I'm assuming this is a result of the test and its IndexedDB references going out of scope and being cleanly shutdown in the child process since XPCShellImpl won't explicitly kill the testshell until _execute_test returns, which it can't do while the QuotaManager's profile-before-change2 observer is spinning the main process main thread event loop.  Since the IDB MOZ_LOG's don't show any other activity in the main process that would keep the sFactoryInstanceCount up otherwise, I'm assuming this is just a generic timing race.  (Which would also explain why it reproduces reliably for me locally on a non-debug build but does not repro on a debug build.)
See Also: → 1228937
Blocks: 1228937
The notifications are renamed in a separate patch.
Assignee: khuey → jvarga
Attachment #8717241 - Attachment is obsolete: true
Attachment #8762497 - Flags: review?(bugmail)
This is Kyle's original patch which I already reviewed.
Attachment #8762500 - Flags: review+
I think I found ever better solution to the |gFactoryOps->IsEmpty()| assertion. Moving destruction of global objects to quota client destructor would fix it too, but this is even cleaner in my opinion.
I reused existing uint64_t and FactoryOp/Database instances are not created/destructed so often, thus the slowdown should be minimal.
Attachment #8762501 - Flags: review?(bugmail)
Attachment #8762497 - Flags: review?(bugmail) → review+
Comment on attachment 8762501 [details] [diff] [review]
Part 3: Modify Factory instance counting to include FactoryOp and Database instances too

Review of attachment 8762501 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/ActorsParent.cpp
@@ +21757,5 @@
>      gLiveDatabaseHashtable->Put(mDatabaseId, info);
>    }
> +
> +  // Balanced in Database::CleanupMetadata().
> +  IncreaseBusyCount();

Restating for my benefit/confirmation: It's my understanding that it's safe to be doing this after the manipulation of gLiveDatabaseHashtable because the control-flow paths into this method are all predicated on !IsActorDestroyed().  (And so the Factory must exist and so gBusyCount > 0.)

(Whereas in the FactoryOp::DirectoryOpen case has the if (!gFactoryOps) guard which is more explicit.)
Attachment #8762501 - Flags: review?(bugmail) → review+
Comment on attachment 8762503 [details] [diff] [review]
Part 4: Rename profile-before-change2 to profile-before-change-qm and profile-before-change3 to profile-before-change-telemetry

Review of attachment 8762503 [details] [diff] [review]:
-----------------------------------------------------------------

Hooray for saner names! :)

When MXR is back up it's probably worth checking whether we should provide a heads-up to any extensions/add-ons about profile-before-change2 changing.
Attachment #8762503 - Flags: review?(bugmail) → review+
(In reply to Andrew Sutherland [:asuth] from comment #18)
> Comment on attachment 8762503 [details] [diff] [review]
> Part 4: Rename profile-before-change2 to profile-before-change-qm and
> profile-before-change3 to profile-before-change-telemetry
> 
> Review of attachment 8762503 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Hooray for saner names! :)
> 
> When MXR is back up it's probably worth checking whether we should provide a
> heads-up to any extensions/add-ons about profile-before-change2 changing.

I was just thinking about it ...
Yeah, I'll post to dev-platform

And thanks for the review!
Attachment #8762501 - Attachment description: Part 3: Modify Factory instance couting to include FactoryOp and Database instances too → Part 3: Modify Factory instance counting to include FactoryOp and Database instances too
Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb195654d695
Part 1: Add an additional notification profile-before-change3 and update telemetry sending code to use it; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5885914883c
Part 2: Main changes, quota manager shutdown pushed back to profile-before-change2; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/1fb4f79f5b6b
Part 3: Modify Factory instance counting to include FactoryOp and Database instances too; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/600d22df678e
Part 4: Rename profile-before-change2 to profile-before-change-qm and profile-before-change3 to profile-before-change-telemetry; r=asuth
can this be uplifted?
Flags: needinfo?(jvarga)
(In reply to Jim Mathies [:jimm] from comment #22)
> can this be uplifted?

aurora and beta already contain these patches
Flags: needinfo?(jvarga)
See Also: → 1315306
You need to log in before you can comment on or make changes to this bug.