Closed Bug 1338280 Opened 7 years ago Closed 7 years ago

Make Places restartable in tests

Categories: Toolkit :: Places, defect
Status: RESOLVED WONTFIX
Reporter: lina; Unassigned
Attachments: 4 files

It's currently possible to shut down Places by sending `profile-change-teardown` and `test-simulate-places-shutdown` observer notifications to `PlacesUtils.history`, then waiting for `places-connection-closed`. Once Places is shut down, though, it's not possible to start it back up again. This makes it impossible for test code to replace the database after initialization. Migration does something like this, but we need to call `setupPlacesDatabase` before accessing `PlacesUtils.history` for the first time. Once the connection is opened, it can't be closed and reopened.

For context, we're currently working on a bookmark repair patch for Sync in bug 1317223. Our integration test tries to simulate two different clients: one with a complete database, the other with items missing. We want to verify that the complete database can supply the missing records. I was thinking we could do this by:

1. Initializing the complete database, then shutting down Places.
2. Copying places.sqlite to a backup file.
3. Starting Places back up, deleting some bookmarks, and verifying that Sync can request the deleted records.
4. Shutting down Places again, and renaming the backup file we created in step 2 back to places.sqlite.
5. Starting Places back up, having Sync pretend it's another client, and verifying it uploads the missing records.
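Setting aside the Places API itself, the file dance in steps 1–5 boils down to swapping SQLite files between "sessions". A minimal simulation in plain Python (all paths and table names are hypothetical; `sqlite3.connect`/`close` stand in for starting and shutting down Places):

```python
import os
import shutil
import sqlite3
import tempfile

def run():
    workdir = tempfile.mkdtemp()
    db = os.path.join(workdir, "places.sqlite")
    backup = os.path.join(workdir, "places.sqlite.bak")

    # Step 1: build the "complete" database, then close the connection
    # (stands in for shutting Places down).
    conn = sqlite3.connect(db)
    conn.execute("CREATE TABLE bookmarks (guid TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO bookmarks VALUES (?)",
                     [("a",), ("b",), ("c",)])
    conn.commit()
    conn.close()

    # Step 2: copy places.sqlite to a backup file.
    shutil.copyfile(db, backup)

    # Step 3: "restart" and delete some bookmarks.
    conn = sqlite3.connect(db)
    conn.execute("DELETE FROM bookmarks WHERE guid = 'b'")
    conn.commit()
    conn.close()

    # Step 4: shut down again; rename the backup back to places.sqlite.
    os.replace(backup, db)

    # Step 5: the restored database has the full record set again.
    conn = sqlite3.connect(db)
    guids = sorted(g for (g,) in conn.execute("SELECT guid FROM bookmarks"))
    conn.close()
    return guids

print(run())
```

The hard part in the real bug is not the file copy but steps 1 and 4: actually getting Places to release and reopen the file, which is what the rest of this thread is about.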

Attached is a WIP patch that tries to make `promiseDBConnection` and `withConnectionWrapper` safe for re-initialization. Unfortunately, this doesn't quite work: the underlying `mozIStorageAsyncConnection` is still closed (even though we null out `Database::gDatabase` in http://searchfox.org/mozilla-central/rev/afcf40f3eafd895611a839017730debb58a342a6/toolkit/components/places/Shutdown.cpp#184-188), so trying to call `wrapStorageConnection` throws an exception. I'm guessing this is because `History` and `nsNavBookmarks` still hold references to the DB?

But, the broader question is: does it even make sense to do this? Or should we abandon trying to do this, and figure out a different way to write our integration test?
(In reply to Kit Cambridge [:kitcambridge] from comment #0)
> This makes it impossible for test code
> to replace the database after initialization.

What I've done so far in those cases was to attach a pre-generated empty places.sqlite, make the modifications through a normal storage connection, then start up Places.
Shipping an empty, up-to-date db in the tests folder wouldn't be a big issue.
In your case you'd need two pre-generated files.
I agree it doesn't sound great, mostly because every time we bump the schema you may have to update those files.

An alternative is to use Marionette. It can do full startup and shutdown cycles, so it may be a good way to simulate the real thing. Unfortunately I don't know how hard it would be to make it friendly enough to build a sub-harness around it.

> Attached is a WIP patch that tries to make `promiseDBConnection` and
> `withConnectionWrapper` safe for re-initialization. Unfortunately, this
> doesn't quite work: the underlying `mozIStorageAsyncConnection` is still
> closed

nsNavHistory, nsNavBookmarks, and History (maybe others?) all cache the connection in mDB on init, and in the first case the only thing that causes init to run is creating a new nsNavHistory object.
You'd basically have to remove the caching and use Database::GetDatabase() everywhere. While the overhead is *probably* acceptable (TBV), the current code is also built that way to prevent late consumers during shutdown from wrongly resurrecting a dead Places, causing leaks or crashes. Making it possible for anyone to resurrect the connection could open a can of worms.
That special mode should be activated specifically for the test through a pref or such (so GetDatabase would fail after the first shutdown, but resurrect the connection when specifically required by the test). The pref should likely be ignored in release, so we don't risk shipping it to a large audience.
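The "fail after the first shutdown unless the test opted in" policy could look roughly like this. A hedged sketch, not the Gecko code; `allow_restarts` stands in for the hypothetical test-only pref, and the class stands in for the Database singleton:

```python
class PlacesDatabase:
    """Stand-in for the Places Database singleton; not the real Gecko class."""
    _instance = None
    _was_shut_down = False
    allow_restarts = False  # stands in for the hypothetical test-only pref

    @classmethod
    def get_database(cls):
        # Mirrors the Database::GetDatabase() idea: hand out the singleton,
        # but refuse to resurrect it after shutdown unless tests opted in.
        if cls._instance is None:
            if cls._was_shut_down and not cls.allow_restarts:
                raise RuntimeError("Places is shut down; refusing to resurrect")
            cls._instance = cls()
        return cls._instance

    @classmethod
    def shutdown(cls):
        cls._instance = None
        cls._was_shut_down = True
```

In release builds the flag would simply never be settable, so the refusal branch is the only behavior consumers can observe after shutdown.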

> But, the broader question is: does it even make sense to do this? Or should
> we abandon trying to do this, and figure out a different way to write our
> integration test?

The best path forward would be to have a harness that can do restarts. Marionette is currently the only thing we have that can do that; whether it's handy to use in your case is TBD.
What you're doing may be a short-term alternative, but it's not as trivial as it may sound, and it should be limited pretty much to specific tests, because we can't risk allowing an add-on or other broken internal code to bypass shutdown.
Thanks for the guidance, mak, that's very useful. I don't think Marionette will work for us. We want two separate databases running side by side, sharing some state and a fake Sync server. Coordinating calls between two separate instances of Firefox is going to be tricky with Marionette, and it doesn't seem like we have a harness that meets our needs.

Option two is making Places restartable for tests only. I have a patch stack that implements just enough to make this work for bug 1317223, but I'm not sure we'd ever want to land it. Posting them anyway to get your feedback on the general approach, but there's no hurry. This isn't a blocker.

Here are the issues I ran into:

1. `Database::GetDatabase()` is called from both the main thread and the async threads, and calls a static method of `PlacesShutdownBlocker` to see if Places is shutting down. So we can't check prefs, and we can only use static members.

2. The static method in (1) is why I added the "generation number" kludge: "are we currently shutting down?" is more difficult to answer when we have multiple databases. The counters are in lockstep until we start blocking shutdown. At that point, we increment the shutdown blocker counter, but not the database counter. Since the database counter is one less than the shutdown counter, `PlacesShutdownBlocker::IsStarted` now returns true. Once shutdown is complete and the database's destructor is called, we bump the database's counter.

3. For the destructor to fire, the services can't hold strong refs to the database. I changed them to access the database through a getter that checks if the pref is set. This was a find-and-replace change, but it's a big patch because there are a lot of calls that had to be updated. As you say, we might be able to get away with calling `Database::GetDatabase()` every time, but a nice side effect of the strong refs is that the destructor never fires, making it less likely we'll try to resurrect the database on shutdown. Another idea to guard against that is to only bump the generation counter in the destructor if the pref is set.

4. The wrapped connections in `PlacesUtils` are tricky, since they depend on the client shutdown blocker. AsyncShutdown (correctly, IMO) doesn't offer a way to "re-arm" a blocker; once it's tripped, the phase is complete. We want that for real shutdown, but not if we're restarting the database multiple times. My kludge is to add a "tag" that uniquely identifies each DB, and pass it as the data to the shutdown observer. The phase is then based on the combination of the topic and the tag. There's probably a better way to do this; I admit I don't fully understand the "spinner", "blocker", "phase", "barrier", and "client" concepts that AsyncShutdown implements. This was just the fastest way to get this working.

5. There are bits of state that need to be handled specially depending on whether we're restarting or shutting down. The GUID cache needs to be invalidated, because the database might come back up with a different set of bookmarks. (Bug 1317223 specifically needs this). I think the livemark and keyword caches will need the same treatment, and anything that cleans up on shutdown via `PlacesUtils.registerShutdownFunction`. OTOH, we don't want to set `mCanNotify = false` or remove (some?) observers, because then we'll break the observer mechanism on restart.
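The "generation number" scheme from (2), in miniature. All names are invented; the point is just that the two counters agree while a database is alive, and "shutting down" is signalled by the blocker counter running one ahead:

```python
# Toy model of the generation-counter kludge; only static-style state,
# since the real check runs from a static method with no pref access.
db_generation = 0
blocker_generation = 0

def is_started():
    # PlacesShutdownBlocker::IsStarted() analogue: shutdown is in progress
    # while the blocker counter runs ahead of the database counter.
    return blocker_generation > db_generation

def block_shutdown():
    # Blocker starts blocking: bump only the shutdown blocker counter.
    global blocker_generation
    blocker_generation += 1

def database_destroyed():
    # Database destructor: catch the db counter up, so a freshly created
    # database starts with the counters back in lockstep.
    global db_generation
    db_generation = blocker_generation
```

A restart is then just `block_shutdown()` followed by `database_destroyed()` and creating a new database, after which `is_started()` is false again.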
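The tag idea in (4) amounts to keying each shutdown phase on the pair (topic, tag) rather than the topic alone, so every database instance gets its own never-tripped phase. A toy model, not the AsyncShutdown API; all names are invented:

```python
class TaggedBarrier:
    """Toy phase tracker keyed on (topic, tag) instead of topic alone."""

    def __init__(self):
        self._completed = set()  # phases that have already tripped

    def wait(self, topic, tag, on_blocked):
        # A phase keyed only by `topic` could never re-arm for a second
        # database instance; the tag makes each instance's phase distinct.
        phase = (topic, tag)
        if phase not in self._completed:
            on_blocked()
            self._completed.add(phase)

barrier = TaggedBarrier()
ran = []
barrier.wait("places-teardown", "db-1", lambda: ran.append("db-1"))
barrier.wait("places-teardown", "db-1", lambda: ran.append("dup"))   # already tripped
barrier.wait("places-teardown", "db-2", lambda: ran.append("db-2"))  # fresh phase
```

The second wait on `("places-teardown", "db-1")` is a no-op, while `"db-2"` still gets its blocker callback, which is exactly the re-arming behavior real AsyncShutdown phases (correctly) refuse to provide.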
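And the split in (5) between restart and real shutdown, sketched with hypothetical names; `guid_cache` stands in for the GUID/livemark/keyword caches:

```python
class PlacesServiceState:
    """Hypothetical slice of per-service state touched at teardown."""

    def __init__(self):
        self.guid_cache = {"guidA": 1}  # stands in for GUID/livemark/keyword caches
        self.observers = ["some-observer"]
        self.can_notify = True          # stands in for mCanNotify

    def teardown(self, restarting):
        # Caches go either way: the database may come back up with a
        # different set of bookmarks, so cached GUIDs would be stale.
        self.guid_cache.clear()
        if not restarting:
            # Only a real shutdown silences notifications and drops
            # observers; a restart must keep the observer machinery alive.
            self.can_notify = False
            self.observers.clear()
```

Anything registered via `PlacesUtils.registerShutdownFunction` would need the same restarting/shutting-down distinction.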

So...

* Is all this worth it just for one (complicated) Sync test?

* There are lots of assumptions everywhere that the database is a singleton that exists for the lifetime of the process. This patch is complex because it tries to subvert that, and it's not even complete. Is it worth rethinking how Places is initialized? Or should we really not bother?

* It may be sensible to mock the storage backend in the Sync test, instead of making Places restartable. But it's not clear what a real repair strategy with fake storage tells us. A lot of Sync unit tests use mocks, to the point where they fail to catch obvious breakage because they're not using the real code path. Ideally, our integration test would run against a real database, using as few mocks as possible. Maybe that's not realistic, though.

Anyway, this is turning into a screed, so I'll stop there. :-) When you have a chance to digest this, I'd love your feedback. Thanks!
Flags: needinfo?(mak77)
Blocks: 1343093
Sorry, between my projects and the many other requests, I'm struggling to find time to look into this.
How much do you care about it? What's its priority?
Flags: needinfo?(mak77)
didn't mean to clear the ni.
Flags: needinfo?(mak77)
Sorry, I forgot to circle back on this. We used a different approach for the integration test, so I don't think it's worth doing for the reasons in comment 7.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(mak77)
Resolution: --- → WONTFIX