Can we avoid places main thread IO before the first window is shown?

NEW
Unassigned

Status

()

Firefox
General
6 months ago
18 days ago

People

(Reporter: florian, Unassigned)

Tracking

(Blocks: 1 bug, {perf})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:investigate])

(Reporter)

Description

6 months ago
See this startup profile on the quantum reference hardware: https://perfht.ml/2p1y7ce

During early startup, we spend 100ms calling mozilla::storage::Service::OpenUnsharedDatabase from mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from mozilla::places::Database::InitDatabaseFile.

We also spend 56ms in mozilla::storage::Connection::Close() called from mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to close this database (which flushes to disk) during startup?

If making all of this IO async is difficult (as bug 975979 seems to indicate), could we delay some of this to after the first window is shown?

Comment 1

6 months ago
Making init async is impossible, atm. We didn't even finish converting all the APIs to be async.

Afaict, this is not a bug in Places, it's a bug in its consumer, the browser.

Delaying means delaying components using the bookmarks or history services, that includes things like migrating data from other browsers (included automigration) that we currently do before showing the first window, on purpose, and importing our own default bookmarks. And then the docshell will init history as soon as it needs to use global history but I suppose at this point we have a window.
It is non-trivial, probably one could take the stacks initing the services one by one, and see what is needed to delay them. I can't exclude that some of those stacks are easily fixable, especially if the only requirement is to wait for the first window.

(In reply to Florian Quèze [:florian] [:flo] from comment #0)
> See this startup profile on the quantum reference hardware:
> https://perfht.ml/2p1y7ce
> 
> During early startup, we spend 100ms calling
> mozilla::storage::Service::OpenUnsharedDatabase from
> mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from
> mozilla::places::Database::InitDatabaseFile.

This is measured on a new profile, where we must create the schema yet, right?
On an existing profile it's taking far less, since it just has to open the db and check the schema version.

> We also spend 56ms in mozilla::storage::Connection::Close() called from
> mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to
> close this database (which flushes to disk) during startup?

We could potentially delay this, the reason to close the connection is just that we don't need it anymore after the db has been created. Again, this happens only on a new profile.
Component: Places → General
Product: Toolkit → Firefox
(Reporter)

Comment 2

6 months ago
(In reply to Marco Bonardo [::mak] from comment #1)

> > During early startup, we spend 100ms calling
> > mozilla::storage::Service::OpenUnsharedDatabase from
> > mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from
> > mozilla::places::Database::InitDatabaseFile.
> 
> This is measured on a new profile, where we must create the schema yet,
> right?
> On an existing profile it's taking far less, since it just has to open the
> db and check the schema version.

Unfortunately, no. I haven't figured out yet how to run the profiler on a new profile (because I need... well, the profiler to be installed ;-)).

I just saw another startup profile this morning with this, so it seems to happen at each startup.

> > We also spend 56ms in mozilla::storage::Connection::Close() called from
> > mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to
> > close this database (which flushes to disk) during startup?
> 
> We could potentially delay this, the reason to close the connection is just
> that we don't need it anymore after the db has been created. Again, this
> happens only on a new profile.

Is there any way to close asynchronously a database that has been opened synchronously? If not, could we at least run this from an idle callback?

Comment 3

6 months ago
(In reply to Florian Quèze [:florian] [:flo] from comment #2)
> > This is measured on a new profile, where we must create the schema yet,
> > right?
> > On an existing profile it's taking far less, since it just has to open the
> > db and check the schema version.
> 
> Unfortunately, no. I haven't figured out yet how to run the profiler on a
> new profile (because I need... well, the profiler to be installed ;-)).
> 
> I just saw another startup profile this morning with this, so it seems to
> happen at each startup.

After looking at the code, you are right, we indeed must open the database to check that it's not corrupt.
We could maybe avoid this and rely on ATTACH DATABASE failures to detect a problem.
Looks like we will also need to try to remove the right db on case of corruption, so that needs some additional work but I think it's worth it.

> > > We also spend 56ms in mozilla::storage::Connection::Close() called from
> > > mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to
> > > close this database (which flushes to disk) during startup?
> > 
> > We could potentially delay this, the reason to close the connection is just
> > that we don't need it anymore after the db has been created. Again, this
> > happens only on a new profile.
> 
> Is there any way to close asynchronously a database that has been opened
> synchronously? If not, could we at least run this from an idle callback?

Yes, we can asyncClose a synchronous connection.

So, I can probably fix this for existing profiles, fixing it for new profiles would be quite more complex. I'll file a separate bug for the existing profiles case.

Updated

6 months ago
Depends on: 1361322

Comment 4

6 months ago
(In reply to Florian Quèze [:florian] [:flo] from comment #2)
> (In reply to Marco Bonardo [::mak] from comment #1)
> 
> > > During early startup, we spend 100ms calling
> > > mozilla::storage::Service::OpenUnsharedDatabase from
> > > mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from
> > > mozilla::places::Database::InitDatabaseFile.
> > 
> > This is measured on a new profile, where we must create the schema yet,
> > right?
> > On an existing profile it's taking far less, since it just has to open the
> > db and check the schema version.
> 
> Unfortunately, no. I haven't figured out yet how to run the profiler on a
> new profile (because I need... well, the profiler to be installed ;-)).

I think I maybe mentioned this to mstange at some point...  If there is no bug on file for it, we should get one on file.  We should make it possible to capture and dump profiles somewhere without having the extension installed.
You can start the profiler using the env var MOZ_STARTUP_PROFILING=1, then install the add-on and capture a profile right away, and that profile should contain the whole startup including the time before the add-on was installed.

Updated

6 months ago
Whiteboard: [qf] → [qf:investigate]
(Reporter)

Comment 6

5 months ago
I now have a first-startup profile (ie. we are creating the profile during this session) : https://perfht.ml/2sg0mZz

Comment 7

5 months ago
Most of those 130ms on Close are spent doing a checkpoint operation that causes an fsync, moving that fsync on a separate thread sounds good. AsyncClose also has a cost in this case, since it has to spawn a new thread and post to it (with all of its lockings), but should almost always be cheaper than an fsync.

The problem is that from my previous testing just using AsyncClose here completely breaks Places, because the next ATTACH DATABASE query fails (closing opens a transaction and disallows ATTACH calls). What's left to try is to delay the closing to a later time (after the ATTACH call) and see if that works.
Flags: needinfo?(florian)

Updated

5 months ago
Flags: needinfo?(florian)

Comment 8

3 months ago
It would be interesting to measure again on the reference hardware, after we handled the most obvious things:
1. Favicons database now is not opened/closed just to check if it's corrupt
2. The database connection is delayed as far as possible, it is initialized when the first consumer needs to run a query
Depends on: 1371677
Keywords: perf
You need to log in before you can comment on or make changes to this bug.