Can we avoid places main thread IO before the first window is shown?

NEW
Unassigned

Status

()

P3
normal
2 years ago
10 months ago

People

(Reporter: florian, Unassigned)

Tracking

(Blocks: 1 bug, {perf})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:investigate][fxperf:p3])

(Reporter)

Description

2 years ago
See this startup profile on the quantum reference hardware: https://perfht.ml/2p1y7ce

During early startup, we spend 100ms calling mozilla::storage::Service::OpenUnsharedDatabase from mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from mozilla::places::Database::InitDatabaseFile.

We also spend 56ms in mozilla::storage::Connection::Close() called from mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to close this database (which flushes to disk) during startup?

If making all of this IO async is difficult (as bug 975979 seems to indicate), could we delay some of this to after the first window is shown?
Making init async is impossible, atm. We didn't even finish converting all the APIs to be async.

Afaict, this is not a bug in Places, it's a bug in its consumer, the browser.

Delaying means delaying components using the bookmarks or history services, that includes things like migrating data from other browsers (included automigration) that we currently do before showing the first window, on purpose, and importing our own default bookmarks. And then the docshell will init history as soon as it needs to use global history but I suppose at this point we have a window.
It is non-trivial, probably one could take the stacks initing the services one by one, and see what is needed to delay them. I can't exclude that some of those stacks are easily fixable, especially if the only requirement is to wait for the first window.

(In reply to Florian Quèze [:florian] [:flo] from comment #0)
> See this startup profile on the quantum reference hardware:
> https://perfht.ml/2p1y7ce
> 
> During early startup, we spend 100ms calling
> mozilla::storage::Service::OpenUnsharedDatabase from
> mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from
> mozilla::places::Database::InitDatabaseFile.

This is measured on a new profile, where we must create the schema yet, right?
On an existing profile it's taking far less, since it just has to open the db and check the schema version.

> We also spend 56ms in mozilla::storage::Connection::Close() called from
> mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to
> close this database (which flushes to disk) during startup?

We could potentially delay this, the reason to close the connection is just that we don't need it anymore after the db has been created. Again, this happens only on a new profile.
Component: Places → General
Product: Toolkit → Firefox
(Reporter)

Comment 2

2 years ago
(In reply to Marco Bonardo [::mak] from comment #1)

> > During early startup, we spend 100ms calling
> > mozilla::storage::Service::OpenUnsharedDatabase from
> > mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from
> > mozilla::places::Database::InitDatabaseFile.
> 
> This is measured on a new profile, where we must create the schema yet,
> right?
> On an existing profile it's taking far less, since it just has to open the
> db and check the schema version.

Unfortunately, no. I haven't figured out yet how to run the profiler on a new profile (because I need... well, the profiler to be installed ;-)).

I just saw another startup profile this morning with this, so it seems to happen at each startup.

> > We also spend 56ms in mozilla::storage::Connection::Close() called from
> > mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to
> > close this database (which flushes to disk) during startup?
> 
> We could potentially delay this, the reason to close the connection is just
> that we don't need it anymore after the db has been created. Again, this
> happens only on a new profile.

Is there any way to close asynchronously a database that has been opened synchronously? If not, could we at least run this from an idle callback?
(In reply to Florian Quèze [:florian] [:flo] from comment #2)
> > This is measured on a new profile, where we must create the schema yet,
> > right?
> > On an existing profile it's taking far less, since it just has to open the
> > db and check the schema version.
> 
> Unfortunately, no. I haven't figured out yet how to run the profiler on a
> new profile (because I need... well, the profiler to be installed ;-)).
> 
> I just saw another startup profile this morning with this, so it seems to
> happen at each startup.

After looking at the code, you are right, we indeed must open the database to check that it's not corrupt.
We could maybe avoid this and rely on ATTACH DATABASE failures to detect a problem.
Looks like we will also need to try to remove the right db on case of corruption, so that needs some additional work but I think it's worth it.

> > > We also spend 56ms in mozilla::storage::Connection::Close() called from
> > > mozilla::places::Database::InitFaviconsDatabaseFile. Do we really need to
> > > close this database (which flushes to disk) during startup?
> > 
> > We could potentially delay this, the reason to close the connection is just
> > that we don't need it anymore after the db has been created. Again, this
> > happens only on a new profile.
> 
> Is there any way to close asynchronously a database that has been opened
> synchronously? If not, could we at least run this from an idle callback?

Yes, we can asyncClose a synchronous connection.

So, I can probably fix this for existing profiles, fixing it for new profiles would be quite more complex. I'll file a separate bug for the existing profiles case.
Depends on: 1361322

Comment 4

2 years ago
(In reply to Florian Quèze [:florian] [:flo] from comment #2)
> (In reply to Marco Bonardo [::mak] from comment #1)
> 
> > > During early startup, we spend 100ms calling
> > > mozilla::storage::Service::OpenUnsharedDatabase from
> > > mozilla::places::Database::InitFaviconsDatabaseFile and 43ms calling it from
> > > mozilla::places::Database::InitDatabaseFile.
> > 
> > This is measured on a new profile, where we must create the schema yet,
> > right?
> > On an existing profile it's taking far less, since it just has to open the
> > db and check the schema version.
> 
> Unfortunately, no. I haven't figured out yet how to run the profiler on a
> new profile (because I need... well, the profiler to be installed ;-)).

I think I maybe mentioned this to mstange at some point...  If there is no bug on file for it, we should get one on file.  We should make it possible to capture and dump profiles somewhere without having the extension installed.
You can start the profiler using the env var MOZ_STARTUP_PROFILING=1, then install the add-on and capture a profile right away, and that profile should contain the whole startup including the time before the add-on was installed.

Updated

2 years ago
Whiteboard: [qf] → [qf:investigate]
(Reporter)

Comment 6

2 years ago
I now have a first-startup profile (ie. we are creating the profile during this session) : https://perfht.ml/2sg0mZz
Most of those 130ms on Close are spent doing a checkpoint operation that causes an fsync, moving that fsync on a separate thread sounds good. AsyncClose also has a cost in this case, since it has to spawn a new thread and post to it (with all of its lockings), but should almost always be cheaper than an fsync.

The problem is that from my previous testing just using AsyncClose here completely breaks Places, because the next ATTACH DATABASE query fails (closing opens a transaction and disallows ATTACH calls). What's left to try is to delay the closing to a later time (after the ATTACH call) and see if that works.
Flags: needinfo?(florian)
Flags: needinfo?(florian)
It would be interesting to measure again on the reference hardware, after we handled the most obvious things:
1. Favicons database now is not opened/closed just to check if it's corrupt
2. The database connection is delayed as far as possible, it is initialized when the first consumer needs to run a query
Depends on: 1371677
Whiteboard: [qf:investigate] → [qf:investigate][fxperf]
On reference hardware:

Startup profile on first run: https://perfht.ml/2Ex8OL4
Startup profile on second run: https://perfht.ml/2EB4VEH

In summary, it looks like on first run we're down from 400ms to 200ms, and on subsequent runs we've largely eliminated the cost (down to 6ms)?
Priority: -- → P3
(Reporter)

Comment 10

10 months ago
(In reply to Doug Thayer [:dthayer] from comment #9)
> On reference hardware:
> 
> Startup profile on first run: https://perfht.ml/2Ex8OL4
> Startup profile on second run: https://perfht.ml/2EB4VEH

What do you mean with "first run" and "second run", is it cold vs warm startup, or is it when creating a profile and then when re-using it?

> In summary, it looks like on first run we're down from 400ms to 200ms, and
> on subsequent runs we've largely eliminated the cost (down to 6ms)?

Measuring main thread IO in 'ms' rarely makes sense, because it tends to take a random amount of time, depending mostly on what the system is doing in the background.
(In reply to Florian Quèze [:florian] from comment #10)
> What do you mean with "first run" and "second run", is it cold vs warm
> startup, or is it when creating a profile and then when re-using it?

Sorry, good point. What I meant is creating a profile vs reusing it. Warm in both cases, so I can do a cold start for a better comparison.

> Measuring main thread IO in 'ms' rarely makes sense, because it tends to
> take a random amount of time, depending mostly on what the system is doing
> in the background.

That's certainly fair - is there an efficient measure or a method that you generally use for determining the severity of main thread IO? Just eyeballing what calls are actually made and using intuition?
Btw, the profiles show more or less what we already know, the db creation is on main thread, we pay the cost of the fsync at the end of the creation, unfortunately we can't do anything about this without lots of work (that we are doing, but it's quite demanding).
(Reporter)

Comment 13

10 months ago
(In reply to Doug Thayer [:dthayer] from comment #11)

> > Measuring main thread IO in 'ms' rarely makes sense, because it tends to
> > take a random amount of time, depending mostly on what the system is doing
> > in the background.
> 
> That's certainly fair - is there an efficient measure or a method that you
> generally use for determining the severity of main thread IO? Just
> eyeballing what calls are actually made and using intuition?

Kinda, yes. All main thread IO is bad. The more often it is being done, the worse it is. Doing main thread IO twice is worse than doing it once (ie. if (file.exists()) file.read(); is worse than try { file.read() }, or reading 2 small files synchronously is worse than reading a single file synchronously containing the same amount of data).

I would like to put in place automated tests with a whitelist to ensure we don't introduce more main thread IO, but it's not been done yet.
(Reporter)

Comment 14

10 months ago
(In reply to Marco Bonardo [::mak] from comment #12)
> Btw, the profiles show more or less what we already know, the db creation is
> on main thread, we pay the cost of the fsync at the end of the creation,
> unfortunately we can't do anything about this without lots of work (that we
> are doing, but it's quite demanding).

Isn't this bug WFM at this point? I thought we had already delayed initializing Places until after first paint. Or is the behavior different when creating a new profile?
don't we have the talos xperf test?
I just fear its whitelist is not up-to-date :(
(In reply to Florian Quèze [:florian] from comment #14)
> Isn't this bug WFM at this point? I thought we had already delayed
> initializing Places until after first paint. Or is the behavior different
> when creating a new profile?

The behavior is more or less the same, it depends if someone runs Places code before the first paint. For example the new policies will do IIRC, because they want to touch bookmarks before the ui is shown.

I also expect this to be WFM, but we need a test checking that is true and preventing regressions. Not sure in which shape.
We can also delay some things further after the UI is shown, like the star button, some things in Activity Stream (<link> icons and link coloring). Ideally just opening Firefox should not initialize the Places DB. I have a patch that asserts whe nthe db is initialized, printing out the js stack too. iirc it was hit by a few things after the ui startup, and some of those can be avoided.
(Reporter)

Comment 17

10 months ago
Marking as fxperf:p3 for now to get it off the triage list, but I'm not sure this bug is actionable. Is the next step to write a test? Is it a duplicate of an existing meta bug about making Places I/O async?
Whiteboard: [qf:investigate][fxperf] → [qf:investigate][fxperf:p3]
FWIW, Bug 975979 is the one about async I/O in Places.
You need to log in before you can comment on or make changes to this bug.