Closed Bug 1363541 Opened 7 years ago Closed 4 years ago

Permission Manager should not block the first window from painting

Categories

(Core :: Permission Manager, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla77
Performance Impact medium
Tracking Status
firefox77 --- fixed

People

(Reporter: mconley, Assigned: baku)

References

(Blocks 2 open bugs)

Details

(Keywords: perf, perf:responsiveness, Whiteboard: [fxperf:p2])

Attachments

(6 files, 1 obsolete file)

Apparently, Permission Manager does main thread, sync IO (superb), and also runs before the first window paints.

It's first kicked off here: http://searchfox.org/mozilla-central/rev/7057a51c5cf29b5b115b1db19ace2cfe3fd83a0e/dom/notification/Notification.cpp#736-740

According to mystor, commenting that out just kicks the can, since it's re-initted as soon as resource://gre-resources/hiddenWindow.html is loaded.

We should not do sync IO here, or if that's somehow unavoidable, we should definitely not block first paint.
Unfortunately the nsIPermissionManger API is very old, very synchronous, and very widely used around our code base. The best we can do is probably to:

a) start trying to read the data asynchronously and allow main thread work to occur while it is being read, and
b) block the main thread when a sync request for the data is made, waiting for that background thread to finish.

We might be able to improve performance a bit by not actually fetching (a.k.a blocking on reading) permissions during initialization, and instead lazily doing it when permissions are requested for a non-resource://, chrome://, or about: URI. about:home does have a permission set on it (uitour - see http://searchfox.org/mozilla-central/rev/7057a51c5cf29b5b115b1db19ace2cfe3fd83a0e/browser/app/permissions#15), but we could probably hardcode that one in.

If we remove the during page load permission checks (by not requiring them for parent process URIs), we still need this info as we start a content process - as there are permissions which need to be eagerly sent to the content process. We used to do this in a sync IPC from the content to the parent, but now we send down a subset of permissions at content process startup (which of course will require full permission manager initialization). I tried looking into not doing this, but we definitely can't do it easily. Even if we did all of this work, I'm not sure that we start the content process late enough that this will actually make a difference to our startup times.

Most profiles which I've seen don't have very large nsPermissionManager databases, but I know that some people have very large permission manager databases, as addons have been known to dump a lot of data into them.

Do you think this is doable and/or desirable ehsan?
Flags: needinfo?(ehsan)
I think we should certainly do something here.

My preferred option is starting an async load and blocking on it if we have to for now.  

This requires getting a sense of when the first time that a permission check happens actually is.  It would be nice if you can do a basic preliminary test, something like setting a breakpoint in the TestPermissionCommon function and looking at the call stack of when it gets called first.

If we can reasonably easily make our code not call it when starting up and showing a window, which sounds doable to me, then, the steps become:

0) Add some telemetry about the size of the permissions DB.  Hopefully it'll be small for most people.  As we'll do the below we need to add telemetry for whether the async load finishes in time or not, etc. too.

1) Make browser/app/permissions not require IO if it currently does (I don't remember how it's preloaded off the top of my head).  Please check that!

2) Kick off an async load of the permissions DB early in the startup.

3) Add code to block on the async load if it's in progress when you need to access the DB.

4) Cross your fingers that by that time the load has finished.

How does this sound?  I think the upside of this plan is that we can do it without too much code changes and complexity/risk.
Flags: needinfo?(ehsan)
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #2)
> I think we should certainly do something here.
> 
> My preferred option is starting an async load and blocking on it if we have
> to for now.  
> 
> This requires getting a sense of when the first time that a permission check
> happens actually is.  It would be nice if you can do a basic preliminary
> test, something like setting a breakpoint in the TestPermissionCommon
> function and looking at the call stack of when it gets called first.

In order to delay this enough to be useful, I'd have to create a definition of the set of origins which we want to allow setting permissions on. I imagine that this set will probably be something along the lines of `http`, `https`, `ftp`, `file`, and perhaps `webextension` URIs. For an initial test I would probably whitelist resource:// chrome:// and about: principals to be checked while the permission manager is not initialized, and would add a print statement when the permission manager would be initialized. I'll see how late in the startup process I can get this. Currently the earliest point in time when I know we _need_ the permission manager to be initialized is process startup, as we need to send down permissions for any file:// URIs or URIs which have registered service workers at process startup. I don't think I'll be able to push this past that point in time.

> If we can reasonably easily make our code not call it when starting up and
> showing a window, which sounds doable to me, then, the steps become:
> 
> 0) Add some telemetry about the size of the permissions DB.  Hopefully it'll
> be small for most people.  As we'll do the below we need to add telemetry
> for whether the async load finishes in time or not, etc. too.
> 
> 1) Make browser/app/permissions not require IO if it currently does (I don't
> remember how it's preloaded off the top of my head).  Please check that!

I think it does perform IO right now (and is probably used by repacks - I'm not sure if we can remove this particular IO - that being said I think it is loaded from the JAR so it shouldn't be too awful).

> 2) Kick off an async load of the permissions DB early in the startup.

I'm thinking we'll want to use a Permissions DB thread to perform this I/O as it will allow us to fill the backing nsTHashtable off main thread as well, and then just hand the data back to the nsPermissionManager on the main thread when we're done, rather than a series of AddInternal calls for each permission as they are loaded from the DB. I imagine it would be much quicker.

> 3) Add code to block on the async load if it's in progress when you need to
> access the DB.
>
> 4) Cross your fingers that by that time the load has finished.

I can add a telemetry probe for when the first request is made as to whether or not the permission manager DB was ready (the probe would probably be Permission Manager wait time)

> How does this sound?  I think the upside of this plan is that we can do it
> without too much code changes and complexity/risk.

Yes, this is pretty much what I was planning to do when I suggested this to you yesterday. Sounds good to me.
I've done some looking into what happens before the window appears which would trigger the permission manager to be initialized before we paint the first window. There are a few things:

0) The NotificationTelemetryService initializes the permission manager very early on. This should be easy to move to a permission-manager-initialized observer.

1) We eagerly make the content browser remote after DOMContentLoaded is fired (http://searchfox.org/mozilla-central/source/browser/base/content/browser.js#1127) - Making a browser remote starts a content process and forces permission manager initialization. This occurs before the window is shown.

2) We start loading 2 HTTP uris before the first window is painted. The order of these being loaded seems to be non-deterministic, and sometimes they load after the window is painted, and sometimes before.
    https://location.services.mozilla.com/v1/country?key=no-mozilla-api-key
    http://detectportal.firefox.com/success.txt

3) If we don't initialize for the above events, then we will initialize the browser when performing sessionrestore.

4) It doesn't seem like we preallocate a new content process before the window is rendered.

The startup process may take long enough that if we start early on (perhaps when we get the profile directory?) in a secondary thread to load the permission manager we could have it initialized before DOMContentLoaded is fired, saving us the need to block there.

If we do that, our biggest problems will be with our migrations code, which access services like the history service in order to perform history migrations. We wouldn't be able to run these migrations on background threads, so we would probably have to make whatever unlucky main thread event triggers the permission manager to be initialized synchronously run the migration if the background permission manager initializer doesn't do it correctly.
Flags: needinfo?(ehsan)
Assigning to self while I investigate.
Assignee: nobody → michael
We talked about this in person with Michael, and he has a plan!
Flags: needinfo?(ehsan)
unassigning as I'm not actively looking into doing this right now
Assignee: michael → nobody
(From bug 975996 comment #16)

> Here is a startup profile of the current nightly on the reference hardware:
> https://perfht.ml/2ti5Tez
> 
> It shows nsPermissionManager::InitDB taking 152ms on the main thread, that's
> 12.9% of the time before first paint.
> 
> Note: this is neither a cold startup nor a warm startup; it was the first
> startup after an update (so most system libraries are already in disk cache)
Whiteboard: [qf]
Michael and I briefly discussed this and it sounds like this will mainly affect existing "dirty" profiles. It sounds like we can work on this post-57.
(In reply to Andrew Overholt [:overholt] from comment #9)
> it sounds like this will mainly affect existing "dirty" profiles.

Why? The file being smaller for recent profiles doesn't make main thread IO cheap when opening the file. It's the open() call that's expensive here, not the read.
Whiteboard: [qf] → [qf:p1]
(In reply to Florian Quèze [:florian] [:flo] from comment #10)
> Why? The file being smaller for recent profiles doesn't make main thread IO
> cheap when opening the file. It's the open() call that's expensive here, not
> the read.

I defer to you with knowledge about what part of opening files is expensive. I can work on making the happy case of starting the nsPermissionManager without migrations faster, but it'll probably reduce the efficiency of startup with migrations (I don't think we've migrated since I was an intern, so I don't think that should be an issue).

Basically I would start initializing off main thread until we need to access something which requires hitting the permission manager. Once that happens, the main thread will finish initializing if it isn't initialized yet, and perform migrations if they are required, synchronously on the main thread.
Assignee: nobody → michael
Whiteboard: [qf:p1] → [qf:p2]
Keywords: perf
Priority: -- → P2
Whiteboard: [qf:p2] → [qf:p1]
Whiteboard: [qf:p1] → [qf:i60][qf:p1]
Whiteboard: [qf:i60][qf:p1] → [qf:f60][qf:p1]
Whiteboard: [qf:f60][qf:p1] → [qf:f61][qf:p1]
Whiteboard: [qf:f61][qf:p1] → [qf:f61][qf:p1][fxperf]
Probably P3 for the front-end team, but definitely a high value platform improvement.
Whiteboard: [qf:f61][qf:p1][fxperf] → [qf:f61][qf:p1][fxperf:p3]
Whiteboard: [qf:f61][qf:p1][fxperf:p3] → [qf:f64][qf:p1][fxperf]
Hey Nika,

I'll bet you're absolutely swamped lately. Is it fair to say that you're unlikely to get to this soon, and we should clear the assignee field?
Flags: needinfo?(nika)
yes :-)
Assignee: nika → nobody
Flags: needinfo?(nika)
Whiteboard: [qf:f64][qf:p1][fxperf] → [qf:p1:f64][fxperf]
Whiteboard: [qf:p1:f64][fxperf] → [qf:p1:f64][fxperf:p2]
Whiteboard: [qf:p1:f64][fxperf:p2] → [qf:p2:responsiveness][fxperf:p2]
Blocks: 1543203
Assignee: nobody → amarchesini
Status: NEW → ASSIGNED
Attachment #9138763 - Attachment description: Bug 1363541 - Modernize the PermissionManager - nsPermission to mozilla::Permission, r=timhuang → Bug 1363541 - Modernize the PermissionManager - part 1 - nsPermission to mozilla::Permission, r=timhuang
Attachment #9138765 - Attachment description: Bug 1363541 - Modernize the PermissionManager - PermissionDelegateHandler to mozilla::PermissionDelegateHandler, r?timhuang → Bug 1363541 - Modernize the PermissionManager - part 2 - PermissionDelegateHandler to mozilla::PermissionDelegateHandler, r?timhuang
Attachment #9138766 - Attachment description: Bug 1363541 - Modernize the PermissionManager - DB handling in a separate thread, r?timhuang → Bug 1363541 - Modernize the PermissionManager - part 3 - DB handling in a separate thread, r?timhuang
Attachment #9139104 - Attachment is obsolete: true
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/aeaa10789071
Modernize the PermissionManager - part 1 - nsPermission to mozilla::Permission, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/9d6b9052f413
Modernize the PermissionManager - part 2 - PermissionDelegateHandler to mozilla::PermissionDelegateHandler, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/80d4d1f0c921
Modernize the PermissionManager - part 3 - DB handling in a separate thread, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/d2ba944a47a3
Modernize the PermissionManager - part 4 - mozilla namespace, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/1f397b686c11
Modernize the PermissionManager - part 5 - headers, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/947073be919f
Modernize the PermissionManager - part 6 - moving code around, r=timhuang
Flags: needinfo?(amarchesini)
Flags: needinfo?(amarchesini)
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/827a9a2866bd
Modernize the PermissionManager - part 1 - nsPermission to mozilla::Permission, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/9f413a8a47bb
Modernize the PermissionManager - part 2 - PermissionDelegateHandler to mozilla::PermissionDelegateHandler, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/aa7c6668b249
Modernize the PermissionManager - part 3 - DB handling in a separate thread, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/bf41b0c139f6
Modernize the PermissionManager - part 4 - mozilla namespace, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/9212bfd89eca
Modernize the PermissionManager - part 5 - headers, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/a775f6e9eb41
Modernize the PermissionManager - part 6 - moving code around, r=timhuang

Tier-1 failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=297105361&resultStatus=testfailed%2Cbusted%2Cexception&failure_classification_id=2&tochange=5e48ec747a431dd0f25403867c03ca2880da8cf0&fromchange=9017241c45e669a83636529d262e45c29646ef70

TEST-UNEXPECTED-TIMEOUT | toolkit/components/telemetry/tests/unit/test_GeckoView.js | Test timed out
PID 8880 | Assertion failure: PermissionAvailable(aPrincipal, aType), at /builds/worker/checkouts/gecko/extensions/permissions/PermissionManager.cpp:1651

Flags: needinfo?(amarchesini)
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/53560f35598a
Modernize the PermissionManager - part 1 - nsPermission to mozilla::Permission, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/90b516fa9aa5
Modernize the PermissionManager - part 2 - PermissionDelegateHandler to mozilla::PermissionDelegateHandler, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/1d7b14496648
Modernize the PermissionManager - part 3 - DB handling in a separate thread, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/8b4dd0da0d5a
Modernize the PermissionManager - part 4 - mozilla namespace, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/273f709e526e
Modernize the PermissionManager - part 5 - headers, r=timhuang
https://hg.mozilla.org/integration/autoland/rev/0c9474723b67
Modernize the PermissionManager - part 6 - moving code around, r=timhuang
Regressions: 1629273
See Also: → 1629369
Regressions: 1629450
Regressions: 1629274
Performance Impact: --- → P2
Whiteboard: [qf:p2:responsiveness][fxperf:p2] → [fxperf:p2]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: