Closed Bug 988304 Opened 6 years ago Closed 5 years ago

Avoid main-thread IO for {profile}\blocklist.xml

Categories

(Toolkit :: Add-ons Manager, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: rvitillo, Assigned: rvitillo)

References

(Blocks 1 open bug)

Details

(Keywords: main-thread-io)

Attachments

(1 file, 4 obsolete files)

{profile}\blocklist.xml is a main-thread IO offender which appears in about 20% of Telemetry's submissions after filtering out IO operations during startup and shutdown.
I think this is bug 861438

Irving, any thoughts on what it would take to fix this?
Flags: needinfo?(irving)
Blair wants to rewrite the blocklist service (bug 782261), but if we don't mind doing work that would be thrown away during the rewrite we could get rid of the main thread I/O.

1) The blocklist service reads blocklist.xml from disk synchronously at (or near) startup time, the first time a service consumer asks if something is blocklisted and after new blocklists are downloaded (bug 717567). We could provide an async API to this and start converting consumers; we'd need a pattern like I used in XPIProvider.jsm to support both sync and async reads of the file to handle calls to the sync API unless we can eliminate all the calls.

2) The blocklist service uses sync I/O to store new blocklist after it is downloaded. Since this is already event-driven, we could easily convert this to async write. The code immediately triggers a sync read of the file and application of the new blocklist rules, so fixing only the write wouldn't solve all the jank.
Flags: needinfo?(irving)
Duplicate of this bug: 861438
Splitting part 2 of comment 2 off to a separate bug at Vladan's suggestion.
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #0)
> {profile}\blocklist.xml is a main-thread IO offender which appears in about
> 20% of Telemetry's submissions after filtering out IO operations during
> startup and shutdown.

Ouch :(

I still have no timeframe for when the rewrite may happen. Given that, I think it would be worth figuring out exactly how much impact it would be to convert the existing blocklist service to be an entirely async API (ignoring the actual work to make IO async for this investigation).
The consumers of nsIBlocklistService are:

- nsPluginTags.cpp
- nsObjectLoadingContent.cpp
- AddonUpdateChecker.jsm
- PluginProvider.jsm
- XPIProvider.jsm

where XPIProvider seems to be the "biggest" consumer of the API. It would seem reasonable to keep both sync and async methods for the service considering that using only async versions of the blocklist methods in XPIProvider would cause a chain reaction that requires quite some changes to XPIProvider.jsm and its consumers.

With my configuration, blocklist.xml is read shortly after starting up by nsPluginTag::GetBlocklistState, apparently when the js navigator object is accessed which happens when my homepage (google.com) is loaded; see the stacktrace below. I tend to assume that it's not rare to have google set as homepage so this case might very well be the most frequent one.

nsPluginTag::GetBlocklistState
nsPluginHost::ScanPluginsDirectory
nsPluginHost::ScanPluginsDirectoryList
nsPluginHost::FindPlugins
nsPluginHost::LoadPlugins
nsPluginHost::GetPlugins
nsPluginArray::EnsurePlugins
nsPluginArray::GetMimeTypes
nsMimeTypeArray::EnsurePluginMimeTypes
nsMimeTypeArray::NamedGetter
nsMimeTypeArray::NamedItem
mozilla::dom::Navigator::JavaEnabled
mozilla::dom::NavigatorBinding::javaEnabled
mozilla::dom::GenericBindingMethod
js::CallJSNative
js::Invoke
Interpret
js::RunScript
js::Invoke
js_fun_call

I am not really sure that we could use a callback in this case; any suggestions are welcome. But then again, if we read blocklist.xml synchronously for the most common scenario we don't really gain anything.

Given all of that, we might consider to asynchronously preload blocklist.xml early at startup and fallback to a synchronous load in case the async load has not been completed when the file is required. We could run an experiment with Telemetry to check if the distribution of "async reads successfully completed" is reasonable.
Flags: needinfo?(irving)
That's an interesting stack trace - we need to do a lot of work to see if Java is enabled... This isn't a new discovery, see bug 634774 (somebody already put a telemetry probe in mozilla::dom::Navigator::JavaEnabled() to record the time spent, http://telemetry.mozilla.org/#beta/29/CHECK_JAVA_ENABLED).

I think aklotz is working on async plugin startup, which could also provide an opportunity to fix that code path.

Rather than doing the speculative async read, I'd prefer to leave the sync I/O in place with telemetry to show us how often it's used, and then rework the callers to fix the worst offenders.
Flags: needinfo?(irving)
Assignee: nobody → rvitillo
Irving, do you know by chance what is the OS.* way of retrieving the XCurProcD path?
Attachment #8419488 - Flags: review?(irving)
Comment on attachment 8419488 [details] [diff] [review]
Avoid main-thread IO for {profile}\blocklist.xml, v1

Review of attachment 8419488 [details] [diff] [review]:
-----------------------------------------------------------------

I'd like to see unit tests for this; you could mock out OS.File like I do in http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/test/xpcshell/test_DeferredSave.js to control the async sequence of when the read returns.

::: toolkit/mozapps/extensions/nsBlocklistService.js
@@ +269,5 @@
>   * It also blocklists plugins with data from blocklist.xml.
>   */
>  
>  function Blocklist() {
>    let os = getObserverService();

Existing code, but you can remove the definition of and calls to getObserverService() and use Services.obs instead.

@@ +765,5 @@
> +    try {
> +      return this._prefetchBlocklistFile(profPath);
> +    } catch (e) {}
> +
> +    var appFile = FileUtils.getFile(KEY_APPDIR, [FILE_BLOCKLIST]);

I can't see an obvious implementation of "XCurProcD" in OS.Constants.Path; Yoric?
Attachment #8419488 - Flags: review?(irving)
Attachment #8419488 - Flags: review-
Attachment #8419488 - Flags: feedback?(dteller)
(In reply to :Irving Reid from comment #10)
> Comment on attachment 8419488 [details] [diff] [review]
> Avoid main-thread IO for {profile}\blocklist.xml, v1
> 
> Review of attachment 8419488 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'd like to see unit tests for this; you could mock out OS.File like I do in
> http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/
> test/xpcshell/test_DeferredSave.js to control the async sequence of when the
> read returns.

In your test you are accessing the module's global object using the value returned from Components.utils.import(), which comes in pretty handy. Do you have a pointer on how to access the global object of a service instead?
Flags: needinfo?(irving)
(In reply to :Irving Reid from comment #10)
> I can't see an obvious implementation of "XCurProcD" in OS.Constants.Path;
> Yoric?

Not yet, I can mentor a bug for adding it, if needed.
Attachment #8419488 - Attachment is obsolete: true
Attachment #8419488 - Flags: feedback?(dteller)
Attachment #8421027 - Flags: review?(irving)
Flags: needinfo?(irving)
Comment on attachment 8421027 [details] [diff] [review]
Avoid main-thread IO for {profile}\blocklist.xml, v2

Review of attachment 8421027 [details] [diff] [review]:
-----------------------------------------------------------------

We're trying to remove the old practice of naming test case files after the bug; please rename the test to something like test_asyncBlocklistLoad.js. Also, since this test doesn't depend on the "install add-ons unpacked" preference, please list the new test in xpcshell.ini rather than in xpcshell-shared.ini (I have a patch pending that moves a couple of other tests into that file).

::: toolkit/mozapps/extensions/test/xpcshell/test_bug988304.js
@@ +22,5 @@
> +  do_check_false(blocklist._isBlocklistLoaded());
> +  do_check_true(blocklist._isBlocklistPreloaded());
> +  blocklist._loadBlocklist();
> +  do_check_true(blocklist._isBlocklistLoaded());
> +  do_check_false(blocklist._isBlocklistPreloaded());

These are good; there's one more test I'd like to see, and that's the one that might require mocking out OS.File, to control exactly when the async OS.File.read() resolves.

- start async load
- before async load completes, do sync load
- async load finishes
Attachment #8421027 - Flags: review?(irving) → review-
Attachment #8421027 - Attachment is obsolete: true
Attachment #8421737 - Flags: review?(irving)
Comment on attachment 8421737 [details] [diff] [review]
Avoid main-thread IO for {profile}\blocklist.xml, v3

Review of attachment 8421737 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/extensions/nsBlocklistService.js
@@ +776,5 @@
> +    } catch (e) {}
> +
> +    var appFile = FileUtils.getFile(KEY_APPDIR, [FILE_BLOCKLIST]);
> +    try{
> +      return this._preloadBlocklistFile(appFile.path);

Ah, I missed this in my previous reviews. This isn't going to catch the error the way we want, because _preloadBlocklistFile is always going to successfully return a Promise, rather than directly throwing the 'file not found' error. So, instead of trying the profile dir and then trying the app dir, we'll start both reads and the race will keep the result of the last one to succeed.

What we really want is something like:

OS.File.read(file from profile).then(
  null,  // let success fall through
  err => {
     log "Can't read blocklist from profile", err
     return OS.File.read(file from app dir)
  })
  .then(text => { ... store data if there hasn't been a sync load },
        err => { log error, handle "no blocklist data available" case });

::: toolkit/mozapps/extensions/test/xpcshell/test_asyncBlocklistLoad.js
@@ +30,5 @@
> +  // async -> sync -> async
> +  let read = scope.OS.File.read
> +  scope.OS.File.read = function(...args){
> +    blocklist._loadBlocklist();
> +    return read(...args);

This approach basically works, but the _loadBlocklist() call is going to happen while the code that calls OS.File.read() is on the stack, rather than on a separate step of the event loop. To let the stack unwind before calling _loadBlocklist() you can do:

return new Promise((resolve, reject) =>
  do_execute_soon(() => {
    blocklist._loadBlocklist();
    resolve(read(...args));
  }));

I've been fighting random test failures a lot lately, and I'm getting very sensitive to the state of the stack during callbacks in test fixtures. It might be safe in this case, but I'm advocating a blanket policy of "don't do anything except test assertions directly inside callbacks".
Attachment #8421737 - Flags: review?(irving) → review-
Attachment #8421737 - Attachment is obsolete: true
Attachment #8423098 - Flags: review?(irving)
Comment on attachment 8423098 [details] [diff] [review]
Avoid main-thread IO for {profile}\blocklist.xml, v4

Review of attachment 8423098 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/extensions/nsBlocklistService.js
@@ +768,5 @@
> +    this._pluginEntries = null;
> +    this._preloadedBlocklistContent = null;
> +  },
> +
> +  _preloadBlocklist: Task.async(function() {

nit: function* ()

::: toolkit/mozapps/extensions/test/xpcshell/test_asyncBlocklistLoad.js
@@ +32,5 @@
> +  scope.OS.File.read = function(...args) {
> +    return new Promise((resolve, reject) => {
> +      do_execute_soon(() => {
> +        blocklist._loadBlocklist();
> +        return resolve(read(...args));

nit: the 'return' isn't necessary here; do_execute_soon throws away the return value of the inner function. Calling 'resolve' does the real work.
Attachment #8423098 - Flags: review?(irving) → review+
https://hg.mozilla.org/integration/fx-team/rev/7365e8f676e6
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/7365e8f676e6
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → mozilla32
You need to log in before you can comment on or make changes to this bug.