Closed Bug 1547693 Opened 2 years ago Closed 9 months ago

Initial handler service setup does mainthread IO on file paths containing "microsoft.windowscommunicationapps" by enumerating protocol handlers

Categories

(Firefox :: File Handling, defect, P3)

68 Branch
Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
Firefox 75
Tracking Status
firefox68 --- wontfix
firefox75 --- fixed

People

(Reporter: Gijs, Assigned: Gijs)

References

(Blocks 1 open bug)

Details

(Keywords: main-thread-io, perf, Whiteboard: [fxperf:p2] [fxperfsize:M])

Attachments

(2 files)

When we start up the handler service, it causes IO on Win10 because we need to talk to the protocol handler service:

(root) []
XREMain::XRE_main []
nsObserverService::NotifyObservers final-ui-startup []
XPCWrappedJS method call []
js::RunScript []
BG_observe [resource:///modules/BrowserGlue.jsm:720:0]
BG__beforeUIStartup [resource:///modules/BrowserGlue.jsm:1010:0]
earlyInit [resource://pdf.js/PdfJs.jsm:110:0]
checkEnabled [resource://pdf.js/PdfJs.jsm:246:0]
_isEnabled [resource://pdf.js/PdfJs.jsm:206:0]
isDefaultHandler [resource://pdf.js/PdfJs.jsm:67:0]
isDefaultHandlerApp [resource://pdf.js/PdfjsChromeUtils.jsm:332:0]
XPCWrappedJS method call []
js::RunScript []
exists [jar:file:///Z:/task_1556209659/build/application/firefox/omni.ja!/components/HandlerService.js:494:0]
_getHandlerListByHandlerInfoType [jar:file:///Z:/task_1556209659/build/application/firefox/omni.ja!/components/HandlerService.js:477:0]
js::RunScript []
get _store [jar:file:///Z:/task_1556209659/build/application/firefox/omni.ja!/components/HandlerService.js:36:0]
_ensureStoreInitialized [jar:file:///Z:/task_1556209659/build/application/firefox/omni.ja!/components/HandlerService.js:51:0]
_injectDefaultProtocolHandlersIfNeeded [jar:file:///Z:/task_1556209659/build/application/firefox/omni.ja!/components/HandlerService.js:75:0]
_injectDefaultProtocolHandlers [jar:file:///Z:/task_1556209659/build/application/firefox/omni.ja!/components/HandlerService.js:106:0]

At time of writing, this code looks like this:

  _injectDefaultProtocolHandlers() {
    let schemesPrefBranch = Services.prefs.getBranch("gecko.handlerService.schemes.");
    let schemePrefList = schemesPrefBranch.getChildList("");

    let schemes = {};

    // read all the scheme prefs into a hash
    for (let schemePrefName of schemePrefList) {
      let [scheme, handlerNumber, attribute] = schemePrefName.split(".");

      try {
        let attrData =
          schemesPrefBranch.getComplexValue(schemePrefName,
                                            Ci.nsIPrefLocalizedString).data;
        if (!(scheme in schemes)) {
          schemes[scheme] = {};
        }

        if (!(handlerNumber in schemes[scheme])) {
          schemes[scheme][handlerNumber] = {};
        }

        schemes[scheme][handlerNumber][attribute] = attrData;
      } catch (ex) {}
    }

    for (let scheme of Object.keys(schemes)) {
      let protoInfo = gExternalProtocolService.getProtocolHandlerInfo(scheme);

      // cache the possible handlers to avoid extra xpconnect traversals.
      let possibleHandlers = protoInfo.possibleApplicationHandlers;

      for (let handlerNumber of Object.keys(schemes[scheme])) {
        let handlerApp = this.handlerAppFromSerializable(schemes[scheme][handlerNumber]);
        // If there is already a handler registered with the same template
        // URL, the newly added one will be ignored when saving.
        possibleHandlers.appendElement(handlerApp);
      }

      this.store(protoInfo);
    }

The relevant protocols are webcal, irc, ircs and mailto.

It looks like we're trying to add our own handler information by eagerly getting the existing handler information for all these schemes, which is the bit that's causing IO here.

I don't know why this is necessary. Paolo, can you provide some background? In particular, I assume we're not actually duplicating the information from the OS into our own handler store file, so why can't we store only the "new" bits?

(This is currently tripped by the PDF.js code, which is bug 1545167, but I expect fixing that might just move the issue as far as Windows is concerned.)

Flags: needinfo?(paolo.mozmail)

Profile showing this: https://perfht.ml/2WadeMw .

OS: Unspecified → Windows 10
Summary: Initial handler service setup does mainthread IO by enumerating protocol handlers → Initial handler service setup does mainthread IO on file paths containing "microsoft.windowscommunicationapps" by enumerating protocol handlers
Whiteboard: [fxperf] → [fxperf:p2]

Tentatively marking this medium given that the handler service code isn't very well known at this point.

Whiteboard: [fxperf:p2] → [fxperf:p2] [fxperfsize:M]

I'm pretty sure I've mentioned this before, although it may be in some loosely related bug that I couldn't find. The nsIHandlerInfo interface offers information from the operating system and our data store in the same object, and you can't get one without the other. This has always been an issue, for example hinted at in bug 395142, and there is bug 393122 tracking related simplifications, only some of which happened already.

On the related pdf.js bug 1545167, it is probably in the same bucket as bug 1362401, which was fixed in bug 1389443. If I remember correctly this followed my suggestion of adding a preference to cache the enabled state, although it may not be effective on first startup. Some code may still be checking the actual value from the handler service in the background, which would still cause main-thread I/O, although delayed at an asynchronous moment. I haven't looked at the code in detail, so take this with a grain of salt.

There's also been some recent activity around the handler service in bug 1452278, possibly related to bug 1446549, but I'm not sure how much of it is also related to this bug or to bug 1362564.

That's it for my search of related bug, hope it helps!

Flags: needinfo?(paolo.mozmail)

The priority flag is not set for this bug.
:Gijs, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(gijskruitbosch+bugs)
Priority: -- → P3
See Also: → 1559195

(In reply to :Paolo Amadini from comment #3)

I'm pretty sure I've mentioned this before, although it may be in some loosely related bug that I couldn't find. The nsIHandlerInfo interface offers information from the operating system and our data store in the same object, and you can't get one without the other. This has always been an issue, for example hinted at in bug 395142, and there is bug 393122 tracking related simplifications, only some of which happened already.

Yeah, my problem is that what gets written to disk in handlers.json doesn't seem to contain information from the OS, I think ever (right?), so it seems like writing default protocol handlers should be possible without touching the OS or asking it things. What am I missing?

Flags: needinfo?(paolo.mozmail)

(In reply to :Gijs (he/him) from comment #5)

Yeah, my problem is that what gets written to disk in handlers.json doesn't seem to contain information from the OS, I think ever (right?)

Well, not as far as I can remember... it's a been a long time since I last looked into this, and, with multi-process added on top of the existing architecture, this really looks like spaghetti code right now, which makes it difficult to follow the logic, but that said I think this excerpt of test code can shed some light:

https://searchfox.org/mozilla-central/rev/2f09184ec781a2667feec87499d4b81b32b6c48e/uriloader/exthandler/tests/HandlerServiceTestUtils.jsm#81-121

In summary, nsIHandlerInfo objects should always be constructed by calling either getMIMEInfoFromOS or getProtocolInfoFromOS. I think this is what populates possibleApplicationHandlers with the list obtained from the OS. This is a "pre-merged" list, and code exists all around the place to check for duplicates manually at various moments. The "fillHandlerInfo" method simply adds elements to the list (https://searchfox.org/mozilla-central/rev/2f09184ec781a2667feec87499d4b81b32b6c48e/uriloader/exthandler/HandlerService.js#403-437), and also sets the preferred handler based on what is listed first in the JSON file (by convention).

Removing elements of the list that were provided by the OS doesn't have any permanent effect, because you can do that, but they would re-appear when a different nsIHandlerInfo object for the same MIME type or protocol is constructed.

What "HandlerService.js" saves to "handlers.json" is always the full "merged" list, because there is no way to tell what is the origin of each element (operating system, user addition, or something from our default handler list). This architectural problem is hinted at by the IDL comment at https://searchfox.org/mozilla-central/rev/2f09184ec781a2667feec87499d4b81b32b6c48e/uriloader/exthandler/nsIHandlerService.idl#28-70. The right solution would seem to involve two or three separate data objects plus a third object that merges the information from them, but given that this code is now intertwined across multiple processes and can be platform-specific, I don't know how realistic it is that this refactoring could happen.

The fact that properties of nsIHandlerInfo are retrieved to the OS eagerly has also prompted people to add workarounds like https://searchfox.org/mozilla-central/rev/2f09184ec781a2667feec87499d4b81b32b6c48e/uriloader/exthandler/HandlerService.js#292-320.

In a sense, with a lot of careful thought you may be able to lazify the addition of the our browser default handlers in a similar way, waiting for the time when fillHandlerInfo is called. You'd have to ensure that updates across browser versions and different locales are handled consistently, but because of another quirk of the interface, you would also have to proxy the "exists" method which is always called before fillHandlerInfo...

In summary, cleaning up these architectural issues is a big project by itself that could result in a lot of simplification. If I also remember correctly, bug 1362564 is a big architectural issue related to this, that I believe still blocks the removal of significant synchronous cross-process calls and I/O, and forces all of the nsIHandlerInfo APIs discussed above to be synchronous.

Hope this helps!

Flags: needinfo?(paolo.mozmail)
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/f74a32d5753a
do not bother looking up protocol information with the OS just to store the default shipped options, r=florian
Flags: needinfo?(gijskruitbosch+bugs)
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/f903fb5ddc62
do not bother looking up protocol information with the OS just to store the default shipped options, r=florian
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 75

Thunderbird has no default handlers so test_check_defaults_get_added does
nothing and test_check_default_modification fails looking for a mailto handler.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/integration/autoland/rev/e48bb64b5188
Disable new test on Thunderbird where it won't pass. r=Gijs
You need to log in before you can comment on or make changes to this bug.