resource:///modules/WindowsJumpLists.jsm still does main thread I/O
Categories
(Firefox :: Shell Integration, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox122 | --- | fixed |
People
(Reporter: florian, Assigned: mconley)
References
(Blocks 1 open bug)
Details
(Keywords: main-thread-io, perf, perf:responsiveness, Whiteboard: [bhr:JumpList])
Attachments
(6 files, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
See this profile captured on the 2018 ref hardware: https://perfht.ml/2DXOTBP
Many main thread I/O syscalls caused by our calls to CShellLink::SetPath, and some more caused by mozilla::widget::FaviconHelper::ObtainCachedIconFile calling nsLocalFile::Exists.
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Reporter | ||
Comment 1•5 years ago
|
||
I thought the JumpList slowness was only due to main thread I/O, but I captured yesterday a profile on average user hardware with an SSD where mozilla::widget::JumpListBuilder::AddListToBuild blocked for almost 3s: https://perfht.ml/2Lxx9SC
Comment 2•5 years ago
|
||
(In reply to Florian Quèze [:florian] from comment #1)
I thought the JumpList slowness was only due to main thread I/O, but I captured yesterday a profile on average user hardware with an SSD where mozilla::widget::JumpListBuilder::AddListToBuild blocked for almost 3s: https://perfht.ml/2Lxx9SC
Looks like Windows does some IO behind the scenes, as well as blocking on communicating with some other thread...
Comment 3•4 years ago
|
||
Sylvestre, I was talking to Jim Mathies who originally worked on a lot of this code. He suggested gcp might have an opinion about who could look at this, he suggested Chris Martin. It certainly seems like this is an OS integration issue so I'm referring this to you for now, so we can hopefully move this forward. This is frequently seen in BHR data and this type of main thread I/O is something we really would prefer not to see.
Comment 4•4 years ago
|
||
How important it is ?
As it has been occurring for a long time, I am wondering what should be the timeline here.
thanks
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Updated•4 years ago
|
Updated•3 years ago
|
Comment 5•3 years ago
|
||
He suggested gcp might have an opinion about who could look at this, he suggested Chris Martin. It certainly seems like this is an OS integration issue
FWIW this kind of shell integration stuff is a bit closer to the Desktop Integration team from Amir. So it's actually with the right triage owner in Bugzilla. Amir told me they can take it but they're not clear how to prioritize. Looking at bug 1667807, it looks like we considered getting that data but haven't yet.
Comment 6•3 years ago
|
||
Arguably bug 1667807 could be a good first/second bug.
Reporter | ||
Comment 7•3 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #5)
Amir told me they can take it but they're not clear how to prioritize.
From BHR data (covering only the Nightly user population) we can say that:
- this is consistently among the 10 most frequent hangs (depending on the day it seems to be somewhere between #6 and #9).
- this represents about 1% of the total hang time
- the average hang time is between 500ms and 1s (but the average isn't really meaningful; the median would be more interesting but this data isn't easily accessible). We only collect hang reports when the main thread has been blocked for more than 128ms.
- for about 30-40% of the hang reports, the user tried to interact with the browser while it was unresponsive (Ie. this is causing actual user pain.)
Updated•3 years ago
|
Comment 8•3 years ago
|
||
Mike, you were poking at this recently - are you planning to fix this?
Assignee | ||
Comment 9•3 years ago
|
||
Yes, I was wanting to look at this during my streams, but only if it's not in the critical path. If somebody else wants to look at this and they're on a tight timeline, they're welcome to take this.
Comment 10•3 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #9)
Yes, I was wanting to look at this during my streams, but only if it's not in the critical path. If somebody else wants to look at this and they're on a tight timeline, they're welcome to take this.
Go right ahead :). I was just asking because this came up in triage.
Updated•3 years ago
|
Assignee | ||
Comment 11•2 years ago
|
||
Assignee | ||
Comment 12•2 years ago
|
||
Hey mhowell! If you have a moment, I'm hoping for a quick consult on something I've been working on off the side of my desk (the patch for this bug).
What I have here is a WIP patch that adds a simpler interface for updating the Windows jump list off of the main thread. The old mechanism basically exposed the details of the win32 jump list API over XPCOM. My new mechanism abstracts over a lot of that complexity, and this makes it so that (in theory) it can run entirely off of the main thread.
So it needs a bit of a cleanup still, but one of the things I've been mulling over here is testing. I want to write a GTest for this stuff to make sure that when I call populateJumpList on nsIJumpListBuilder, the right things get called on the win32 API to actually write to the jump list.
Unfortunately (as you may know), it doesn't seem like there's a great way of mocking / stubbing out the jump list API, nor instrumenting it (at least not that I can tell).
What I'm doing instead is creating a thin wrapper around the API calls with a class called JumpListManagerBase, and then passing the arguments through to the underlying API outside of the test. During my GTest, what I'm hoping to do is to have an implementation that simply records what is being passed to those methods so that the GTest can then get a dump of those things to do assertions on.
So my questions are:
- Is there a better way of designing a test for this? Am I going about this all wrong?
- I still want to keep the old API around until I'm confident we can switch over. This means that there's mechanisms that might attempt to write to the jump list on the main thread, and ones that don't. I'm (re)using a lock for this, and trying to pass that lock down into my JumpListManagerBase as a way of reducing lock acquisition overhead. Is there a better way of doing this?
- Originally, I was constructing the "real" JumpListManagerBase implementation with the underlying IID_ICustomDestinationList thing in a constructor. The problem is that the IID_ICustomDestinationList thing is constructed off of the main thread, and so I don't think I can destroy it on the main thread (unless I transfer ownership of it to the main thread?)... so I have this SetNativeDestinationList thing now so that I can construct the
JumpListManagerBase
on the main thread, but then poke in the reference to the IID_ICustomDestinationList after the fact.
Anyhow, this all feels pretty brittle and scary, and I was wondering if you had any ideas about how I could make this whole thing better?
Comment 13•2 years ago
•
|
||
- Ideally, we would have a good way to mock the OS API itself at the OS level (that is, in the registry) and substitute our own implementations of those interfaces that just do logging. I don't think we're going to find an easy and non-extremely hacky way to do that though. So I think what you've got is probably a good move, to just use our own wrapper thing that we dependency inject an implementation of.
2 and 3. So, we should only need locking here to protect our own objects; I looked up the registrations for these MSCOM objects and they're registered with ThreadingModel=Apartment, which means the COM runtime is already synchronizing/queuing those calls for us. It should not matter whether we're making calls from the main thread or not (provided whatever thread we're on has had a good CoInitialize call done for it already). Consequently, I also don't think it matters which thread issues the final Release call, it should always get marshaled to the object's apartment thread.
Assignee | ||
Comment 14•2 years ago
|
||
Assignee | ||
Comment 15•2 years ago
|
||
Depends on D181674
Assignee | ||
Comment 16•1 year ago
|
||
Depends on D181675
Assignee | ||
Comment 17•1 year ago
|
||
We use a mock JumpListBackend to act as a stand-in for the win32 API
for populating the JumpList, and test that it gets called with the
expected arguments for each major function of JumpListBuilder.
Depends on D189406
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Assignee | ||
Comment 18•1 year ago
|
||
According to the Microsoft documentation, this is the maximum length
of the string allowed for the description property of an IShellLinkW.
This also adds a test for long descriptions to ensure that they are
properly truncated.
This patch also:
-
Adjusts the tests a little bit to be less race-y, particularly with the
SetAppID routine in the JumpListBuilder constructor. -
Fixes the issue where the tests weren't properly copying in the intended
strings for test titles and descriptions.
Depends on D189407
Assignee | ||
Comment 19•1 year ago
|
||
Depends on D193693
Comment 20•1 year ago
|
||
Comment 21•1 year ago
|
||
Backed out for causing build bustages.
Backout link: https://hg.mozilla.org/integration/autoland/rev/8e73161375522f85a0ae7bd34a7eeed4b2c88703
Comment 22•1 year ago
|
||
Assignee | ||
Updated•1 year ago
|
Comment 23•1 year ago
|
||
Backed out for causing windows non-unified build bustages.
Failure log: https://treeherder.mozilla.org/logviewer?job_id=437360817&repo=autoland
Backout link: https://hg.mozilla.org/integration/autoland/rev/b75265b3f7ab70c294cf282b27e1bf8da8ff9906
Assignee | ||
Updated•1 year ago
|
Comment 24•1 year ago
|
||
Comment 25•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/129269188fef
https://hg.mozilla.org/mozilla-central/rev/d9a6f2ae662c
https://hg.mozilla.org/mozilla-central/rev/aca991ef2a24
https://hg.mozilla.org/mozilla-central/rev/03e7a619a2e5
https://hg.mozilla.org/mozilla-central/rev/0ed4c3d6d3e5
https://hg.mozilla.org/mozilla-central/rev/fb474adbcaab
Description
•