Closed Bug 1363398 Opened 7 years ago Closed 7 years ago

GetShellFolderPath is expensive during startup

Categories

(Toolkit :: Startup and Profile System, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla55
Performance Impact high
Tracking Status
firefox55 --- fixed

People

(Reporter: florian, Assigned: florian)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tpi:+])

Attachments

(2 files)

See this startup profile captured on the quantum reference device where the GetShellFolderPath call takes 366ms (13% of the time before we show the first window) : https://perfht.ml/2qO3cnP

Interestingly, 92.9% of these 366ms is spent in SHParseDisplayName (which triggers main thread IO waiting for a DLL to load), which seems entirely useless for our purpose.

The Windows API called by GetShellFolderPath is SHGetSpecialFolderLocation.
MSDN says "SHGetSpecialFolderLocation is not supported and may be altered or unavailable in the future. Instead, use SHGetFolderLocation" https://msdn.microsoft.com/en-us/library/windows/desktop/bb762203(v=vs.85).aspx

The SHGetFolderLocation page says "Deprecated. [...] As of Windows Vista, this function is merely a wrapper for SHGetKnownFolderIDList."
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762180(v=vs.85).aspx

And finally, the SHGetKnownFolderIDList documentation at:
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762187(v=vs.85).aspx
shows that this function supports a "dwFlags" parameter. By default this is 0, but https://msdn.microsoft.com/en-us/library/windows/desktop/dd378447(v=vs.85).aspx has the list of possible values.
This one seems interesting:

KF_FLAG_SIMPLE_IDLIST

    0x00000100. Build a simple IDList (PIDL) This value can be used when you want to retrieve the file system path but do not specify this value if you are retrieving the localized display name of the folder because it might not resolve correctly.
It will probably be impractical to do anything about Windows loading a DLL under GetShellFolderPath().  Our best option here would be to figure out which DLL is being loaded here and kick off loading it ourselves off the main thread early at startup, I think...
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #1)
> It will probably be impractical to do anything about Windows loading a DLL
> under GetShellFolderPath().  Our best option here would be to figure out
> which DLL is being loaded here and kick off loading it ourselves off the
> main thread early at startup, I think...

It's the SHParseDisplayName call that loads the DLL. We don't need this display name at all.
(In reply to Florian Quèze [:florian] [:flo] from comment #2)
> (In reply to :Ehsan Akhgari (super long backlog, slow to respond) from
> comment #1)
> > It will probably be impractical to do anything about Windows loading a DLL
> > under GetShellFolderPath().  Our best option here would be to figure out
> > which DLL is being loaded here and kick off loading it ourselves off the
> > main thread early at startup, I think...
> 
> It's the SHParseDisplayName call that loads the DLL. We don't need this
> display name at all.

Sure, but the function that _we_ call is GetShellFolderPath() and we can't control what that function does!  And that function isn't even loading a DLL directly, it's just calling a function in a delay loaded DLL, that just gets triggered to load on the spot (see how __delayLoadHelper2 is on the stack).  So it's all just a series of unfortunate events really...
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #3)
> (In reply to Florian Quèze [:florian] [:flo] from comment #2)
> > (In reply to :Ehsan Akhgari (super long backlog, slow to respond) from
> > comment #1)
> > > It will probably be impractical to do anything about Windows loading a DLL
> > > under GetShellFolderPath().  Our best option here would be to figure out
> > > which DLL is being loaded here and kick off loading it ourselves off the
> > > main thread early at startup, I think...
> > 
> > It's the SHParseDisplayName call that loads the DLL. We don't need this
> > display name at all.
> 
> Sure, but the function that _we_ call is GetShellFolderPath() and we can't
> control what that function does!

I'm still unsure if you have fully read comment 0 or not. What I'm saying there is that we should not call that deprecated function, and instead call the modern one that lets us provide a flag specifying that we want only the path and don't need the display name.
Priority: -- → P1
Whiteboard: [qf] → [qf][tpi:+]
Here is a profile where this specific disk I/O is blocking startup for 38s: https://perfht.ml/2psJ4b8
(In reply to Florian Quèze [:florian] [:flo] from comment #4)
> I'm still unsure if you have fully read comment 0 or not. What I'm saying
> there is that we should not call that deprecated function, and instead call
> the modern one that lets us provide a flag specifying that we want only the
> path and don't need the display name.

I'm not sure that would help - from what I see of the implementation of GetShellFolderPath, it would simply be delaying the inevitable.

Once we've called SHGetKnownFolderIDList, we still need to convert that to a string via SHGetPathFromIDList, where you'd probably be forced through that expensive code path anyway. Feel free to prove me wrong with data ;-)

(As an aside, I don't know why we are using two shell API calls instead of a single call to SHGetKnownFolderPath, but that's a separate issue)
Startup main thread I/O = p1
Whiteboard: [qf][tpi:+] → [qf:p1][tpi:+]
Attached patch PatchSplinter Review
Here's a patch following the approach I proposed in comment 0.
Attachment #8869632 - Flags: review?(jmathies)
Assignee: nobody → florian
Status: NEW → ASSIGNED
Component: Widget: Win32 → Startup and Profile System
Product: Core → Toolkit
Comment on attachment 8869632 [details] [diff] [review]
Patch

Review of attachment 8869632 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/xre/nsXREDirProvider.cpp
@@ +1133,5 @@
>    nsresult rv = NS_OK;
>  
>    LPITEMIDLIST pItemIDList = nullptr;
>  
> +  if (SUCCEEDED(SHGetKnownFolderIDList(folder, KF_FLAG_SIMPLE_IDLIST,

Maybe we want to add KF_FLAG_DONT_VERIFY here as well?
Attachment #8869632 - Flags: review?(jmathies) → review+
(In reply to Jim Mathies [:jimm] from comment #9)

> Maybe we want to add KF_FLAG_DONT_VERIFY here as well?

Good catch, thanks!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=28fb8eccec4fac81d3aeed4883ba85d363ac1e42
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/94218d5a75ca
GetShellFolderPath shouldn't fetch folder display names, r=jimm.
https://hg.mozilla.org/mozilla-central/rev/94218d5a75ca
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Unfortunately, this patch didn't help, as this profile shows: https://perfht.ml/2qgmLl6

We are now paying the cost of SHParseDisplayName from SHILAliasTranslate that used to be cheap in the profile from comment 0.

I wonder if adding the KF_FLAG_NO_ALIAS flag would skip the SHILAliasTranslate call.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #8870838 - Flags: review?(jmathies)
(In reply to Florian Quèze [:florian] [:flo] from comment #13)
> Unfortunately, this patch didn't help, as this profile shows:
> https://perfht.ml/2qgmLl6
> 
> We are now paying the cost of SHParseDisplayName from SHILAliasTranslate
> that used to be cheap in the profile from comment 0.
> 
> I wonder if adding the KF_FLAG_NO_ALIAS flag would skip the
> SHILAliasTranslate call.

Does it? A try push and a profile of that build would tell us if this is worth landing.

(In reply to Florian Quèze [:florian] [:flo] from comment #14)
> Created attachment 8870838 [details] [diff] [review]
> add the KF_FLAG_NO_ALIAS flag

Seems like this would be slower since you're asking for a path that has %aliased% strings removed.
(In reply to Jim Mathies [:jimm] from comment #15)
> (In reply to Florian Quèze [:florian] [:flo] from comment #13)
> > Unfortunately, this patch didn't help, as this profile shows:
> > https://perfht.ml/2qgmLl6
> > 
> > We are now paying the cost of SHParseDisplayName from SHILAliasTranslate
> > that used to be cheap in the profile from comment 0.
> > 
> > I wonder if adding the KF_FLAG_NO_ALIAS flag would skip the
> > SHILAliasTranslate call.
> 
> Does it? A try push and a profile of that build would tell us if this is
> worth landing.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ec215281081a0c111e3923f23a0b1f83e523d0af

It may take more than one profile to be sure, as the IO when loading the library is a bit random.

> (In reply to Florian Quèze [:florian] [:flo] from comment #14)
> > Created attachment 8870838 [details] [diff] [review]
> > add the KF_FLAG_NO_ALIAS flag
> 
> Seems like this would be slower since you're asking for a path that has
> %aliased% strings removed.

Well, my guess was that it starts with the actual file paths and then SHILAliasTranslate inserts the alias patterns... but that's not documented anywhere, so it's only a guess that could be completely wrong.
I tried cold startup 3 times with the try server build from comment 16, and this was cheap the 3 times. SHParseDisplayName was visible in only one of the profiles, and with a single record (1ms). So either this additional patch fixes the bug, or I've been very lucky in these 3 profiles, or... the lack of xul.pdb symbols for the try build confused the profiler and caused me to not see the right things. So, I would like to try landing this patch.

That said, if I had rtfm'ed a bit more earlier, I would have seen that SHParseDisplayName isn't related at all to the display names, but is "to convert a string to a pointer to an item identifier list (PIDL)" (https://msdn.microsoft.com/fr-fr/library/windows/desktop/bb762236(v=vs.85).aspx).

So the way to avoid SHParseDisplayName would be to call SHGetKnownFolderPath instead of SHGetKnownFolderIDList. I no longer think SHParseDisplayName is what's expensive though; it's just a function that happens to be in the stack.
Attachment #8870838 - Flags: review?(jmathies) → review+
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6feb3d6c9b10
add the KF_FLAG_NO_ALIAS flag to the SHGetKnownFolderIDList call, r=jimm.
https://hg.mozilla.org/mozilla-central/rev/6feb3d6c9b10
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Here are a few new cold startup profiles:
https://perfht.ml/2qLjNVu
https://perfht.ml/2qLzhsi
https://perfht.ml/2rgJBvW
https://perfht.ml/2rgJGzU

GetShellFolderPath is still more expensive than I would like, but I don't see any obvious expensive main thread IO anymore. SHParseDisplayName seems to be about a third of what remains, so calling SHGetKnownFolderPath instead of SHGetKnownFolderIDList may be a good thing to do in a follow-up.
(In reply to Florian Quèze [:florian] [:flo] from comment #20)
> GetShellFolderPath is still more expensive than I would like, but I don't
> see any obvious expensive main thread IO anymore. SHParseDisplayName seems
> to be about a third of what remains, so calling SHGetKnownFolderPath instead
> of SHGetKnownFolderIDList may be a good thing to do in a follow-up.

Have you already filed a bug for this?
Depends on: 1372152
(In reply to Marco Castelluccio [:marco] from comment #21)
> (In reply to Florian Quèze [:florian] [:flo] from comment #20)
> > GetShellFolderPath is still more expensive than I would like, but I don't
> > see any obvious expensive main thread IO anymore. SHParseDisplayName seems
> > to be about a third of what remains, so calling SHGetKnownFolderPath instead
> > of SHGetKnownFolderIDList may be a good thing to do in a follow-up.
> 
> Have you already filed a bug for this?

Just filed bug 1372152 and attached a patch.
Performance Impact: --- → P1
Whiteboard: [qf:p1][tpi:+] → [tpi:+]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: