Closed Bug 1337058 Opened 3 years ago Closed 3 years ago

Switch the PContent FindPlugins message to happen asynchronously

Categories

(Core :: Plug-ins, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: ehsan, Assigned: qdot)

References

(Blocks 1 open bug)

Details

(Whiteboard: [qf:investigate][bhr])

Attachments

(1 file)

PContent::Msg_FindPlugins can take a long time:

<https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-02-05&keys=PContent%253A%253AMsg_FindPlugins!PContent%253A%253AMsg_RpcMessage!PCookieService%253A%253AMsg_GetCookieString!PAPZCTreeManager%253A%253AMsg_ReceiveMouseInputEvent&max_channel_version=nightly%252F54&measure=IPC_SYNC_LATENCY_MS&min_channel_version=null&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-02-02&table=0&trim=1&use_submission_date=0>

	End	IPC_SYNC_LATENCY_MS Count
0	1	0 (0%)
1	3	164.87k (58.95%)
3	8	42.98k (15.37%)
8	20	27.85k (9.96%)
20	50	18.06k (6.46%)
50	126	16.97k (6.07%)
126	317	6.79k (2.43%)
317	796	1.31k (0.47%)
796	2k	626 (0.22%)
2k	Infinity	216 (0.08%)

Now that we only support Flash, we should be able to eliminate this cost by only finding that one plugin asynchronously.
Once you need Flash, we pretty much have to have it synchronously, for webcompat reasons that are very hard to fix. But I do think what we can do is find it at/near startup and send it to content asynchronously, so that by the time we decide we need it we already have the information.
We can also send the information on the command line if we suspect to need it pretty early at startup...
Note the slow message is Msg_FindPlugins, not Msg_LoadPlugin.
Whiteboard: [qf:p1]
qDot, can you spend a bit of time to see if a page checking for the presence or existence of the Flash plugin invokes this slow sync IPC message? Benjamin and Ehsan had an email thread going with some ideas that they'll forward to you.

(Not trying to take this away from Benjamin but he's got lots of other things on the go)
Flags: needinfo?(kyle)
Whiteboard: [qf:p1] → [qf:investigate]
Copying in details from an email thread:

The hard part here is doing this in a way that actually improves startup performance. Here's the chain of relatively blocking work:

* Content needs block on knowing whether there is a Flash plugin and what to do with PDF loads in the following cases:
** the Flash or PDF mime type is loaded into a frame
** the page access navigator.mimeTypes or navigator.plugins
** the page uses an <embed> or <object> with those MIME types

If we naively just send this data as we launch the content process, that means we can't launch a content process until we scan for plugins in chrome. We shouldn't be doing this right at startup (although we might be! I just debugged a stupid early init of the plugin code) because it's currently main thread I/O and really expensive.

And at the same time, we currently have this arbitrary mechanism that assumes plugins can load *any* MIME type, when in fact they can only possibly load Flash and PDF. So we should hardcode that and short-circuit early plugin init when it's not necessary.

So what we really should do is launch content immediately, load the plugin list asynchronously in chrome, send it async from chrome->content, and only block on it in content if we hit a site that needs to block. With Flash blocking in place in 54/55 most sites won't be blocking on Flash. Otherwise I think we'll merely end up moving the delay around but not actually making much better.

Hmm, one question though.  Even with the Flash blocking in place, we still expect this code path to be hit if the page accesses navigator.plugins/mimeTypes, right?  If that is true, then we shouldn't expect the Flash blocking going into place to help with this performance problem.  :(
	
<ehsan>    Hmm, one question though.  Even with the Flash blocking in place, we still expect this code path to be hit if the page accesses navigator.plugins/mimeTypes, right?  If that is true, then we shouldn't expect the Flash blocking going into place to help with this performance problem.  :(

<bsmedberg> *If* the site is on the Flash blacklist (hopefully most advertising), then it wouldn't trigger loading plugins.

We don't know how many sites that's going to leave touching navigator.plugins/mimeTypes. So yeah, I don't know how much that delay the problem or whether that will be enough.
Assinging while I take a look at how much work this would be. If nothing else, fixing that MIME type check might be worth something.
Assignee: nobody → kyle
Flags: needinfo?(kyle)
NI?'ing myself just to keep bug on list.
Flags: needinfo?(kyle)
This profile sort of suggests to me that this sync IPC may happen when HTMLObjectElement tries to figure out whether the plugin type is supported :( https://perfht.ml/2neBTk6
See Also: → 1347666
Just once though, right?
Flags: needinfo?(ehsan)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #9)
> Just once though, right?

Yes, one time on this code path.  The more interesting question is, how bad is this sync IPC based on our telemetry data, as this profile may not cover all of the code paths in which this may get hit.  Right now, this is #29 worst sync IPC that we have, with a median time of 13ms (roughly equal to a frame budget at 60fps.)
Flags: needinfo?(ehsan)
The BHR data like this https://people-mozilla.org/~mlayzell/bhr/20170405/164.html#1 and this https://people-mozilla.org/~mlayzell/bhr/20170405/36.html#0 suggests that the Msg_LoadPlugin message can block both main and content process for more than 8 seconds. I'm not sure if this is one time cost or we have to load the plugin many times. If it's one time cost per-content process then I think it is still bad experience.
Hmm, this is weird. We have telemetry for time spent blocking when doing a module/stream/etc loading (https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-03-29&keys=Shockwave%2520Flash25.0.0.127!Shockwave%2520Flash24.0.0.221!Shockwave%2520Flash25.0.0.148!Shockwave%2520Flash24.0.0.194&max_channel_version=release%252F52&measure=BLOCKED_ON_PLUGIN_INSTANCE_INIT_MS&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-02-27&table=0&trim=1&use_submission_date=0 for instance), and the majority of our blocks there are in the < 100ms range. Still not great, but nothing as drastic as 8s.

In terms of getting this fixed, I spent a while fixing calls to FindPlugin so that it only happens on the extremely small set of mime types we support. I think we're going to run into problems optimizing navigator.mimetypes/navigator.plugins because a LOT of analytics libraries call those, so you end up getting plugin list fetches any time you visit google.com or whatever. I suppose we can try to load async early still and block on that if absolutely necessary, just to try to get those times down a bit more.
Kan-Ru, what do you think of comment 12?
Flags: needinfo?(kchen)
The telemetry shows that 0.06% population can fall into the 8s range. The data from BHR seems also only have small number of reports. But when we switch the BHR to collect shorter hangs I think this will still popup.

I'm not familiar with plugin loading so I will defer this to qDot. I agree navigator.mimetypes and navigator.plugins will be read a lot more often than actually use the plugin. Does reading navigator.plugins require load the plugin? Assuming this is one time cost per-page-load it would make sense to optimize for page load.
Flags: needinfo?(kchen)
Ok. We need to divide this out into 2 bugs, keeping this one for Msg_FindPlugins, which is the sync message called by navigator.mimetypes/plugins, and making a new one for Msg_LoadPlugin, which is the slow message we're seeing in that bhr stack. I'll do that and change the title of this bug, since this has been confusing all along.

For FindPlugins, we're certainly calling it less now that we're only looking for one MIME type, but we still need to do a startup check for flash and cache that information to send to all new content processes. I figure that patch set will finish out this bug, and I can look at LoadPlugins after that (though as Comment #1 says, we may not get very far with that).
Flags: needinfo?(kyle)
Summary: Switch nsPluginHost::LoadPlugins to happen asynchronously → Switch the PContent FindPlugins message to happen asynchronously
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #14)
> I'm not familiar with plugin loading so I will defer this to qDot. I agree
> navigator.mimetypes and navigator.plugins will be read a lot more often than
> actually use the plugin. Does reading navigator.plugins require load the
> plugin? Assuming this is one time cost per-page-load it would make sense to
> optimize for page load.

Did you see comment 8?

See this call stack from that profile:
mozilla::dom::PContentChild::SendFindPlugins(unsigned int const &,nsresult *,nsTArray<mozilla::plugins::PluginTag> *,unsigned int *)
PContent::Msg_FindPlugins
nsPluginHost::FindPluginsInContent(bool,bool *)
nsPluginHost::FindPlugins(bool,bool *)
nsPluginHost::ReloadPlugins()
nsWebNavigationInfo::IsTypeSupported(nsACString const &,nsIWebNavigation *,unsigned int *)
nsObjectLoadingContent::IsSupportedDocument(nsCString const &)
nsObjectLoadingContent::GetTypeOfContent(nsCString const &)
nsObjectLoadingContent::UpdateObjectParameters(bool)
nsObjectLoadingContent::LoadObject(bool,bool,nsIRequest *)
mozilla::dom::HTMLSharedObjectElement::StartObjectLoad(bool,bool)
nsObjectLoadingContent::LoadFallback(nsObjectLoadingContent::FallbackType,bool)
nsObjectLoadingContent::LoadObject(bool,bool,nsIRequest *)
mozilla::dom::HTMLObjectElement::StartObjectLoad(bool,bool)
mozilla::dom::HTMLObjectElement::DoneAddingChildren(bool)
nsHtml5TreeBuilder::elementPopped(int,nsIAtom *,void *)
nsHtml5TreeBuilder::pop()
nsHtml5TreeBuilder::endTag(nsHtml5ElementName *)
nsHtml5Tokenizer::emitCurrentTagToken(bool,int)
nsHtml5Tokenizer::stateLoop<nsHtml5SilentPolicy>(int,char16_t,int,char16_t *,bool,int,int)
nsHtml5Tokenizer::tokenizeBuffer(nsHtml5UTF16Buffer *)
nsHtml5StringParser::Tokenize(nsAString const &,nsIDocument *,bool)
nsContentUtils::ParseFragmentHTML(nsAString const &,nsIContent *,nsIAtom *,int,bool,bool)
mozilla::dom::FragmentOrElement::SetInnerHTMLInternal(nsAString const &,mozilla::ErrorResult &)
mozilla::dom::ElementBinding::set_innerHTML
mozilla::dom::GenericBindingSetter(JSContext *,unsigned int,JS::Value *)
js::InternalCallOrConstruct(JSContext *,JS::CallArgs const &,js::MaybeConstruct)
js::Call(JSContext *,JS::Handle<JS::Value>,JS::Handle<JS::Value>,js::AnyInvokeArgs const &,JS::MutableHandle<JS::Value>)
js::CallSetter(JSContext *,JS::Handle<JS::Value>,JS::Handle<JS::Value>,JS::Handle<JS::Value>)
SetExistingProperty
js::NativeSetProperty(JSContext *,JS::Handle<js::NativeObject *>,JS::Handle<jsid>,JS::Handle<JS::Value>,JS::Handle<JS::Value>,js::QualifiedBool,JS::ObjectOpResult &)
SetPropertyOperation
Interpret
js::RunScript(JSContext *,js::RunState &)
a.a.ae/<
k
a.e.a
a.a.ae
http://o.aolcdn.com/os/moat/prod/moatuac.js
k
a.e.a
http://o.aolcdn.com/os/moat/prod/moatuac.js
http://o.aolcdn.com/os/moat/prod/moatuac.js
js::RunScript
js::ExecuteKernel(JSContext *,JS::Handle<JSScript *>,JSObject &,JS::Value const &,js::AbstractFramePtr,JS::Value *)
js::Execute(JSContext *,JS::Handle<JSScript *>,JSObject &,JS::Value *)
ExecuteScript
ExecuteScript
JS_ExecuteScript(JSContext *,JS::AutoObjectVector &,JS::Handle<JSScript *>)
nsJSUtils::EvaluateString(JSContext *,JS::SourceBufferHolder &,JS::Handle<JSObject *>,JS::CompileOptions &,nsJSUtils::EvaluateOptions const &,JS::MutableHandle<JS::Value>,void * *)
nsJSUtils::EvaluateString
nsJSUtils::EvaluateString(JSContext *,JS::SourceBufferHolder &,JS::Handle<JSObject *>,JS::CompileOptions &,void * *)
nsScriptLoader::EvaluateScript(nsScriptLoadRequest *)
nsScriptLoader::ProcessRequest(nsScriptLoadRequest *)
nsScriptLoader::ProcessOffThreadRequest(nsScriptLoadRequest *)
`anonymous namespace'::NotifyOffThreadScriptLoadCompletedRunnable::Run
mozilla::ValidatingDispatcher::Runnable::Run()
nsThread::ProcessNextEvent(bool,bool *)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *)
mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate *)
MessageLoop::RunHandler()
MessageLoop::Run()
nsBaseAppShell::Run()
nsAppShell::Run()
XRE_RunAppShell()
mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate *)
MessageLoop::RunHandler()
MessageLoop::Run()
XRE_InitChildProcess(int,char * * const,XREChildData const *)
Startup::XRE_InitChildProcess
content_process_main(mozilla::Bootstrap *,int,char * * const)
NS_internal_main(int,char * *,char * *)
wmain
__scrt_common_main_seh
BaseThreadInitThunk
RtlUserThreadStart
(root)
Whiteboard: [qf:investigate] → [qf:investigate][bhr]
For quantum I am much more worried about FindPlugins than LoadPlugin. LoadPlugin will only happen when somebody *uses* a plugin, which will be much less often when we ship CtP which is a blocker for quantum release.
What counts as a plugin here?  Does that include jsplugins and PDF.js for example?
AFAIK, just NPAPI plugins. pdf.js has different loading logic.
Right now, we're down to Flash as the last NPAPI plugin, though gecko doesn't completely reflect it yet. The only thing that seems like it's difficult about FindPlugins is the instance where Flash is installed/uninstalled during a browser session.

So, can we just detect flash on browser startup, send that info on content process creation, then if there's some sort of successful plugin uninstall/install event, we just fire a async parent -> content message to update the cached value accordingly? I'm not familiar with the plugin un/install process at all, so I don't know if this is feasible or will work in all cases. If this solution does work, it seems like it will clear up the constant queries at least.
Flags: needinfo?(benjamin)
Yes, that sounds like a good plan.
Flags: needinfo?(benjamin)
I now have LoadPlugins running on content process startup, and from there we just send a list of plugins to all live ContentChild objects. Outside of the possible issue of affecting browser startup time, this seems to work well. Here's the telemetry for FindPlugins timing in 52:

(https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-03-28&keys=__none__!__none__!__none__&max_channel_version=release%252F52&measure=FIND_PLUGINS&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-03-02&table=0&trim=1&use_submission_date=0)

So hopefully we're not delaying most startups more than 100ms, though the calls that require 1-3s are somewhat worrisome. 

After startup, I'm not actually sure there's a way to add/remove flash without completely shutting down the browser now. On Windows and macOS, the NPAPI installer requires the process to be stopped before installation and removal, and on linux, it's either package or manual installation anyways so we can figure the user have a vague idea of what they're doing. I'm not sure there's going to be a case in the browser session where the pluginlist can change now. If there aren't, that'll make this code pretty simple. Any thoughts, bsmedberg?
Flags: needinfo?(benjamin)
(In reply to Kyle Machulis [:qdot] [:kmachulis]  (if a patch has no decent commit message, automatic r-) from comment #22)

> After startup, I'm not actually sure there's a way to add/remove flash
> without completely shutting down the browser now.

There absolutely is. Flash updater on Windows runs in the background while Firefox is running. That is in fact why the Flash player is shipped with versioned DLL names: NPSWF32_25_0_0_171.dll etc so that the updater installs the new one but doesn't remove the old one. We see/load/start using the new one in the following cases:

page runs navigator.plugins.refresh
user opens the addon manager plugins pane
user open about:plugins

At that point there are multiple plugin tags for Flash, and we pick the one with the newer version here: http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/dom/plugins/base/nsPluginHost.cpp#1092
Flags: needinfo?(benjamin)
Ugh, I kept thinking my solution seemed too easy.

On top of the cases you listed, we call ReloadPlugins() every time we get a call to nsWebNavigationInfo::IsTypeSupported, which basically happens on all tags where we load a plugin. NPAPI itself seems to have a reload call too.

Making this async is going to be difficult, because the expectation is that we'll look for/replace plugins, find the new one, and return it as needed synchronously. I could make ReloadPlugins asynchronous, but that'd just make it so that the /next/ load of the plugin after the one we're doing would load the correct one. I'm not sure what the flash updater UI is like either. Is this the user notified that a new version of flash has been installed?
Flags: needinfo?(benjamin)
In general we don't need to be precise about it, and I really don't think that in general we should be calling ReloadPlugins from IsTypeSupported. So I'd support making most of these cases lazy and the next load would pick up an update.

The only exception might be an explicit call to navigator.plugins.refresh(). If we could make that block and nothing else I'd be fine with it.
Flags: needinfo?(benjamin)
The Flash updater is silent by default.
Ah, ok, cool. I was just worried about the case of the updater showing that there was a new plugin installed and it not actually coming up. I'll decouple reloadplugins/findplugins on content then.
While this passes try at the moment (or at least did yesterday), I'm not sure how we're going to test this. I see there are already tests for doing ReloadPlugins calls and what not, but I wasn't seeing anything for doing background plugin upgrade tests, and I don't think that'd be easy to implement now. If I'm missing something more trivial, just let me know.
Comment on attachment 8870564 [details]
Bug 1337058 - Remove FindPlugins IPC message;

https://reviewboard.mozilla.org/r/142012/#review146238

Going to mark r- based on not-understanding, which means this at least needs a detailed comment somewhere explaining the flow.

::: dom/ipc/ContentChild.cpp:3486
(Diff revision 1)
>    screenManager.Refresh(Move(aScreens));
>    return IPC_OK();
>  }
>  
> +mozilla::ipc::IPCResult
> +ContentChild::RecvSetPluginList(const uint32_t& aPluginEpoch, nsTArray<plugins::PluginTag>&& aPluginTags)

Why is this a different message from RecvReloadPlugins? If we have a new plugin epoch, wouldn't the plugin host automatically reload?

I think I'm missing some subltety here.

::: dom/plugins/base/nsPluginHost.cpp:445
(Diff revision 1)
> +  if (XRE_IsParentProcess())
> +  {
> +    // If the plugin list changed, update content.
> +    SendPluginsToContent();
> +    nsTArray<dom::ContentParent*> parents;
> +    dom::ContentParent::GetAll(parents);

Same subtlety here, this feels like it should be completely unnecessary if we've already done SendSetPluginList.
Attachment #8870564 - Flags: review?(benjamin) → review-
Ah, you're totally right. I implemented the SetPluginList IPC, then hit the reload problem and spent a while on it because it touched more places in the browser. Didn't connect that I'd basically done the same thing twice. I'll get that cleaned up. Thanks!
Attachment #8870564 - Flags: review?(benjamin)
Clearing review request. This bug conflicted hard with the landing of bug 558184, and my conflict resolution caused some errors that try just caught. Working on fixing those now.
Comment on attachment 8870564 [details]
Bug 1337058 - Remove FindPlugins IPC message;

https://reviewboard.mozilla.org/r/142012/#review149224

r=me if you can move that logic to the constructor, or ping me so I understand that detail.

::: dom/plugins/base/nsPluginHost.cpp:331
(Diff revision 3)
>    if (!sInst) {
>      sInst = new nsPluginHost();
>      if (!sInst)
>        return nullptr;
>      NS_ADDREF(sInst);
> +    if (XRE_IsParentProcess())

It feels weird to be doing this in the GetInst method instead of the constructor. Is there a reason this has to be separate?
Attachment #8870564 - Flags: review?(benjamin) → review+
Comment on attachment 8870564 [details]
Bug 1337058 - Remove FindPlugins IPC message;

https://reviewboard.mozilla.org/r/142012/#review149224

> It feels weird to be doing this in the GetInst method instead of the constructor. Is there a reason this has to be separate?

Oh, yeah, can be moved to end of constructor, and replaces epoch reset since we'll be setting the new epoch as part of the LoadPlugins call. Fixed.
Pushed by kmachulis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0ff4ad45f99c
Remove FindPlugins IPC message; r=bsmedberg
https://hg.mozilla.org/mozilla-central/rev/0ff4ad45f99c
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Depends on: 1370237
Backed out for causing bug 1370237.
https://hg.mozilla.org/mozilla-central/rev/3cfe81678881
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla55 → ---
Pushed by kmachulis@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/42c1a8b6c897
Remove FindPlugins IPC message; r=bsmedberg
https://hg.mozilla.org/mozilla-central/rev/42c1a8b6c897
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.