Closed
Bug 1312540
Opened 7 years ago
Closed 7 years ago
PContent::Msg_GetGMPPluginVersionForAPI sends a sync IPC message to the parent and causes main thread IO
Categories
(Core :: Audio/Video: GMP, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla52
Tracking | Status | |
---|---|---|
firefox52 | --- | fixed |
People
(Reporter: mconley, Assigned: cpearce)
References
Details
Attachments
(4 files)
58 bytes,
text/x-review-board-request
|
mozbugz
:
review+
billm
:
review+
gchang
:
approval-mozilla-beta-
|
Details |
58 bytes,
text/x-review-board-request
|
mozbugz
:
review+
gchang
:
approval-mozilla-beta-
|
Details |
58 bytes,
text/x-review-board-request
|
mozbugz
:
review+
gchang
:
approval-mozilla-beta-
|
Details |
58 bytes,
text/x-review-board-request
|
mconley
:
review+
gchang
:
approval-mozilla-beta-
|
Details |
In bug 1310255, we've been examining stacks from BHR for hangs that bog down tab switch. This (pseudo)stack: Startup::XRE_InitChildProcess nsInputStreamPump::OnInputStreamReady nsInputStreamPump::OnStateStart PContent::Msg_GetGMPPluginVersionForAPI Showed up a bunch. As of this writing (and it's only a day's worth of data, but still), this is #3 in things stalling tab switch. We should avoid sending a sync message here if we can avoid it - and if this is causing main thread IO in the parent process (I haven't looked but I trust billm's analysis), we should definitely avoid that as well.
Reporter | ||
Comment 1•7 years ago
|
||
Hey peterv, I believe you added the sync GetGMPPluginVersionForAPI message in bug 1057908. Any ideas on how we could avoid these sync messages entirely?
Flags: needinfo?(peterv)
Reporter | ||
Updated•7 years ago
|
Summary: PContent::Msg_GetGMPPluginVersionForAPI sends a sync message to the parent and causes main thread IO → PContent::Msg_GetGMPPluginVersionForAPI sends a sync IPC message to the parent and causes main thread IO
Comment 2•7 years ago
|
||
(In reply to Mike Conley (:mconley) - (high latency on reviews and needinfos) from comment #1) > I believe you added the sync GetGMPPluginVersionForAPI message in bug > 1057908. Any ideas on how we could avoid these sync messages entirely? I don't know, IIRC a lot of the callers needed the answer synchronously. Chris, any idea? We might be able to cache some of this data in both processes (in child process for avoiding IPC and in the main process for avoiding main-thread IO).
Flags: needinfo?(peterv) → needinfo?(cpearce)
Assignee | ||
Comment 3•7 years ago
|
||
The callers do need an answer synchronously. We could certainly build caches in each process of the GMP plugin version info. There is a few infrastructure changes we'd need to make in the media stack to make that work. The question is, how urgent is this? I understand tab spinners are a bad thing, but I'd like to understand how harmful a #3 tab spinner is? That will help us decide how urgently to fix this.
Flags: needinfo?(cpearce)
Reporter | ||
Comment 4•7 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #3) > The question is, how urgent is this? I understand tab spinners are a bad > thing, but I'd like to understand how harmful a #3 tab spinner is? That will > help us decide how urgently to fix this. Sure, I can try to answer that. For the samples we've collected over the past few days, the #1 cause for tab spinners is not very illuminating: "Startup::XRE_InitChildProcesses" is the lone pseudostack entry, which essentially tells us that the content process is indeed running, but that we're running inside something that has no pseudostack labels. Not very useful. This accounts for 60% of the stacks, but once we get the right labels landed, this could either resolve to be one thing, or (more likely) several things that accumulate to 60% of the top 3 tab switch hangs (and ~28% of all hangs we've gathered). The #2 cause for tab switch spinners appear to be things kicked off by unknown chrome scripts. Because we have JS interruption to force painting, I have to assume these chrome scripts are calling into something that is not necessarily interruptible (Maybe js-ctypes for example). Hard to tell at this point, but I'm hoping that once bug 1312597 is in Nightly, it'll become a bit clearer. These unknown chrome scripts account for 20% of the top 3 hangs (and ~9% of all hangs we've gathered) #3 is PContent::Msg_GetGMPPluginVersionForAPI, and this accounts for the last 20% of the top 3 tab switch hangs (and ~9% of all hangs we've gathered). So we're still refining BHR and our instrumentation to help us figure out the top 2 causes. PContent::Msg_GetGMPPluginVersionForAPI is currently the only one that we feel is actionable at this point until we get more data. Does that help illuminate how urgent this is? It's definitely not the sole cause, but it's also currently one of the top causes, and the only one that is currently actionable.
Flags: needinfo?(cpearce)
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Mike Conley (:mconley) - (high latency on reviews and needinfos) from comment #4) > Does that help illuminate how urgent this is? It's definitely not the sole > cause, but it's also currently one of the top causes, and the only one that > is currently actionable. So 20% of tab spinners are caused by PContent::Msg_GetGMPPluginVersionForAPI. Do you have a feel for how common tab spinners are? That is, do you have a graph that says (say) 5% of all tab switches hit a spinner, and of that 20% are caused by PContent::Msg_GetGMPPluginVersionForAPI?
Flags: needinfo?(cpearce) → needinfo?(mconley)
Reporter | ||
Comment 6•7 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #5) > (In reply to Mike Conley (:mconley) - (high latency on reviews and > needinfos) from comment #4) > > Does that help illuminate how urgent this is? It's definitely not the sole > > cause, but it's also currently one of the top causes, and the only one that > > is currently actionable. > > So 20% of tab spinners are caused by > PContent::Msg_GetGMPPluginVersionForAPI. > > Do you have a feel for how common tab spinners are? That is, do you have a > graph that says (say) 5% of all tab switches hit a spinner We have a dashboard here: https://chutten.github.io/bug1310250/ Currently, the data suggests that 4% of all tab switches across all telemetry pings we've received, result in a spinner. I don't know enough about your current slate of work to tell you how urgent it is in comparison to all of the other stuff you're working on (that might be above my paygrade). Roping in blassey.
Flags: needinfo?(mconley) → needinfo?(blassey.bugs)
Assignee | ||
Comment 7•7 years ago
|
||
So telemetry indicates that 0.8% of tab switches hit a spinner due to this. That seems pretty significant. Are you looking to uplift these fixes?
Flags: needinfo?(mconley)
Reporter | ||
Comment 8•7 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #7) > So telemetry indicates that 0.8% of tab switches hit a spinner due to this. > That seems pretty significant. Are you looking to uplift these fixes? I guess it depends on how extensive the changes are. Tab switch spinners are bad. Under-tested GMP support may be worse - but I might let Product make that call if it comes to it.
Flags: needinfo?(mconley)
Updated•7 years ago
|
Rank: 25
Priority: -- → P2
Assignee | ||
Comment 9•7 years ago
|
||
I reckon this must be caused by the HasGMPFor() calls in GMPDecoderModule::UpdateUsableCodecs(). All other places that call into GetGMPPluginVersionForAPI come from API calls from EME and WebRTC, and I struggle to believe they're common enough to show up this frequently. GMPDecoderModule::UpdateUsableCodecs() is called every time a GMP is added or removed to/from the GMPService. Which happens on startup a few times, or when a new GMP update downloads. GMPDecoderModule::UpdateUsableCodecs() supports audio/video decoding via GMPs, which is a feature we have preffed off, and I was planning on removing after we drop XP support. A simple way to test my theory would be to move the body of GMPDecoderModule::UpdateUsableCodecs() behind a check of media.gmp.decoder.enabled==true. Or even just not send the PContent.NotifyGMPsChanged message. That would be a simple patch that would be safe to uplift to beta.
Priority: P2 → P1
Assignee | ||
Comment 10•7 years ago
|
||
We need gmp-changed for MediaKeySystemAccessManager::Observe() so that CDM requests awaiting a download to complete resolve. So preffing out the body of GMPDecoderModule::UpdateUsableCodecs() seems like it.
Assignee | ||
Comment 11•7 years ago
|
||
Gah, blanking out GMPDecoderModule::UpdateUsableCodecs() if media.gmp.decoder.enabled is off won't work as EMEDecoderModule::SupportsMimeType() relies on the structure that GMPDecoderModule::UpdateUsableCodecs() populates.
Updated•7 years ago
|
Flags: needinfo?(blassey.bugs)
Chris, would bug 1313200 help you here? I haven't followed the recent comments, but I thought I'd ask. If it would help, I can bump the priority of it.
Flags: needinfo?(cpearce)
Reporter | ||
Comment 13•7 years ago
|
||
With bug 1312597 landed, the #2 top frame set I mentioned in comment 4 has been split apart into smaller, more actionable bits. Doing so has also moved PContent::Msg_GetGMPPluginVersionForAPI into the #2 spot.
Assignee | ||
Comment 14•7 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #12) > Chris, would bug 1313200 help you here? I haven't followed the recent > comments, but I thought I'd ask. If it would help, I can bump the priority > of it. I don't think rushing that would necessarily help (though doing it still sounds like a good idea!), as the API here needs to be synchronous and some call sites run from JS.
Flags: needinfo?(cpearce)
Assignee | ||
Comment 15•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6d4c0d75ac86
Assignee | ||
Comment 16•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d5c2d804fb96
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Updated•7 years ago
|
Assignee: nobody → cpearce
Comment 21•7 years ago
|
||
mozreview-review |
Comment on attachment 8806257 [details] Bug 1312540 - Maintain a cache of GMPs capabilities in content processes. https://reviewboard.mozilla.org/r/89762/#review89214 ::: dom/ipc/ContentChild.h:1 (Diff revision 1) > /* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */ In the commit description: 'subseuent' -> 'subsequent'. ::: dom/ipc/PContent.ipdl:340 (Diff revision 1) > nsCString url; > PBlob blob; > Principal principal; > }; > > +struct GMPAPITags { Style: '{' on new line. ::: dom/ipc/PContent.ipdl:345 (Diff revision 1) > +struct GMPAPITags { > + nsCString api; > + nsCString[] tags; > +}; > + > +struct GMPCapabilityData { Style: '{' on new line. ::: dom/media/gmp/GMPServiceChild.cpp:136 (Diff revision 1) > } > > +typedef mozilla::dom::GMPCapabilityData GMPCapabilityData; > +typedef mozilla::dom::GMPAPITags GMPAPITags; > + > +struct GMPCapabilityAndVersion { Style: '{' on new line. ::: dom/media/gmp/GMPServiceChild.cpp:137 (Diff revision 1) > > +typedef mozilla::dom::GMPCapabilityData GMPCapabilityData; > +typedef mozilla::dom::GMPAPITags GMPAPITags; > + > +struct GMPCapabilityAndVersion { > + Unnecessary empty line. ::: dom/media/gmp/GMPServiceChild.cpp:152 (Diff revision 1) > + } > + mCapabilities.AppendElement(Move(cap)); > + } > + } > + > + nsCString ToString() const { Style: '{' on new line. ::: dom/media/gmp/GMPServiceParent.cpp:877 (Diff revision 1) > + bool found = false; > + for (const GMPCapabilityData& cap : caps) { > + if (cap.name().Equals(name)) { > + found = true; > + } > + } > + if (found) { > + continue; > + } No need for 'found' here, you could simply do: for (...) { if (...) { continue; } } ::: dom/media/gmp/GMPServiceParent.cpp:889 (Diff revision 1) > + continue; > + } > + GMPCapabilityData x; > + x.name() = name; > + x.version() = gmp->GetVersion(); > + for (const GMPCapability& tag: gmp->GetCapabilities()) { Style: Add space before ':'. ::: dom/media/gmp/GMPServiceParent.cpp:895 (Diff revision 1) > + x.capabilities().AppendElement(GMPAPITags(tag.mAPIName, tag.mAPITags)); > + } > + caps.AppendElement(Move(x)); > + } > + } > + nsTArray<ContentParent*> children; Children are parents? :-P
Attachment #8806257 -
Flags: review?(gsquelart) → review+
Comment 22•7 years ago
|
||
mozreview-review |
Comment on attachment 8806258 [details] Bug 1312540 - Simplify behaviour of GMP::GetPluginVersionForAPI. https://reviewboard.mozilla.org/r/89764/#review89228
Attachment #8806258 -
Flags: review?(gsquelart) → review+
Comment 23•7 years ago
|
||
mozreview-review |
Comment on attachment 8806259 [details] Bug 1312540 - Use GMP caps cached in content process for GetPluginVersion. https://reviewboard.mozilla.org/r/89766/#review89232 ::: dom/media/gmp/GMPParent.h:121 (Diff revision 1) > void Shutdown(); > > // This must not be called while we're in the middle of abnormal ActorDestroy > void DeleteProcess(); > > bool SupportsAPI(const nsCString& aAPI, const nsCString& aTag); You should remove this orphaned declaration. ::: dom/media/gmp/GMPParent.cpp:587 (Diff revision 1) > +bool > +GMPCapability::Supports(const nsTArray<GMPCapability>& aCapabilities, > + const nsCString& aAPI, > + const nsCString& aTag) > +{ > + for (uint32_t i = 0; i < aCapabilities.Length(); i++) { I think you could use a range-for here and for the next look, but it may be too much of a tangent for this patch, so up to you if you'd prefer not to make this change now.
Attachment #8806259 -
Flags: review?(gsquelart) → review+
Reporter | ||
Comment 24•7 years ago
|
||
mozreview-review |
Comment on attachment 8806260 [details] Bug 1312540 - Remove GetGMPPluginVersionForAPI IPC. https://reviewboard.mozilla.org/r/89768/#review89350 LGTM - though a heads up that your try build has gone red.
Attachment #8806260 -
Flags: review?(mconley) → review+
Comment 25•7 years ago
|
||
mozreview-review |
Comment on attachment 8806257 [details] Bug 1312540 - Maintain a cache of GMPs capabilities in content processes. https://reviewboard.mozilla.org/r/89762/#review89344 ::: dom/ipc/ContentParent.cpp:1418 (Diff revision 1) > mGatherer = static_cast<ProfileGatherer*>(gatherer.get()); > > StartProfiler(currentProfilerParams); > } > #endif > + RefPtr<GeckoMediaPluginServiceParent> gmps(GeckoMediaPluginServiceParent::GetSingleton()); Please put a newline before this. ::: dom/media/gmp/GMPServiceChild.cpp:185 (Diff revision 1) > + > +static nsCString > +ToString(const nsTArray<GMPCapabilityData>& aCapabilities) > +{ > + nsCString s; > + for (const GMPCapabilityAndVersion& gmp : *sGMPCapabilities) { You're iterating over sGMPCapabilities and never using aCapabilities. Pretty confusing when reading the code. ::: dom/media/gmp/GMPServiceParent.cpp:859 (Diff revision 1) > +void > +GeckoMediaPluginServiceParent::UpdateContentProcessGMPCapabilities() > +{ > + if (!NS_IsMainThread()) { > + RefPtr<GeckoMediaPluginServiceParent> self(this); > + NS_DispatchToMainThread(NS_NewRunnableFunction([self]() -> void { Any reason not to use NewRunnableMethod here? ::: dom/media/gmp/GMPServiceParent.cpp:872 (Diff revision 1) > + typedef mozilla::dom::ContentParent ContentParent; > + > + nsTArray<GMPCapabilityData> caps; > + { > + MutexAutoLock lock(mMutex); > + for (const RefPtr<GMPParent>& gmp : mPlugins) { How many plugins are we likely to see here? I'm a little worried about the n^2 algorithm. ::: dom/media/gmp/GMPServiceParent.cpp:897 (Diff revision 1) > + caps.AppendElement(Move(x)); > + } > + } > + nsTArray<ContentParent*> children; > + ContentParent::GetAll(children); > + for (ContentParent* child : children) { Slightly nicer to do: for (auto* cp : ContentParent::AllProcesses(ContentParent::eLive))
Attachment #8806257 -
Flags: review?(wmccloskey) → review+
Assignee | ||
Comment 26•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8806257 [details] Bug 1312540 - Maintain a cache of GMPs capabilities in content processes. https://reviewboard.mozilla.org/r/89762/#review89214 > No need for 'found' here, you could simply do: > for (...) { > if (...) { > continue; > } > } That's a double for loop. So a continue in the inner loop won't cause the outer loop to continue, and its the outer loop we want to continue. So I need to 'break' out of the inner loop and 'continue' in the outer loop. So I'll add a break here. > Children are parents? :-P Confusingly, yes. These are the parent side actors of the parent<->child relationship of the chrome process being the parent of the child content processes. So our connection to the child process in the chrome process happens over the "parent" actors.
(In reply to Chris Pearce (:cpearce) from comment #26) > > Children are parents? :-P > > Confusingly, yes. These are the parent side actors of the parent<->child > relationship of the chrome process being the parent of the child content > processes. So our connection to the child process in the chrome process > happens over the "parent" actors. By this reasoning we would be justified in calling ContentChild instances "parents" because they are the child side reflections of ContentParents! Standard practice is to call ContentParent instances "parents" and ContentChild instances "children".
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 34•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8806257 [details] Bug 1312540 - Maintain a cache of GMPs capabilities in content processes. https://reviewboard.mozilla.org/r/89762/#review89214 > That's a double for loop. So a continue in the inner loop won't cause the outer loop to continue, and its the outer loop we want to continue. So I need to 'break' out of the inner loop and 'continue' in the outer loop. So I'll add a break here. Oops, sorry for giving a wrong advice, this is embarrassing. Yes, a 'break' in the inner loop would help. Or you could 'goto' the bottom of the outer loop block! :-P Or use std::none_of. (I've been meaning to use <algorithm>, but never got the courage to start.) But I'm digging myself deeper, ignore me.
Comment 35•7 years ago
|
||
Pushed by cpearce@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/97147b709fd8 Maintain a cache of GMPs capabilities in content processes. r=billm,gerald https://hg.mozilla.org/integration/autoland/rev/8d38f007c911 Simplify behaviour of GMP::GetPluginVersionForAPI. r=gerald https://hg.mozilla.org/integration/autoland/rev/255aac711339 Use GMP caps cached in content process for GetPluginVersion. r=gerald https://hg.mozilla.org/integration/autoland/rev/c6e7c38bd386 Remove GetGMPPluginVersionForAPI IPC. r=mconley
Assignee | ||
Comment 36•7 years ago
|
||
I found a minor regression in the EME UI introduced by these patches. Will fix in bug 1314797.
Comment 37•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/97147b709fd8 https://hg.mozilla.org/mozilla-central/rev/8d38f007c911 https://hg.mozilla.org/mozilla-central/rev/255aac711339 https://hg.mozilla.org/mozilla-central/rev/c6e7c38bd386
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Assignee | ||
Comment 38•7 years ago
|
||
Mike: When will we get, or do we now have, data confirming that tab switches have improved due to the work in this bug?
Flags: needinfo?(mconley)
Reporter | ||
Comment 39•7 years ago
|
||
The dashboard I've been monitoring is https://chutten.github.io/bug1310250/. It's still a bit early to draw any conclusions yet - not enough data has come in just yet. We'll need a few more days for there to be enough "saturation", I think.
Flags: needinfo?(mconley)
Assignee | ||
Comment 40•7 years ago
|
||
Beta try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b3f64b6f223d05f07333569a85fc8aa47ddf3bfb
Assignee | ||
Comment 41•7 years ago
|
||
Comment on attachment 8806257 [details] Bug 1312540 - Maintain a cache of GMPs capabilities in content processes. Approval Request Comment [Feature/regressing bug #]: EME/GMP decoding [User impact if declined]: We think that 0.4% of all tab switches get jank because of a problem fixed in this patch. So if we don't uplift this, we'll continue to get jank. [Describe test coverage new/current, TreeHerder]: We have lots of tests that depend on the code I change in this patch. [Risks and why]: Low; it would be quite obvious if this code didn't work! [String/UUID change made/needed]: None.
Attachment #8806257 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 42•7 years ago
|
||
Comment on attachment 8806258 [details] Bug 1312540 - Simplify behaviour of GMP::GetPluginVersionForAPI. Approval Request Comment [Feature/regressing bug #]: EME/GMP decoding [User impact if declined]: We think that 0.4% of all tab switches get jank because of a problem fixed in this patch. So if we don't uplift this, we'll continue to get jank. [Describe test coverage new/current, TreeHerder]: We have lots of tests that depend on the code I change in this patch. [Risks and why]: Low; it would be quite obvious if this code didn't work! [String/UUID change made/needed]: None.
Attachment #8806258 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 43•7 years ago
|
||
Comment on attachment 8806259 [details] Bug 1312540 - Use GMP caps cached in content process for GetPluginVersion. Approval Request Comment [Feature/regressing bug #]: EME/GMP decoding [User impact if declined]: We think that 0.4% of all tab switches get jank because of a problem fixed in this patch. So if we don't uplift this, we'll continue to get jank. [Describe test coverage new/current, TreeHerder]: We have lots of tests that depend on the code I change in this patch. [Risks and why]: Low; it would be quite obvious if this code didn't work! [String/UUID change made/needed]: None.
Attachment #8806259 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 44•7 years ago
|
||
Comment on attachment 8806260 [details] Bug 1312540 - Remove GetGMPPluginVersionForAPI IPC. Approval Request Comment [Feature/regressing bug #]: EME/GMP decoding [User impact if declined]: We think that 0.4% of all tab switches get jank because of a problem fixed in this patch. So if we don't uplift this, we'll continue to get jank. [Describe test coverage new/current, TreeHerder]: We have lots of tests that depend on the code I change in this patch. [Risks and why]: Low; it would be quite obvious if this code didn't work! [String/UUID change made/needed]: None.
Attachment #8806260 -
Flags: approval-mozilla-beta?
Comment 45•7 years ago
|
||
Comment on attachment 8806257 [details] Bug 1312540 - Maintain a cache of GMPs capabilities in content processes. After discussing with cpearce, we decide to let this ride the train on 52.
Attachment #8806257 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Updated•7 years ago
|
Attachment #8806258 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Updated•7 years ago
|
Attachment #8806259 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Updated•7 years ago
|
Attachment #8806260 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in
before you can comment on or make changes to this bug.
Description
•