Closed
Bug 1222972
Opened 10 years ago
Closed 8 years ago
Compare e10s vs non-e10s hang stacks from Telemetry pings
Categories
(Toolkit :: Telemetry, defect, P3)
RESOLVED
FIXED
| Tracking | Status | |
|---|---|---|
| e10s | + | --- |
People
(Reporter: birunthan, Unassigned)
For the initial version, we should find the top 10 e10s and non-e10s hang stacks.
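A minimal sketch of how such a top-10 list could be computed, in Python. The ping layout (a boolean `e10s` flag plus a `hangs` list of frame-name lists) and the name `top_hang_stacks` are assumptions made for illustration; the real Telemetry ping schema is not shown in this bug, and the sample pings are toy data.

```python
from collections import Counter


def top_hang_stacks(pings, n=10):
    """Return the n most common hang stacks for e10s and non-e10s pings.

    Hypothetical layout: each ping is a dict with a boolean "e10s" flag and
    a "hangs" list, where every hang is a list of pseudostack frame names.
    """
    counts = {True: Counter(), False: Counter()}
    for ping in pings:
        for stack in ping["hangs"]:
            # Tuples are hashable, so a whole stack can be used as a key.
            counts[bool(ping["e10s"])][tuple(stack)] += 1
    return {
        "e10s": counts[True].most_common(n),
        "non-e10s": counts[False].most_common(n),
    }


# Toy data, not real Telemetry pings.
pings = [
    {"e10s": True, "hangs": [["Startup::XRE_Main"],
                             ["Timer::Fire", "Startup::XRE_Main"]]},
    {"e10s": True, "hangs": [["Startup::XRE_Main"]]},
    {"e10s": False, "hangs": [["(content script)", "Startup::XRE_Main"]]},
]

for cohort, stacks in top_hang_stacks(pings).items():
    print(cohort, stacks)
```

Counting whole stacks as tuples keeps the two cohorts directly comparable; normalizing each count by the cohort's total number of hangs would turn the counts into percentages.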
Summary: Compare e10s vs non-e10s hang stacks from Telemetry crash pings → Compare e10s vs non-e10s hang stacks from Telemetry pings
Updated•10 years ago
tracking-e10s:
--- → +
Results: http://nbviewer.ipython.org/github/vitillo/e10s_analyses/blob/master/aurora/e10s_top_hang_stacks.ipynb
Some findings from Roberto:
- IPDL::PCookieService::RecvGetCookieString is present in 7% of the e10s stacks (and 0% in non-e10s). This seems like something we could and should fix.
- Startup::XRE_Main is the topmost frame for 33% of e10s stacks (and 18% for non-e10s). Perhaps we need to add more pseudostack markers?
- (content script) is the topmost frame in 13% of non-e10s and 0.1% of e10s stacks. That seems like a big win for e10s, but is this an expected result?
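The two kinds of figures quoted above ("present in N% of stacks" and "topmost frame in N% of stacks") could be computed along these lines. This is only a sketch: the function name `frame_share`, the innermost-first frame ordering, and the sample stacks are assumptions, not taken from the linked notebook.

```python
def frame_share(stacks, frame, topmost_only=False):
    """Percentage of stacks that contain `frame`.

    Assumes each stack is a list of frame names ordered innermost-first.
    With topmost_only=True, only the first (innermost) frame is checked.
    """
    if not stacks:
        return 0.0
    if topmost_only:
        hits = sum(1 for stack in stacks if stack and stack[0] == frame)
    else:
        hits = sum(1 for stack in stacks if frame in stack)
    return 100.0 * hits / len(stacks)


# Toy data, not real Telemetry results.
e10s_stacks = [
    ["IPDL::PCookieService::RecvGetCookieString", "Startup::XRE_Main"],
    ["Startup::XRE_Main"],
    ["Timer::Fire", "Startup::XRE_Main"],
    ["Startup::XRE_Main"],
]

print(frame_share(e10s_stacks, "IPDL::PCookieService::RecvGetCookieString"))
print(frame_share(e10s_stacks, "Startup::XRE_Main", topmost_only=True))
```

Running the same function over the e10s and non-e10s stack lists separately gives the paired percentages for each frame.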
Any thoughts?
Flags: needinfo?(jmathies)
Comment 2•10 years ago
This doesn't compare well with what I see here -
https://telemetry.mozilla.org/chromehangs/
This data is coming from nightly and doesn't account for e10s processes. It doesn't match your non-e10s data though, which has me wondering if this analysis is missing something.
For example, the top stack for non-e10s, with a ping count of 2570, is listed below. I would expect that to show up in our aurora data as well.
PR_WaitCondVar (in nss3.pdb)
mozilla::CondVar::Wait(unsigned int) (in xul.pdb)
mozilla::ipc::GeckoChildProcessHost::WaitUntilConnected(int) (in xul.pdb)
mozilla::plugins::PluginProcessParent::WaitUntilConnected(int) (in xul.pdb)
mozilla::plugins::PluginModuleChromeParent::LoadModule(char const *,unsigned int,nsPluginTag *) (in xul.pdb)
GetNewPluginLibrary(nsPluginTag *) (in xul.pdb)
nsPluginHost::EnsurePluginLoaded(nsPluginTag *) (in xul.pdb)
nsPluginHost::GetPluginForContentProcess(unsigned int,nsNPAPIPlugin * *) (in xul.pdb)
mozilla::plugins::SetupBridge(unsigned int,mozilla::dom::ContentParent *,bool,nsresult *,unsigned int *) (in xul.pdb)
mozilla::dom::ContentParent::RecvLoadPlugin(unsigned int const &,nsresult *,unsigned int *) (in xul.pdb)
mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const &,IPC::Message * &) (in xul.pdb)
mozilla::ipc::MessageChannel::DispatchSyncMessage(IPC::Message const &,IPC::Message * &) (in xul.pdb)
mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message const &) (in xul.pdb)
mozilla::ipc::MessageChannel::OnMaybeDequeueOne() (in xul.pdb)
MessageLoop::DoWork() (in xul.pdb)
Flags: needinfo?(jmathies)
Comment 3•10 years ago
If this is what we have, this data isn't actionable. For example:
- 43.2056% (43.2310%):
    Timer::Fire
    Startup::XRE_Main
- 33.5110% (18.4398%):
    Startup::XRE_Main
- 7.3378% (0.0041%):
    IPDL::PCookieService::RecvGetCookieString
    Startup::XRE_Main
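To see which of these stacks differ most between the two cohorts, the listing can be parsed and ranked by the gap between the two percentages. The sketch below assumes the first figure is the e10s share and the figure in parentheses is the non-e10s share, which is inferred from the 33%/18% and 7%/0% numbers quoted earlier in this bug rather than stated by the notebook; `parse_listing` is a hypothetical helper.

```python
import re

# Matches headers such as "- 33.5110% (18.4398%):".
HEADER = re.compile(r"^-\s*([\d.]+)%\s*\(([\d.]+)%\):\s*$")


def parse_listing(text):
    """Parse '- <e10s>% (<non-e10s>%):' headers followed by frame lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            entries.append({
                "e10s": float(match.group(1)),
                "non_e10s": float(match.group(2)),
                "frames": [],
            })
        elif entries:
            # Any other non-empty line is a frame of the current stack.
            entries[-1]["frames"].append(line)
    return entries


listing = """
- 43.2056% (43.2310%):
Timer::Fire
Startup::XRE_Main
- 33.5110% (18.4398%):
Startup::XRE_Main
- 7.3378% (0.0041%):
IPDL::PCookieService::RecvGetCookieString
Startup::XRE_Main
"""

entries = parse_listing(listing)
# Largest e10s-over-non-e10s gap first, in percentage points.
ranked = sorted(entries, key=lambda e: e["e10s"] - e["non_e10s"], reverse=True)
for entry in ranked:
    delta = entry["e10s"] - entry["non_e10s"]
    print(f"{delta:+.4f} pp  {' < '.join(entry['frames'])}")
```

Ranking by the difference puts the bare `Startup::XRE_Main` stack and the `RecvGetCookieString` stack ahead of the `Timer::Fire` stack, whose share is nearly identical in both cohorts.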
You might chat with vladan or bsmedberg; I think he wrote the original telemetry stack collection code.
(In reply to Jim Mathies [:jimm] from comment #2)
> This doesn't compare well with what I see here -
>
> https://telemetry.mozilla.org/chromehangs/
>
> This data is coming from nightly and doesn't account for e10s processes. It
> doesn't match your non-e10s data though, which has me wondering if this
> analysis is missing something.
This analysis is based on pseudostacks. It seems like the chromehangs analysis uses actual stacks.
Updated•10 years ago
Flags: needinfo?(vladan.bugzilla)
Comment 5•10 years ago
BHR uses pseudostack frames, while chrome hangs use real C++ stacks. We can do a chromehangs e10s comparison as well, but I think the bigger issue here is that BHR lacks coverage for non-gfx code. I filed bug 1224374 to expand its coverage (based on top chromehang stack frames).
Depends on: 1224374
Flags: needinfo?(vladan.bugzilla)
Comment 6•10 years ago
(In reply to Birunthan Mohanathas [:poiru] from comment #1)
> - (content script) is the topmost frame in 13% of non-e10s and 0.1% of e10s
> stacks. That seems like a big win for e10s, but is this an expected result?
There's an issue with missing child-process BHR stacks: bug 1228437
Updated•10 years ago
Blocks: e10s-responsiveness
Updated•10 years ago
Priority: -- → P3
Comment 7•8 years ago
This was done as part of the e10s analyses by dzeber and bmiroglio.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED