Note: There are a few cases of duplicates in user autocompletion which are being worked on.

Investigate missing BHR stats from e10s child process

RESOLVED FIXED in Firefox 45

Status

()

Toolkit
Telemetry
P1
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: vladan, Assigned: vladan)

Tracking

(Blocks: 3 bugs)

Trunk
mozilla46
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(e10s+, firefox45 fixed, firefox46 fixed)

Details

Attachments

(2 attachments)

(Assignee)

Description

2 years ago
I can't generate BHR child process hang stats in my local Nightly or Aurora and poiru is not seeing them server-side either https://bugzilla.mozilla.org/show_bug.cgi?id=1222972#c1
(Assignee)

Updated

2 years ago
Blocks: 1222972
(Assignee)

Updated

2 years ago
Blocks: 1182637
The e10s threadHangStats (under payload/childPayloads) looks like this:

  [{u'activity': {u'bucket_count': 33,
                  u'histogram_type': 0,
                  u'log_sum': 0,
                  u'log_sum_squares': 0,
                  u'range': [1, 0],
                  u'sum': 0,
                  u'values': {}},
    u'hangs': {},
    u'name': u''},
   {u'activity': {u'bucket_count': 33,
                  u'histogram_type': 0,
                  u'log_sum': 0,
                  u'log_sum_squares': 0,
                  u'range': [1, 0],
                  u'sum': 0,
                  u'values': {u'0': 0, u'1': 15, u'3': 0}},
    u'hangs': {},
    u'name': u'ImageBridgeChild'},
   {u'activity': {u'bucket_count': 33,
                  u'histogram_type': 0,
                  u'log_sum': 0,
                  u'log_sum_squares': 0,
                  u'range': [1, 0],
                  u'sum': 0,
                  u'values': {u'0': 0, u'1': 13, u'3': 0}},
    u'hangs': {},
    u'name': u'ProcessHangMonitor'}]

So some data is collected about the ProcessHangMonitor and ImageBridgeChild threads, but the main thread seems to be missing. There is also a mysterious unnamed thread.
(In reply to Birunthan Mohanathas [:poiru] from comment #1)
> The e10s threadHangStats (under payload/childPayloads) looks like this:
...
> So some data is collected about the ProcessHangMonitor and ImageBridgeChild
> threads, but the main thread seems to be missing. There is also a mysterious
> unnamed thread.

Is this valid for the majority of e10s pings?
The unnamed thread in comment 2 is actually the content process main thread. It's unnamed because this is never executed on the content process: https://dxr.mozilla.org/mozilla-central/rev/74c7941a9e22d50057800771ebae07f69deecc9f/xpcom/build/XPCOMInit.cpp#536-540

(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #2)
> Is this valid for the majority of e10s pings?

I checked the e10s-enabled Nightly pings that were submitted yesterday (with `fraction=0.1`) and none of them had hang stacks under childPayloads.
(Assignee)

Updated

2 years ago
Blocks: 1229104

Updated

2 years ago
tracking-e10s: --- → +
(Assignee)

Comment 4

2 years ago
Created attachment 8701925 [details] [diff] [review]
Enable BHR and chromeHangs in content processes, draft

I tested that this works, but I feel like this approach might be overlooking hazards?
Attachment #8701925 - Flags: feedback?(nchen)
(Assignee)

Updated

2 years ago
Assignee: nobody → vladan.bugzilla
Comment on attachment 8701925 [details] [diff] [review]
Enable BHR and chromeHangs in content processes, draft

Review of attachment 8701925 [details] [diff] [review]:
-----------------------------------------------------------------

I think this would be fine. I don't think UI activity tracking in HangMonitor is relevant to content processes, but it looks harmless. Otherwise, I didn't see anything that wouldn't work in content processes.
Attachment #8701925 - Flags: feedback?(nchen) → feedback+
(Assignee)

Comment 6

2 years ago
Comment on attachment 8701925 [details] [diff] [review]
Enable BHR and chromeHangs in content processes, draft

Approval Request Comment
[Feature/regressing bug #]: Measurement missing since e10s became a thing
[User impact if declined]: We won't get this BHR responsiveness measurement from the e10s content process, preventing us from accurately comparing e10s vs non-e10s performance in our Beta 44 experiment, needed for making a recommendation for shipping or not shipping e10s by default in Release 45
[Describe test coverage new/current, TreeHerder]: small change to measurement code, manual local testing
[Risks and why]: Seems very low. The change is small + it just adds a measurement that is missing in the content process
[String/UUID change made/needed]: None
Attachment #8701925 - Flags: approval-mozilla-beta?
Attachment #8701925 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 7

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/6daddfb64d72

Comment 8

2 years ago
Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3505bec8eac5
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/3505bec8eac5 for timing out in test_NuwaProcessCreation.html on B2G ICS emulator debug:

backout job: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=3505bec8eac5
first failing job: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=6daddfb64d72

Log: https://treeherder.mozilla.org/logviewer.html#?job_id=19022879&repo=mozilla-inbound

19:08:59     INFO -  99 INFO TEST-START | dom/ipc/tests/test_NuwaProcessCreation.html
19:09:49     INFO -  ###################################### forms.js loaded
19:09:49     INFO -  ############################### browserElementPanning.js loaded
19:09:49     INFO -  ###################################### BrowserElementCopyPaste.js loaded
19:09:50     INFO -  ######################## BrowserElementChildPreload.js loaded
19:16:16     INFO -  100 INFO Set up preferences for testing the Nuwa process.
19:16:16     INFO -  101 INFO Shut down processes by disabling process prelaunch
19:16:16     INFO -  102 INFO Launch the Nuwa process
19:16:16     INFO -  103 INFO TEST-UNEXPECTED-FAIL | dom/ipc/tests/test_NuwaProcessCreation.html | Test timed out.
19:16:16     INFO -      reportError@SimpleTest/TestRunner.js:114:7
19:16:16     INFO -      TestRunner._checkForHangs@SimpleTest/TestRunner.js:134:7
19:16:17     INFO -  MEMORY STAT | vsize 153MB | residentFast 70MB | heapAllocated 20MB
19:16:18     INFO -  104 INFO TEST-OK | dom/ipc/tests/test_NuwaProcessCreation.html | took 438607ms
BackgroundHangMonitor needs to add https://dxr.mozilla.org/mozilla-central/source/widget/gonk/GonkMemoryPressureMonitoring.cpp#26-28 and https://dxr.mozilla.org/mozilla-central/source/widget/gonk/GonkMemoryPressureMonitoring.cpp#122-126 for b2g.
Comment on attachment 8701925 [details] [diff] [review]
Enable BHR and chromeHangs in content processes, draft

Please renominate once the reasons for backout are addressed. Thanks!
Attachment #8701925 - Flags: approval-mozilla-beta?
Attachment #8701925 - Flags: approval-mozilla-aurora?

Comment 12

2 years ago
adding this to e10s m8 for tracking purposes. we are blocked on this waiting on good data for perf bug 1182637.
tracking-e10s: + → m8+
Blocks: 1063169
tracking-e10s: m8+ → +
Priority: -- → P1
Created attachment 8708235 [details] [diff] [review]
patch v2

Addressed failed Nuwa test cases. Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f4b4940a25f4

Comment 14

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/28731460915c
(Assignee)

Comment 15

2 years ago
Re-landed with Ting-Yu's fix. Thank you!
(Assignee)

Comment 16

2 years ago
Comment on attachment 8708235 [details] [diff] [review]
patch v2

Approval Request Comment
[Feature/regressing bug #]: none
[User impact if declined]: we won't get good performance (BHR) data from the Beta 45 A/B experiment
[Describe test coverage new/current, TreeHerder]: no issues on TreeHerder, tested locally
[Risks and why]: not much, as far as I can tell. the overhead of content-process BHR monitoring is minimal and the equivalent monitoring is already done in non-e10s
[String/UUID change made/needed]: none
Attachment #8708235 - Flags: approval-mozilla-aurora?
(Assignee)

Updated

2 years ago
Blocks: 1240648
status-firefox46: --- → affected
https://hg.mozilla.org/mozilla-central/rev/28731460915c
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox46: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Whiteboard: [e10s-45-uplift]
(Assignee)

Updated

2 years ago
Blocks: 1241336
Comment on attachment 8708235 [details] [diff] [review]
patch v2

Better metrics, taking it.
Attachment #8708235 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Comment 19

2 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/30d486237af3
status-firefox45: affected → fixed
Whiteboard: [e10s-45-uplift]
You need to log in before you can comment on or make changes to this bug.