Closed Bug 1228437 Opened 10 years ago Closed 10 years ago

Investigate missing BHR stats from e10s child process

Categories

(Toolkit :: Telemetry, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla46
Tracking Status
e10s + ---
firefox45 --- fixed
firefox46 --- fixed

People

(Reporter: vladan, Assigned: vladan)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

I can't generate BHR child process hang stats in my local Nightly or Aurora and poiru is not seeing them server-side either https://bugzilla.mozilla.org/show_bug.cgi?id=1222972#c1
Blocks: 1222972
Blocks: 1182637
The e10s threadHangStats (under payload/childPayloads) looks like this: [{u'activity': {u'bucket_count': 33, u'histogram_type': 0, u'log_sum': 0, u'log_sum_squares': 0, u'range': [1, 0], u'sum': 0, u'values': {}}, u'hangs': {}, u'name': u''}, {u'activity': {u'bucket_count': 33, u'histogram_type': 0, u'log_sum': 0, u'log_sum_squares': 0, u'range': [1, 0], u'sum': 0, u'values': {u'0': 0, u'1': 15, u'3': 0}}, u'hangs': {}, u'name': u'ImageBridgeChild'}, {u'activity': {u'bucket_count': 33, u'histogram_type': 0, u'log_sum': 0, u'log_sum_squares': 0, u'range': [1, 0], u'sum': 0, u'values': {u'0': 0, u'1': 13, u'3': 0}}, u'hangs': {}, u'name': u'ProcessHangMonitor'}] So some data is collected about the ProcessHangMonitor and ImageBridgeChild threads, but the main thread seems to be missing. There is also a mysterious unnamed thread.
(In reply to Birunthan Mohanathas [:poiru] from comment #1) > The e10s threadHangStats (under payload/childPayloads) looks like this: ... > So some data is collected about the ProcessHangMonitor and ImageBridgeChild > threads, but the main thread seems to be missing. There is also a mysterious > unnamed thread. Is this valid for the majority of e10s pings?
The unnamed thread in comment 2 is actually the content process main thread. It's unnamed because this is never executed on the content process: https://dxr.mozilla.org/mozilla-central/rev/74c7941a9e22d50057800771ebae07f69deecc9f/xpcom/build/XPCOMInit.cpp#536-540 (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #2) > Is this valid for the majority of e10s pings? I checked the e10s-enabled Nightly pings that were submitted yesterday (with `fraction=0.1`) and none of them had hang stacks under childPayloads.
Blocks: 1229104
tracking-e10s: --- → +
I tested that this works, but I feel like this approach might be overlooking hazards?
Attachment #8701925 - Flags: feedback?(nchen)
Assignee: nobody → vladan.bugzilla
Comment on attachment 8701925 [details] [diff] [review] Enable BHR and chromeHangs in content processes, draft Review of attachment 8701925 [details] [diff] [review]: ----------------------------------------------------------------- I think this would be fine. I don't think UI activity tracking in HangMonitor is relevant to content processes, but it looks harmless. Otherwise, I didn't see anything that wouldn't work in content processes.
Attachment #8701925 - Flags: feedback?(nchen) → feedback+
Comment on attachment 8701925 [details] [diff] [review] Enable BHR and chromeHangs in content processes, draft Approval Request Comment [Feature/regressing bug #]: Measurement missing since e10s became a thing [User impact if declined]: We won't get this BHR responsiveness measurement from the e10s content process, preventing us from accurately comparing e10s vs non-e10s performance in our Beta 44 experiment, needed for making a recommendation for shipping or not shipping e10s by default in Release 45 [Describe test coverage new/current, TreeHerder]: small change to measurement code, manual local testing [Risks and why]: Seems very low. The change is small + it just adds a measurement that is missing in the content process [String/UUID change made/needed]: None
Attachment #8701925 - Flags: approval-mozilla-beta?
Attachment #8701925 - Flags: approval-mozilla-aurora?
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/3505bec8eac5 for timing out in test_NuwaProcessCreation.html on B2G ICS emulator debug: backout job: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=3505bec8eac5 first failing job: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=6daddfb64d72 Log: https://treeherder.mozilla.org/logviewer.html#?job_id=19022879&repo=mozilla-inbound 19:08:59 INFO - 99 INFO TEST-START | dom/ipc/tests/test_NuwaProcessCreation.html 19:09:49 INFO - ###################################### forms.js loaded 19:09:49 INFO - ############################### browserElementPanning.js loaded 19:09:49 INFO - ###################################### BrowserElementCopyPaste.js loaded 19:09:50 INFO - ######################## BrowserElementChildPreload.js loaded 19:16:16 INFO - 100 INFO Set up preferences for testing the Nuwa process. 19:16:16 INFO - 101 INFO Shut down processes by disabling process prelaunch 19:16:16 INFO - 102 INFO Launch the Nuwa process 19:16:16 INFO - 103 INFO TEST-UNEXPECTED-FAIL | dom/ipc/tests/test_NuwaProcessCreation.html | Test timed out. 19:16:16 INFO - reportError@SimpleTest/TestRunner.js:114:7 19:16:16 INFO - TestRunner._checkForHangs@SimpleTest/TestRunner.js:134:7 19:16:17 INFO - MEMORY STAT | vsize 153MB | residentFast 70MB | heapAllocated 20MB 19:16:18 INFO - 104 INFO TEST-OK | dom/ipc/tests/test_NuwaProcessCreation.html | took 438607ms
Comment on attachment 8701925 [details] [diff] [review] Enable BHR and chromeHangs in content processes, draft Please renominate once the reasons for backout are addressed. Thanks!
Attachment #8701925 - Flags: approval-mozilla-beta?
Attachment #8701925 - Flags: approval-mozilla-aurora?
adding this to e10s m8 for tracking purposes. we are blocked on this waiting on good data for perf bug 1182637.
Blocks: e10s-perf
Priority: -- → P1
Re-landed with Ting-Yu's fix. Thank you!
Comment on attachment 8708235 [details] [diff] [review] patch v2 Approval Request Comment [Feature/regressing bug #]: none [User impact if declined]: we won't get good performance (BHR) data from the Beta 45 A/B experiment [Describe test coverage new/current, TreeHerder]: no issues on TreeHerder, tested locally [Risks and why]: not much, as far as I can tell. the overhead of content-process BHR monitoring is minimal and the equivalent monitoring is already done in non-e10s [String/UUID change made/needed]: none
Attachment #8708235 - Flags: approval-mozilla-aurora?
Blocks: 1240648
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Whiteboard: [e10s-45-uplift]
Blocks: 1241336
Comment on attachment 8708235 [details] [diff] [review] patch v2 Better metrics, taking it.
Attachment #8708235 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [e10s-45-uplift]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: