Telemetry memory measurements cause frequent hangs in main and content processes

RESOLVED FIXED in Firefox 60

Status

()

defect
P1
normal
RESOLVED FIXED
Last year
Last year

People

(Reporter: philipp, Assigned: chutten)

Tracking

({perf, regression})

Trunk
mozilla61
x86_64
Windows 10
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox58 unaffected, firefox59 unaffected, firefox60 fixed, firefox61 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

During video playback in the browser i notice slight hangs in the video every now and then.

I took a profile and it appears to be related to telemetry gathering memory information: https://perfht.ml/2oeUGuU
Hey :phillip, looks like the link cannot be opened correctly :(

"Error: Unserializing the profile failed: Error: Unable to parse a processed profile of version 10 - are you running an outdated version of perf.html? The most recent version understood by this version of perf.html is version 9."

Any chance you can provide another one?
Flags: needinfo?(madperson)
strangely i had issues loading that url at the time when you've posted too - but it seems to work again now. could you try again?
Flags: needinfo?(madperson)
(In reply to [:philipp] from comment #2)
> strangely i had issues loading that url at the time when you've posted too -
> but it seems to work again now. could you try again?

It works now, thanks!
Was this on Windows 8.1 or above, x64?

If so, it's sounding suspiciously like bug 1436914...
Flags: needinfo?(madperson)
yes, windows 10 & a 64bit build of firefox
Flags: needinfo?(madperson)
If bug 1436914 disables vsize reporting on winx64, I expect this'll go away.
Depends on: 1436914
Priority: -- → P3
Chris, bugs-marked-as-regression triagers were curious what the plans are with landing the patches in bug 1436914.
Flags: needinfo?(chutten)
Hrm. According to bug 1436914 comment #24 this might not be a regression as much as it is noticing that vsize max contiguous is rather slow.

The plans for bug 1436914 are stalled. Though the patch has been reviewed, doubt has been cast on its appropriateness. 

I'll try again.
Flags: needinfo?(chutten)
I'm changing the bug's title - while the hangs are most apparent during media playback, they are also noticeable in other situations like tab switching. this manifests itself in 500-1500ms processing delays in the main browser process and all content processes that reappear every minute or so.

unfortunately turning off submission of telemetry data also won't help, as the measurements are still taken.

this is trivial to reproduce on 64bit windows 10 in a new profile:
- install the profiler addon from https://perf-html.io/ & turn on logging
- open a handful of websites and let the browser sit idle for 5 minutes
- capture the profile

another profile i took with these STR: https://perfht.ml/2pmbbWA
Has Regression Range: --- → yes
Has STR: --- → yes
OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64
Summary: Hangs during video playback → Telemetry memory measurements cause frequent hangs in main and content processes
For what it's worth these hangs are pre-release only. We don't ask for any of the memory measures (except GHOST_WINDOWS) on release.
Component: Telemetry → XPCOM
Product: Toolkit → Core
ok, if this problem won't filter through to release it won't need tracking, i suppose. 
would be nice if it gets fixed nevertheless :))
(In reply to Chris H-C :chutten from comment #6)
> If bug 1436914 disables vsize reporting on winx64, I expect this'll go away.

It looks like this is vsize-max-contiguous, I think disabling that on winx64 [1] is non-controversial. Disabling vsize is a different conversation.

I'm moving this back to the telemetry component as we can't really make the measurement faster, but we can just avoid the issue by bypassing on x86_64 in the telemetry code.

[1] https://searchfox.org/mozilla-central/rev/3abf6fa7e2a6d9a7bfb88796141b0f012e68c2db/toolkit/components/telemetry/TelemetrySession.jsm#1103
Component: XPCOM → Telemetry
No longer depends on: 1436914
Product: Core → Toolkit
See Also: → 1436914
Assignee: nobody → chutten
Status: NEW → ASSIGNED
Priority: P3 → P1
r?erahm
Flags: needinfo?(erahm)
Comment on attachment 8960291 [details] [diff] [review]
0001-bug-1439222-Stop-recording-VSIZE_MAX_CONTIGUOUS-on-W.patch

Review of attachment 8960291 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks!
Attachment #8960291 - Flags: review+
Flags: needinfo?(erahm)
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/37d5fb897e8b
Stop recording VSIZE_MAX_CONTIGUOUS on Win64. r=erahm
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/37d5fb897e8b
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Sounds like this may want an approval request for Beta uplift?
Flags: needinfo?(chutten)
It is a small, self-contained change with perf impact... so, sure. Beta deserves this improvement early. (Once again: release is unaffected)
Flags: needinfo?(chutten)
Comment on attachment 8960546 [details] [diff] [review]
0001-bug-1439222-Stop-recording-VSIZE_MAX_CONTIGUOUS-on-W.patch

Approval Request Comment
[Feature/Bug causing the regression]: No specific bug, just increasing Win64 population size
[User impact if declined]: Win64 Beta users may experience main-thread hangs
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Yes. My Nightly on Win64 shows MEMORY_VSIZE but not MEMORY_VSIZE_MAX_CONTIGUOUS
[Needs manual test from QE? If yes, steps to reproduce]:  Should be alright without them, but the steps would be 1) Open about:telemetry 2) Search for vsize 3) MEMORY_VSIZE_MAX_CONTIGUOUS should only appear on Win32 on pre-release channels
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: Small, self-contained, uses well-understood constants
[String changes made/needed]: None
Attachment #8960546 - Flags: approval-mozilla-beta?
Comment on attachment 8960546 [details] [diff] [review]
0001-bug-1439222-Stop-recording-VSIZE_MAX_CONTIGUOUS-on-W.patch

win64 telemetry change for beta60
Attachment #8960546 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.