Closed Bug 1373892 Opened 7 years ago Closed 7 years ago

Telemetry ping blocks main process for 120 seconds

Categories

(Toolkit :: Telemetry, defect)

x86_64
Windows 10
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1369594

People

(Reporter: bugzilla.mozilla.org, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

187.07 KB, image/png
Details
Attached image ping-profile.png (obsolete) —
I'm occasionally experiencing hangs lasting more than a minute. The profiler points at telemetry being responsible.

Attached: A screenshot of the profiler output.

And a stack trace of one of the functions taking up significant time: https://pastebin.mozilla.org/9025046
According to inverted call stacks 35% of the time is spent in NtQueryAttributesFile, 18% in NtReadVirtualMemory and 5% in NtQueryVirtualMemory.
I guess it's examining the virtual memory layout of all processes and there's some inefficiency in that.
This is jank resulting from loading module information from `Telemetry::GetStackAndModules()`, which we track in bug 1363453.

Note that this is limited to users who are on pre-release channels or turned on extended Telemetry, so release users are usually not affected.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Attached image telemetry2.png
(In reply to The 8472 from comment #0)
> According to inverted call stacks 35% of the time is spent in
> NtQueryAttributesFile, 18% in NtReadVirtualMemory and 5% in
> NtQueryVirtualMemory.

Bug 1369594 fixed the first two, but since they no longer occur NtQueryVirtualMemory is now the dominant leaf method and makes up 20% of the hang time all by itself.

Additionally garbage collection - apparently triggered among other things by the structured clone decoding - takes up another ~20%.
Attachment #8878743 - Attachment is obsolete: true
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
The main problem visible in the previous screenshot was fixed in bug 1369594.
It would be easier to open a new bug for remaining problems.

Uploading & linking a profile would be helpful for diagnosing.
Thanks.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → DUPLICATE
My bug report does not solely consist of a screenshot, it also contains text. 

> and 5% in NtQueryVirtualMemory.

This part was NOT fixed and made up a multi-second fraction of the hang, therefore the issue as a whole is not fixed either. Why do you insist on closing it?
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
> Uploading & linking a profile would be helpful for diagnosing.

The profiler lacks an option to anonymize the profile, until it has that a screenshot will have to do due to privacy reasons. If you need to know additional stats I can provide them of course.
(In reply to The 8472 from comment #4)
> This part was NOT fixed and made up a multi-second fraction of the hang,
> therefore the issue as a whole is not fixed either. Why do you insist on
> closing it?

Because:
- as far as i can tell, the other issue was the main one
- breaking separate issues out into separate bugs makes it easier for us to talk about and organize them

Opening a separate bug doesn't mean we are not looking at this issue.

FWIW, bug 1340134 moves gatherMemory() to idle times; i assume that addresses the remaining issue.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: