Closed Bug 1348113 Opened 7 years ago Closed 7 years ago

Collect telemetry on sync IPC triggered by JS

Categories

(Core :: DOM: Core & HTML, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: ehsan.akhgari, Assigned: nika)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 4 obsolete files)

See bug 1347425.  It's really bad that this bug was basically opaque to us so far.

We should see if we get the name of the sync IPC messages that the JS code uses down to the telemetry message in MessageChannel::Send() and use it as the name of the key for the probe.  If that doesn't work, perhaps we should just bit the bullet and add a separate probe for it.

Michael, can you take this one also?  Thanks!
Flags: needinfo?(michael)
From what I can tell by looking at the code (idl is sometimes hard to follow but I think I found the only implementer which doesn't forward to this implementation), this is the only implemener of sendRpcMessage and sendSyncMessage:

http://searchfox.org/mozilla-central/rev/ee7cfd05d7d9f83b017dd54c2052a2ec19cbd48b/dom/base/nsFrameMessageManager.cpp#558-582

In the content process case this ends up in this code:

http://searchfox.org/mozilla-central/rev/ee7cfd05d7d9f83b017dd54c2052a2ec19cbd48b/dom/base/nsFrameMessageManager.cpp#1822-1827

Which calls either ContentChild::SendSyncMessage or ContentChild::SendRpcMessage, which are IPDL implemented methods on PContentChild. As far as I can tell, there is no way in JS to send a sync message the other way (from chrome to content JS) using either sendSyncMessage or sendRpcMessage.

The easiest way to handle this would be to add another telemetry probe around http://searchfox.org/mozilla-central/rev/ee7cfd05d7d9f83b017dd54c2052a2ec19cbd48b/dom/base/nsFrameMessageManager.cpp#1822-1827, as we know the JS telemetry message in this situation. The less-easy way would be to add some special IPDL attribute which tells the code generator for SendSyncMessage or SendRpcMessage that it should somehow include the first argument when producing telemetry.

I'm inclined to take the option of adding another telemetry probe which we measure around the callsite in nsFrameMessageManager.cpp. What do you think ehsan?
Assignee: nobody → michael
Flags: needinfo?(michael) → needinfo?(ehsan)
Asking for data-review from bsmedberg, and review from billm. Asking for feedback from ehsan to make sure that this is a good patch.

AFAICT this is the simplest version of this patch. Trying to merge the telemetry together sounds fairly tricky.

MozReview-Commit-ID: 6ROx87BL18p
Attachment #8849309 - Flags: review?(wmccloskey)
Attachment #8849309 - Flags: review?(benjamin)
Attachment #8849309 - Flags: feedback?(ehsan)
I derped and didn't include one of the callsites in the original patch...

MozReview-Commit-ID: 6ROx87BL18p
Attachment #8849315 - Flags: review?(wmccloskey)
Attachment #8849315 - Flags: review?(benjamin)
Attachment #8849315 - Flags: feedback?(ehsan)
Attachment #8849309 - Attachment is obsolete: true
Attachment #8849309 - Flags: review?(wmccloskey)
Attachment #8849309 - Flags: review?(benjamin)
Attachment #8849309 - Flags: feedback?(ehsan)
Comment on attachment 8849315 [details] [diff] [review]
Add the IPC_SYNC_JS_LATENCY_MS probe to track JS sync IPC latency from sendSyncMessage/sendRpcMessage

Review of attachment 8849315 [details] [diff] [review]:
-----------------------------------------------------------------

Why not instrument here?
http://searchfox.org/mozilla-central/rev/557f236c19730116d3bf53c0deef36362cafafcd/dom/base/nsFrameMessageManager.cpp#614
It's just one site.

::: dom/base/nsFrameMessageManager.h
@@ +45,5 @@
>  namespace ipc {
>  
> +// Note: we round the time we spend to the nearest millisecond. So a min value
> +// of 1 ms actually captures from 500us and above.
> +static const uint32_t kMinTelemetrySyncJsIPCLatencyMs = 1;

This name is a little weird (Js and IPC?). Maybe kMinTelemetryMessageManagerSyncLatencyMs?
(In reply to Bill McCloskey (:billm) from comment #4)
> Why not instrument here?
> http://searchfox.org/mozilla-central/rev/
> 557f236c19730116d3bf53c0deef36362cafafcd/dom/base/nsFrameMessageManager.
> cpp#614
> It's just one site.

Uhmmm... That's a good question. Fixed :).

> This name is a little weird (Js and IPC?). Maybe
> kMinTelemetryMessageManagerSyncLatencyMs?

Fixed

MozReview-Commit-ID: 6ROx87BL18p
Attachment #8849326 - Flags: review?(wmccloskey)
Attachment #8849326 - Flags: review?(benjamin)
Attachment #8849326 - Flags: feedback?(ehsan)
Attachment #8849315 - Attachment is obsolete: true
Attachment #8849315 - Flags: review?(wmccloskey)
Attachment #8849315 - Flags: review?(benjamin)
Attachment #8849315 - Flags: feedback?(ehsan)
Comment on attachment 8849326 [details] [diff] [review]
Add the IPC_SYNC_JS_LATENCY_MS probe to track JS sync IPC latency from sendSyncMessage/sendRpcMessage

Review of attachment 8849326 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsFrameMessageManager.h
@@ +45,5 @@
>  namespace ipc {
>  
> +// Note: we round the time we spend to the nearest millisecond. So a min value
> +// of 1 ms actually captures from 500us and above.
> +static const uint32_t kMinTelemetryMessageManagerSyncLatencyMs = 1;

On second thought, how about kMinTelemetrySyncMessageManagerLatencyMs (to match the histogram name)?

::: toolkit/components/telemetry/Histograms.json
@@ +11048,5 @@
>      "n_buckets": 20,
>      "keyed": true,
>      "description": "Measures the number of milliseconds we spend waiting on the main thread for IPC messages to serialize their parameters. Note: only messages that take more than 500 microseconds are included in this probe. This probe is keyed on the IPDL message name."
> +  },
> +  "IPC_SYNC_JS_LATENCY_MS": {

I would rather this be IPC_SYNC_MESSAGE_MANAGER_LATENCY_MS. That's usually how we refer to these types of messages.

@@ +11057,5 @@
> +    "low": 32,
> +    "high": 750,
> +    "n_buckets": 40,
> +    "keyed": true,
> +    "description": "Measures the number of milliseconds we spend waiting for sync IPC messages sent by JavaScript through sendSyncMessage or sendRpcMessage to finish sending, keyed by message name. Note: only messages that wait for more than 500 microseconds are included in this probe."

Similar thing about mentioning the message manager rather than just "by JavaScript".
Attachment #8849326 - Flags: review?(wmccloskey) → review+
Attachment #8849326 - Attachment is obsolete: true
Attachment #8849326 - Flags: review?(benjamin)
Attachment #8849326 - Flags: feedback?(ehsan)
Comment on attachment 8849567 [details] [diff] [review]
Add the IPC_SYNC_JS_LATENCY_MS probe to track JS sync IPC latency from sendSyncMessage/sendRpcMessage

Review of attachment 8849567 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this looks great to me!
Attachment #8849567 - Flags: feedback?(ehsan) → feedback+
Harald, this bug is adding another sync IPC telemetry, this time for JS consumers.
Flags: needinfo?(ehsan)
Michael, this can't land as it is. Some addons spam out a ton of different message types, up to a thousand, which will break telemetry. See bug 1275707. You need to just do the same thing I did there and run StripChars(). Really, we should have a method that sanitizes the message name for telemetry.
Comment on attachment 8849567 [details] [diff] [review]
Add the IPC_SYNC_JS_LATENCY_MS probe to track JS sync IPC latency from sendSyncMessage/sendRpcMessage

data-r=me. I did not review the code.
Attachment #8849567 - Flags: review?(benjamin) → review+
(In reply to Andrew McCreight [:mccr8] from comment #10)
> Michael, this can't land as it is. Some addons spam out a ton of different
> message types, up to a thousand, which will break telemetry. See bug
> 1275707. You need to just do the same thing I did there and run
> StripChars(). Really, we should have a method that sanitizes the message
> name for telemetry.

NI?-ing myself so that I fix this before landing.
Flags: needinfo?(michael)
This patch should fix the issue you brought up mccr8.

MozReview-Commit-ID: 8j4lxT8UULT
Attachment #8852105 - Flags: review?(continuation)
Flags: needinfo?(michael)
Comment on attachment 8852105 [details] [diff] [review]
Part 2: Sanitize the message names before sending them to telemetry

Review of attachment 8852105 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.

::: dom/base/nsFrameMessageManager.cpp
@@ +669,5 @@
>    uint32_t latencyMs = round((TimeStamp::Now() - start).ToMilliseconds());
>    if (latencyMs >= kMinTelemetrySyncMessageManagerLatencyMs) {
> +    NS_ConvertUTF16toUTF8 messageName(aMessageName);
> +    // NOTE: We need to strip digit characters from the message name in order to
> +    // avoid an large number of buckets due to generated names from addons (such

"an large" -> "a large".

::: layout/base/nsILayoutHistoryState.h
@@ +1,2 @@
> +/*
> + * DO NOT EDIT.  THIS FILE IS GENERATED FROM ../../../dist/idl/nsILayoutHistoryState.idl

You shouldn't commit this file. ;)
Attachment #8852105 - Flags: review?(continuation) → review+
Attachment #8852105 - Attachment is obsolete: true
Pushed by michael@thelayzells.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/63782542588e
Part 1: Add the IPC_SYNC_JS_LATENCY_MS probe to track JS sync IPC latency from sendSyncMessage/sendRpcMessage, r=billm
https://hg.mozilla.org/integration/mozilla-inbound/rev/460c2bcc15d3
Part 2: Sanitize the message names before sending them to telemetry, r=mccr8
https://hg.mozilla.org/mozilla-central/rev/63782542588e
https://hg.mozilla.org/mozilla-central/rev/460c2bcc15d3
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Michael, I just realized there seems to be no data being submitted for this probe...  Any idea why?
Flags: needinfo?(michael)
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #18)
> Michael, I just realized there seems to be no data being submitted for this
> probe...  Any idea why?

There is data being collected for the probe - I just forgot to update the commit message when I changed the name of the probe.

The probe is actually called IPC_SYNC_MESSAGE_MANAGER_LATENCY_MS.
Flags: needinfo?(michael)
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: