Closed Bug 1115820 Opened 5 years ago Closed 5 years ago

Add telemetry to ServiceWorkers

Categories

(Core :: DOM: Workers, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: nsm, Assigned: jaoo)

References

Details

Attachments

(1 file, 5 obsolete files)

It would be good to know this feature's uptake. Of the top of my head:
* Every time registration occurs.
* Whenever a document is controlled. I'm not sure if logging on every fetch is realistic.
* If we can log how long a SW stays alive after it is spawned, and what it does during that time, we could know more about managing their lifetime.
Assignee: nobody → josea.olivera
Status: NEW → ASSIGNED
Attached patch WIP (obsolete) — Splinter Review
Logs every time registration occurs and whenever a document is controlled.
Attached patch v1 (obsolete) — Splinter Review
In this patch we:
- Count how many registrations occur.
- Count whenever a document is controlled.
- Count ServiceWorkers scripts that are not in the cache whenever there is a registration.
- Tracking how long a ServiceWorker stays alive after it is spawned.

I don't know what to log about what the service worker does during the time it is alive. On the other hand what is logged in the patch might not make sense so please let me know about it and some feedback in general please. Thanks!
Attachment #8589722 - Attachment is obsolete: true
Attachment #8590805 - Flags: feedback?(nsm.nikhil)
OS: Linux → All
Hardware: x86_64 → All
Version: 33 Branch → Trunk
Comment on attachment 8590805 [details] [diff] [review]
v1

Review of attachment 8590805 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/RuntimeService.cpp
@@ +1694,5 @@
> +    AssertIsOnMainThread();
> +    Telemetry::AccumulateTimeDelta(Telemetry::SERVICE_WORKER_STAYS_ALIVE,
> +                                   aWorkerPrivate->SpawnTimeStamp());
> +  }
> +  

trailing whitespace

::: dom/workers/WorkerPrivate.h
@@ +188,5 @@
>    bool mMainThreadObjectsForgotten;
>    WorkerType mWorkerType;
>    TimeStamp mCreationTimeStamp;
>    TimeStamp mNowBaseTimeStamp;
> +  TimeStamp mSpawnTimeStamp;

Just use CreationTimeStamp.

::: toolkit/components/telemetry/Histograms.json
@@ +7979,5 @@
> +    "expires_in_version": "50",
> +    "kind": "count",
> +    "description": "Count whenever a document is controlled."
> +  },
> +  "SERVICE_WORKER_NOT_IN_CACHE": {

I'd call this SERVICE_WORKER_UPDATED and update the description accordingly.

@@ +7984,5 @@
> +    "expires_in_version": "50",
> +    "kind": "boolean",
> +    "description": "Count ServiceWorkers scripts that are not in the cache whenever there is a registration."
> +  },
> +  "SERVICE_WORKER_STAYS_ALIVE": {

SERVICE_WORKER_LIFE_TIME
Attachment #8590805 - Flags: feedback?(nsm.nikhil) → feedback+
Attached patch v2 (obsolete) — Splinter Review
Addresses review comments.
Attachment #8590805 - Attachment is obsolete: true
Comment on attachment 8593453 [details] [diff] [review]
v2

Would you guys mind to review this please? Thanks!

:mreid, I wanted to review request at :rvitillo but he is not accepting review request. We reviewed me a similar bug a few weeks ago. I set this expiring in 50 version as we did with the other telemetry probes we added for service workers.
Attachment #8593453 - Flags: review?(nsm.nikhil)
Attachment #8593453 - Flags: review?(mreid)
Attachment #8593453 - Flags: review?(nsm.nikhil) → review+
Attachment #8593453 - Flags: review?(mreid) → review?(vdjeric)
I'm not that familiar with client-side telemetry, hopefully :vladan can take a look.
Comment on attachment 8593453 [details] [diff] [review]
v2

Review of attachment 8593453 [details] [diff] [review]:
-----------------------------------------------------------------

How do you plan to interpret the count histograms? Are you going to look at the average count per session?

::: toolkit/components/telemetry/Histograms.json
@@ +7980,5 @@
>      "kind": "count",
>      "description": "Tracking whether a ServiceWorker spawn gets queued due to hitting max workers per domain limit"
> +  },
> +  "SERVICE_WORKER_REGISTRATIONS": {
> +    "expires_in_version": "50",

add alert_emails fields so you receive automatic notifications when one of these measures starts fluctuating as a result of a bug or change in usage patterns

@@ +7991,5 @@
> +    "description": "Count whenever a document is controlled."
> +  },
> +  "SERVICE_WORKER_UPDATED": {
> +    "expires_in_version": "50",
> +    "kind": "boolean",

do you ever accumulate a value of "False" in this histogram? If not, you should use a count histogram

@@ +7997,5 @@
> +  },
> +  "SERVICE_WORKER_LIFE_TIME": {
> +    "expires_in_version": "50",
> +    "kind": "exponential",
> +    "high": "5000",

Should the max be higher than 5 seconds?
Attachment #8593453 - Flags: review?(vdjeric)
also note that our existing tools (like the Telemetry dashboard and histogram change-detector) don't support count histograms & keyed histograms at the moment. The other histograms types are well supported. We'll be adding support for these during Spring 2015
(In reply to José Antonio Olivera Ortega [:jaoo] from comment #5)
> :mreid, I wanted to review request at :rvitillo but he is not accepting
> review request. We reviewed me a similar bug a few weeks ago. I set this
> expiring in 50 version as we did with the other telemetry probes we added
> for service workers.

Sorry about that, while changing some of my preferences in Bugzilla the other day I must have mistakenly blocked review requests. You should be able to r? me now.
See Also: → 1156857
Depends on: 1156857
Attached patch v3 (obsolete) — Splinter Review
Address review comments made at comment 7. Carrying out r=nsm.

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #7)
> Comment on attachment 8593453 [details] [diff] [review]
> v2
> 
> Review of attachment 8593453 [details] [diff] [review]:
> -----------------------------------------------------------------

Thanks for reviewing this!

> How do you plan to interpret the count histograms? Are you going to look at
> the average count per session?

More or less. There are other service worker related count histograms. We are -somehow- able to extrapolate how the plumbing behaves taking into account all of them.

> ::: toolkit/components/telemetry/Histograms.json
> @@ +7980,5 @@
> >      "kind": "count",
> >      "description": "Tracking whether a ServiceWorker spawn gets queued due to hitting max workers per domain limit"
> > +  },
> > +  "SERVICE_WORKER_REGISTRATIONS": {
> > +    "expires_in_version": "50",
> 
> add alert_emails fields so you receive automatic notifications when one of
> these measures starts fluctuating as a result of a bug or change in usage
> patterns

We have not added the `alert_emails` field in the other histograms, so we will keep these as the other ones if you don't mind.

> @@ +7991,5 @@
> > +    "description": "Count whenever a document is controlled."
> > +  },
> > +  "SERVICE_WORKER_UPDATED": {
> > +    "expires_in_version": "50",
> > +    "kind": "boolean",
> 
> do you ever accumulate a value of "False" in this histogram? If not, you
> should use a count histogram

Changed, now it's a count one.

> @@ +7997,5 @@
> > +  },
> > +  "SERVICE_WORKER_LIFE_TIME": {
> > +    "expires_in_version": "50",
> > +    "kind": "exponential",
> > +    "high": "5000",
> 
> Should the max be higher than 5 seconds?

Thanks for raising this. I set that time two minutes.

We could land this if the review passes but I prefer to wait until bug 1156857 gets fixed so we could see how the telemetry info is being collected in the child (right now I'm testing this with e10s disabled).

:vladan, could have a look at the patch again please? Thanks!

Try results at:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2daf65d2fdaf
Attachment #8593453 - Attachment is obsolete: true
Attachment #8595828 - Flags: review?(vdjeric)
Comment on attachment 8595828 [details] [diff] [review]
v3

Review of attachment 8595828 [details] [diff] [review]:
-----------------------------------------------------------------

The alert_emails field is strongly encouraged because it helps identify the owners of the histogram and they don't have to manually check for changes in the data in the dashboard, instead they get automatic notifications.
In this case, an email address might not be appropriate. Can you just add the sentence "File bugs in Core::DOM in case of a Telemetry regression." to the Description field?
Attachment #8595828 - Flags: review?(vdjeric) → review+
Attached patch v4 (obsolete) — Splinter Review
(In reply to Vladan Djeric (:vladan), please needinfo -- PTO April 25-May 3 from comment #11)
> Comment on attachment 8595828 [details] [diff] [review]
> v3

Address review comments made at comment 11. Carrying out r=nsm,vladan.

> Review of attachment 8595828 [details] [diff] [review]:
> -----------------------------------------------------------------

Thanks for the review.

> Can you just add
> the sentence "File bugs in Core::DOM in case of a Telemetry regression." to
> the Description field?

Sure. Done!

PS. Tested this with patch from bug 1156857 applied. Everything seems to work now. I'll land this as soon as bug 1156857 gets landed.
Attachment #8595828 - Attachment is obsolete: true
Depends on: 1169402
Attachment #8599771 - Attachment is obsolete: true
The Telemetry blockers have been fixed (thanks guys) and now everything is fine so we could land this. Let's see how Try looks and land this.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=da1e50197439
(In reply to José Antonio Olivera Ortega [:jaoo] from comment #14)
> The Telemetry blockers have been fixed (thanks guys) and now everything is
> fine so we could land this. Let's see how Try looks and land this.
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=da1e50197439

Looking good, lets land it.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/601d3d227be1
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.