Closed
Bug 1139513
Opened 10 years ago
Closed 10 years ago
Warn and gather data if ServiceWorker hits max workers per domain limit
Categories
(Core :: DOM: Workers, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
Tracking | Status | |
---|---|---|
firefox39 | --- | fixed |
People
(Reporter: nsm, Assigned: jaoo, Mentored)
References
Details
(Whiteboard: [good first bug][lang=C++])
Attachments
(1 file, 6 obsolete files)
6.53 KB,
patch
|
rvitillo
:
review+
|
Details | Diff | Splinter Review |
This should be a fairly easy bug requiring adding one telemetry probe and some warnings to RuntimeService::RegisterWorker().
1) Warn to window console or browser console (when no window) if a ServiceWorker spawn gets queued due to hitting max workers per domain limit.
2) Add a Telemetry probe to count when a ServiceWorker hits the limit. We may also need another probe to count that a SW was spawned, so we get a percentage of failures, unless metrics has some other way to determine this.
Andrea, would you be willing to mentor since step 1 is the big piece and you know about it?
Reporter | ||
Updated•10 years ago
|
Updated•10 years ago
|
Status: NEW → ASSIGNED
Updated•10 years ago
|
Mentor: amarchesini
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 2•10 years ago
|
||
Just show a warning in the web console when a ServiceWorker spawn gets queued.
Assignee | ||
Comment 3•10 years ago
|
||
Added telemetry probes for service worker spawns. I'm planing to add a couple of more probes for the number of domains and the number of domains hitting the limit.
Attachment #8576971 -
Attachment is obsolete: true
Assignee | ||
Comment 4•10 years ago
|
||
(In reply to José Antonio Olivera Ortega [:jaoo] from comment #3)
> I'm planing to add a
> couple of more probes for the number of domains and the number of domains
> hitting the limit.
Probes for the hit-the-limit thing I commented in the last version of the patch.
Assignee | ||
Updated•10 years ago
|
Attachment #8577501 -
Attachment is obsolete: true
Assignee | ||
Comment 5•10 years ago
|
||
Comment on attachment 8578731 [details] [diff] [review]
WIP
Andrea, I would be nice to have some feedback at this point. Would you mind to have a look a it? Thanks!
PS. When you are back from your holidays! ;)
Attachment #8578731 -
Flags: feedback?(amarchesini)
Reporter | ||
Comment 6•10 years ago
|
||
Comment on attachment 8578731 [details] [diff] [review]
WIP
Review of attachment 8578731 [details] [diff] [review]:
-----------------------------------------------------------------
Patch wide fixes:
1) You need to call telemetry only when it is a service worker, and not for both service and shared workers.
2) Always AssertIsOnMainThread() before calling Telemetry, unless the containing block already does so.
::: dom/workers/RuntimeService.cpp
@@ +1420,5 @@
> aWorkerPrivate->IsServiceWorker();
> if (isSharedOrServiceWorker) {
> AssertIsOnMainThread();
>
> + Telemetry::Accumulate(Telemetry::SERVICE_WORKER_SPAWN_ATTEMPS, 1);
Only for serviceworkers (use aWorkerPrivate->IsServiceWorker()), don't count shared workers.
@@ +1455,5 @@
>
> domainInfo = new WorkerDomainInfo();
> domainInfo->mDomain = domain;
> + if (isSharedOrServiceWorker && !domainInfo->mDomainTracked) {
> + Telemetry::Accumulate(Telemetry::DOMAIN_WITH_SERVICE_WORKERS, 1);
Nit: Whitespace.
@@ +1473,5 @@
> + // ServiceWorker spawn gets queued due to hitting max workers per domain
> + // limit so let's log a warning.
> + // Note: aWorkerPrivate->GetDocument() call might result nullptr due to
> + // no window so the message warning will show up in the browser console.
> + nsContentUtils::ReportToConsole(nsIScriptError::warningFlag,
Before you call this, AssertIsOnMainThread().
@@ +1489,5 @@
> else if (parent) {
> domainInfo->mChildWorkerCount++;
> + if (isSharedOrServiceWorker) {
> + Telemetry::Accumulate(Telemetry::SERVICE_WORKER_WAS_SPAWNED, 1);
> + }
SWs can never have a parent (yet), so this block is not needed
@@ +1495,5 @@
> else {
> domainInfo->mActiveWorkers.AppendElement(aWorkerPrivate);
> + if (isSharedOrServiceWorker) {
> + Telemetry::Accumulate(Telemetry::SERVICE_WORKER_WAS_SPAWNED, 1);
> + }
If you put this at the end of RegisterWorker, right before |return true;| then you will only count workers that really did get a thread created for them, which is what we want.
@@ +1515,5 @@
> }
>
> // From here on out we must call UnregisterWorker if something fails!
> + // TODO: Should we decrease the counters of service workers spawns if the
> + // worker ends up unregistered?
You don't need this if you move the probe as suggested above.
::: dom/workers/RuntimeService.h
@@ +47,5 @@
> nsTArray<WorkerPrivate*> mQueuedWorkers;
> nsClassHashtable<nsCStringHashKey, SharedWorkerInfo> mSharedWorkerInfos;
> uint32_t mChildWorkerCount;
> + bool mDomainTracked;
> + bool mDomainHitMaxWorkersLimitTracked;
You set these to true but never use them. Is there a follow up? I don't think you need to preserve this state.
::: toolkit/components/telemetry/Histograms.json
@@ +7678,5 @@
> "expires_in_version": "never",
> "kind": "count",
> "description": "Count plugin hang notices in e10s"
> + },
> + "SERVICE_WORKER_SPAWN_ATTEMPS": {
Nit: ATTEMPTS
@@ +7681,5 @@
> + },
> + "SERVICE_WORKER_SPAWN_ATTEMPS": {
> + "expires_in_version": "never",
> + "kind": "count",
> + "description": "Tracking ServiceWorker spawn attemps"
description should be "Count attempts to spawn a ServiceWorker for a domain"
Attachment #8578731 -
Flags: feedback?(amarchesini)
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Nikhil Marathe [:nsm] (needinfo? please) from comment #6)
> Comment on attachment 8578731 [details] [diff] [review]
> WIP
>
> Review of attachment 8578731 [details] [diff] [review]:
> -----------------------------------------------------------------
Thanks for the review!
> Patch wide fixes:
> 1) You need to call telemetry only when it is a service worker, and not for
> both service and shared workers.
> 2) Always AssertIsOnMainThread() before calling Telemetry, unless the
> containing block already does so.
Done!
> ::: dom/workers/RuntimeService.h
> @@ +47,5 @@
> > nsTArray<WorkerPrivate*> mQueuedWorkers;
> > nsClassHashtable<nsCStringHashKey, SharedWorkerInfo> mSharedWorkerInfos;
> > uint32_t mChildWorkerCount;
> > + bool mDomainTracked;
> > + bool mDomainHitMaxWorkersLimitTracked;
>
> You set these to true but never use them. Is there a follow up? I don't
> think you need to preserve this state.
I use them. There is no such follow up work. They are useful to have a way to avoid reporting twice the same domain hit the limit.
I'll ping you today so we could go over the patch and see whether this domain probes are useful or no.
Attachment #8578731 -
Attachment is obsolete: true
Reporter | ||
Comment 8•10 years ago
|
||
Comment on attachment 8579318 [details] [diff] [review]
WIP
Review of attachment 8579318 [details] [diff] [review]:
-----------------------------------------------------------------
The reason I'm not in favour of tracking it across domains is that it doesn't give us much more useful information.
We measure "X SWs hit the limit out of Y attempts". "A domains hit the limit out of Z domains" does not add much to it. Also, using booleans like this is wrong because:
1) user visits example.com, spawns various workers flips counts and booleans etc. So telemetry is bumped up once.
2) user closes all tabs for example.com, at this point runtime service removes the domain info from domain map
3) step 1 happens again.
Which means we measure how many times the domaininfo is created and then sw is spawned and hits the limit. the domain infos don't persist, another reason why I think having this isn't useful.
::: dom/workers/RuntimeService.cpp
@@ +1458,5 @@
> NS_ASSERTION(!parent, "Shouldn't have a parent here!");
>
> domainInfo = new WorkerDomainInfo();
> domainInfo->mDomain = domain;
> + if (isServiceWorker && !domainInfo->mDomainTracked) {
mDomainTracked will always be false here since it was just created.
::: dom/workers/RuntimeService.h
@@ +46,5 @@
> nsTArray<WorkerPrivate*> mActiveWorkers;
> nsTArray<WorkerPrivate*> mQueuedWorkers;
> nsClassHashtable<nsCStringHashKey, SharedWorkerInfo> mSharedWorkerInfos;
> uint32_t mChildWorkerCount;
> + bool mDomainTracked;
if you decide to keep it, rename to mTelemetryUsesServiceWorker.
@@ +47,5 @@
> nsTArray<WorkerPrivate*> mQueuedWorkers;
> nsClassHashtable<nsCStringHashKey, SharedWorkerInfo> mSharedWorkerInfos;
> uint32_t mChildWorkerCount;
> + bool mDomainTracked;
> + bool mDomainHitMaxWorkersLimitTracked;
similarly, mTelemetryHitMaxWorkersLimit.
Assignee | ||
Comment 9•10 years ago
|
||
(In reply to Nikhil Marathe [:nsm] (needinfo? please) from comment #8)
> Comment on attachment 8579318 [details] [diff] [review]
> WIP
>
> Review of attachment 8579318 [details] [diff] [review]:
> -----------------------------------------------------------------
Thanks for explain your reason in detail.
> The reason I'm not in favour of tracking it across domains is that it
> doesn't give us much more useful information.
Let's remove the domain related probes then.
I guess this's ready for review. Would you mind please? Thanks!
Attachment #8579318 -
Attachment is obsolete: true
Attachment #8581514 -
Flags: review?(nsm.nikhil)
Reporter | ||
Comment 10•10 years ago
|
||
Comment on attachment 8581514 [details] [diff] [review]
v1
Review of attachment 8581514 [details] [diff] [review]:
-----------------------------------------------------------------
looks good. Needs worker peer review.
::: dom/locales/en-US/chrome/dom/dom.properties
@@ +160,5 @@
> # LOCALIZATION NOTE: Do not translate "IndexedDB".
> IndexedDBTransactionAbortNavigation=An IndexedDB transaction that was not yet complete has been aborted due to page navigation.
> # LOCALIZATION NOTE (WillChangeBudgetWarning): Do not translate Will-change, %1$S,%2$S,%3$S are numbers.
> WillChangeBudgetWarning=Will-change memory consumption is too high. Surface area covers %1$S pixels, budget is the document surface area multiplied by %2$S (%3$S pixels). All occurences of will-change in the document are ignored when over budget.
> +HittingMaxWorkersPerDomain=ServiceWorker spawn gets queued due to hitting max workers per domain limit.
Nit: s/gets/got
Attachment #8581514 -
Flags: review?(nsm.nikhil) → review?(amarchesini)
Comment 11•10 years ago
|
||
Comment on attachment 8581514 [details] [diff] [review]
v1
Review of attachment 8581514 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm but before landing I want to know if we really need these 2 additional telemetry logs.
::: dom/workers/RuntimeService.cpp
@@ +27,5 @@
> #include "jsfriendapi.h"
> #include "mozilla/ArrayUtils.h"
> #include "mozilla/CycleCollectedJSRuntime.h"
> #include "mozilla/TimeStamp.h"
> +#include "mozilla/Telemetry.h"
alphabetic order.
@@ +1418,5 @@
>
> + bool isServiceWorker = aWorkerPrivate->IsServiceWorker();
> + if (isServiceWorker) {
> + AssertIsOnMainThread();
> + Telemetry::Accumulate(Telemetry::SERVICE_WORKER_SPAWN_ATTEMPTS, 1);
I don't think we want this. Have you spoken with nsm about this?
@@ +1551,5 @@
> }
>
> + if (isServiceWorker) {
> + AssertIsOnMainThread();
> + Telemetry::Accumulate(Telemetry::SERVICE_WORKER_WAS_SPAWNED, 1);
Same for this.
::: toolkit/components/telemetry/Histograms.json
@@ +7688,5 @@
> + "expires_in_version": "never",
> + "kind": "count",
> + "description": "Count ServiceWorkers that really did get a thread created for them"
> + },
> + "SERVICE_WORKER_SPAWN_GETS_QUEUED": {
Actually, this is the only telemetry is required for this bug, right?
The other 2 are not related to this bug: I don't think we want to know how many SWs are created.
Attachment #8581514 -
Flags: review?(amarchesini) → review+
Comment 12•10 years ago
|
||
> looks good. Needs worker peer review.
nsm, what do you think about these 2 additional telemetry logs? attempt/spawned?
Comment 13•10 years ago
|
||
Comment on attachment 8581514 [details] [diff] [review]
v1
Review of attachment 8581514 [details] [diff] [review]:
-----------------------------------------------------------------
bent, we need a review from a worker peer. The patch looks good to me.
Attachment #8581514 -
Flags: feedback?(bent.mozilla)
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Andrea Marchesini (:baku) from comment #12)
> > looks good. Needs worker peer review.
>
> nsm, what do you think about these 2 additional telemetry logs?
> attempt/spawned?
Flags: needinfo?(nsm.nikhil)
Reporter | ||
Comment 15•10 years ago
|
||
(In reply to Andrea Marchesini (:baku) from comment #12)
> > looks good. Needs worker peer review.
>
> nsm, what do you think about these 2 additional telemetry logs?
> attempt/spawned?
I'd like to have it in to get a good idea of the relative number of SWs that get queued. That way, if we see a large percentage of queues, we will have to look at alternate strategies for SWs like a thread pool and aggressive termination. If it is fairly small, we can just tweak the limits a little.
:jaoo feel free to land once :bent approves. Thanks for the patch!
Flags: needinfo?(nsm.nikhil)
Comment on attachment 8581514 [details] [diff] [review]
v1
Review of attachment 8581514 [details] [diff] [review]:
-----------------------------------------------------------------
This looks great, thanks! I just have two suggestions:
::: dom/locales/en-US/chrome/dom/dom.properties
@@ +160,5 @@
> # LOCALIZATION NOTE: Do not translate "IndexedDB".
> IndexedDBTransactionAbortNavigation=An IndexedDB transaction that was not yet complete has been aborted due to page navigation.
> # LOCALIZATION NOTE (WillChangeBudgetWarning): Do not translate Will-change, %1$S,%2$S,%3$S are numbers.
> WillChangeBudgetWarning=Will-change memory consumption is too high. Surface area covers %1$S pixels, budget is the document surface area multiplied by %2$S (%3$S pixels). All occurences of will-change in the document are ignored when over budget.
> +HittingMaxWorkersPerDomain=ServiceWorker spawn gets queued due to hitting max workers per domain limit.
Nit: I think this would be better:
"A ServiceWorker could not be started immediately because other documents in the same origin are already using the maximum number of workers. The ServiceWorker is now queued and will be started after some of the other workers have completed."
And please mark with
# LOCALIZATION NOTE: Do not translate "ServiceWorker".
so that our localizers know that "ServiceWorker" is something special.
::: dom/workers/RuntimeService.cpp
@@ +1415,5 @@
> }
>
> nsCString sharedWorkerScriptSpec;
>
> + bool isServiceWorker = aWorkerPrivate->IsServiceWorker();
Nit: Make this const
Attachment #8581514 -
Flags: feedback?(bent.mozilla) → feedback+
Assignee | ||
Comment 17•10 years ago
|
||
Addresses review comments from comment 11 and comment 16.
Ben, would mind to have a look a it again please? Thank!
PS. Should we request review at someone from the telemetry team?
Attachment #8581514 -
Attachment is obsolete: true
Attachment #8583046 -
Flags: review?(bent.mozilla)
Updated•10 years ago
|
Attachment #8583046 -
Flags: review?(bent.mozilla) → review+
(In reply to José Antonio Olivera Ortega [:jaoo] from comment #17)
> PS. Should we request review at someone from the telemetry team?
If you'd like, but I think this is fine. Thanks!
Assignee | ||
Comment 19•10 years ago
|
||
Comment on attachment 8583046 [details] [diff] [review]
v2
I'd like :rvitillo to have a look at the patch so he could let's know if this is ready to go.
Attachment #8583046 -
Flags: review?(rvitillo)
Comment 20•10 years ago
|
||
Ben, could you confirm that this probe should never expire? Just making sure "never" is not misused as a default value.
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 21•10 years ago
|
||
I went ahead and pushed this to try. Find the results at https://treeherder.mozilla.org/#/jobs?repo=try&revision=9899f1f39e47 please.
Reporter | ||
Comment 22•10 years ago
|
||
(In reply to rvitillo@mozilla.com from comment #20)
> Ben, could you confirm that this probe should never expire? Just making sure
> "never" is not misused as a default value.
Good point, something like 50 should be reasonable.
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 23•10 years ago
|
||
Setting expire to gecko 50.
Carrying out r=bent.
Thanks!
Attachment #8583046 -
Attachment is obsolete: true
Attachment #8583046 -
Flags: review?(rvitillo)
Attachment #8583305 -
Flags: review?(rvitillo)
Updated•10 years ago
|
Attachment #8583305 -
Flags: review?(rvitillo) → review+
Assignee | ||
Comment 24•10 years ago
|
||
Keywords: checkin-needed
Comment 25•10 years ago
|
||
Keywords: checkin-needed
Comment 26•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in
before you can comment on or make changes to this bug.
Description
•