Closed Bug 1468550 Opened 6 years ago Closed 6 years ago

Track webextension activity

Categories

(WebExtensions :: Experiments, enhancement, P3)

enhancement

Tracking

(firefox63 fixed)

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: tarek, Assigned: tarek)

References

Details

(Whiteboard: [perf-tools])

Attachments

(1 file)

Let's see if we can come up with a way to detect webextension activity within the event scheduler.

use case: detect webextensions that are using a lot of resources.

Some feedback from Ehsan:

We don't have a similar mechanism to the TabGroup tagging at the runnable level for extensions.  However, if you have a principal, you can call BasePrincipal::AddonPolicy() <https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/caps/BasePrincipal.h#107> in order to decide whether the principal belongs to an add-on (it maps to codebase principals belonging to a moz-extension:// URI).  Using this, it may be possible to do some rough estimation of which runnables are dispatches on behalf of extensions, for example at dispatch time by looking at the JS call stack and obtaining a subject principal and looking at the add-on policy from that principal.

However that would give you a subset of all runnables that come from an extension and not all of them.  In order to get them all, we would need to do a massive amount of work to properly tag runnables similarly to how we tagged them with TabGroup information, not sure if that would be worth doing...
Assignee: nobody → tarek
Product: Toolkit → WebExtensions
I've tried to implement a function to get the current web extension id, and call it on every dispatch call that is
happening in DocGroup::Dispatch(), but the AddonPolicy() call is always returning null from there.

There's obvioulsy something I am not doing well, here's my current function:

  nsCString
  GetWebextensionID()
  {
    nsCString extensionID;
    if (!IsJSAPIActive()) {
      return extensionID;
    }
    nsCOMPtr<nsIStackFrame> frame = GetCurrentJSStack(10);
    JSContext* cx = frame ? nsContentUtils::GetCurrentJSContext() : nullptr;
    if (!cx) {
      return extensionID;
    }
    nsIPrincipal* principal = nsContentUtils::SubjectPrincipal(cx);
    if (!principal) {
      return extensionID;
    }
    RefPtr<extensions::WebExtensionPolicy> policy = BasePrincipal::Cast(principal)->AddonP
olicy();
    if (policy) {
      LOG(("****** Getting AddonPolicy ID"));
      policy->Id()->ToUTF8String(extensionID);
    }
    return extensionID;
  }


Any hints? :)
Flags: needinfo?(nika)
Flags: needinfo?(ehsan)
This is a thorny topic.

I've wanted telemetry for this for a while, particularly for content scripts. And we *used* to have accounting for time spent in compartments, that tried to track jank caused by add-ons. But that all got ripped out.

So, that's the first snag. The thing that's probably most important at this point is jank caused by content scripts, which run in Sandboxes attached to web content pages. Most of the events scheduled by those belong to the content page's doc group, though the caller principal will generally be the extension principal when they get scheduled. That may give us something to go on, but I'm not sure how useful the information would be. What's really interesting to me is how much time we're spending in content script compartments when we should be doing things like painting.

The second snag is that extension API calls are proxied to the parent process. It's possible for those to cause significant main process jank if they do expensive operations a lot. Which they do. But those operations aren't tied to the extensions at all, as far as the platform is concerned. A content API calls into the extension framework, at which point content process chrome code sends a message to parent process chrome code, and all of that work is completely divorced from extension principals. The extension framework knows what extension is responsible for it, but the platform does not.

We could, of course, add special-purpose accounting for that, though.
Are you trying to detect Content Scripts? You might want BasePrincipal->ContentScriptAddonPolicy() instead of AddonPolicy() here.
Flags: needinfo?(nika)
Thanks Nika.

Kris, thanks for your feedback as well, that's very interesting. 

1/ first snag

> What's really interesting to me is how much time we're spending in content script compartments when we should be doing things like painting.
 
My current understanding is that if we're able to label "correctly" runnables that are dispatched via the compartment,
we can measure that time, like I do with other runnables via nsThread and nsWorkerThread.

Another approach would be to have that accounting at the Sandbox level maybe ?

2/ second snag

That makes me wonder: how hard would it be from within the extension framework to mark messages that are sent to content then parent process (is that all through IPDL?) as being issued from a given extension ID. 

I am more than happy to try to write patches for getting that info.
Flags: needinfo?(ehsan) → needinfo?(kmaglione+bmo)
I've talked with kmag on IRC and will work on an initial patch that tracks on the JS side the webextension API calls activity (both on the child and on the parent sides) -- most calls are synchronous so we can use that to track both the number of calls and the duration.
Flags: needinfo?(kmaglione+bmo)
Priority: -- → P3
Hey Kris, this patch is just a first step to see how I can collect calls done by webextensions

Here, I just grab the ones sent to the parent process and count them by webext id and API path.
Do you think this would be a good first set of counters?

According to my tests on various web extensions (including "bad" ones), it looks like it gives us
a first hint on the general activity of webextensions.

Let me know what you think

(A real patch would of course consist of a clean PerformanceCounter JS class that can be queried from about:performance)
Flags: needinfo?(kmaglione+bmo)
Talked with Kris -- this looks like the right direction (except that we should measure context.withPendingBrowser and use DefaultMap)
Flags: needinfo?(kmaglione+bmo)
Whiteboard: perf-tools]
Whiteboard: perf-tools] → [perf-tools]
Comment on attachment 8991625 [details]
Bug 1468550 - Track webextension activity -

https://reviewboard.mozilla.org/r/256544/#review264772

Sorry for the delay. My review queue is a bit of a mess.

This mostly looks good. Just some minor issues.

::: toolkit/components/extensions/ExtensionParent.jsm:20
(Diff revision 5)
>  
>  var EXPORTED_SYMBOLS = ["ExtensionParent"];
>  
>  ChromeUtils.import("resource://gre/modules/Services.jsm");
>  ChromeUtils.import("resource://gre/modules/XPCOMUtils.jsm");
> +ChromeUtils.import("resource://gre/modules/Preferences.jsm");

Please don't use Preferences.jsm. We're trying to get rid of it.

::: toolkit/components/extensions/ExtensionParent.jsm:64
(Diff revision 5)
>  
>  const BASE_SCHEMA = "chrome://extensions/content/schemas/manifest.json";
>  const CATEGORY_EXTENSION_MODULES = "webextension-modules";
>  const CATEGORY_EXTENSION_SCHEMAS = "webextension-schemas";
>  const CATEGORY_EXTENSION_SCRIPTS = "webextension-scripts";
> +const kTimingEnabledPref = "extensions.webextensions.enablePerformanceCounters";

We don't use `k*` constants in JS. All-caps-and-underscores, please.

::: toolkit/components/extensions/ExtensionParent.jsm:769
(Diff revision 5)
>      Services.mm.addMessageListener("API:CloseProxyContext", this, true);
>      Services.mm.addMessageListener("API:Call", this);
>      Services.mm.addMessageListener("API:AddListener", this);
>      Services.mm.addMessageListener("API:RemoveListener", this);
>  
> +    this.timingEnabled = Preferences.get(kTimingEnabledPref, false);

We should probably use XPCOMUtils.defineLazyPreferenceGetter for this

::: toolkit/components/extensions/ExtensionParent.jsm:907
(Diff revision 5)
>      }
>    },
>  
> +  storeExecutionTime(webExtensionId, apiPath, duration) {
> +    let apiCounter = this.performanceCounters.get(webExtensionId).get(apiPath);
> +    // XXX deal with potential overflow

JavaScript numbers are double precision floats. There's no chance of overflow before the heat death of the universe. That said, after we get to 53 bits of mantissa and a couple bits of exponent, we'll lose so much precision that adding durations will stop having an effect.

::: toolkit/components/extensions/ExtensionParent.jsm:939
(Diff revision 5)
> +      if (this.timingEnabled)  {
> +        let webExtId = data.childId.split(".")[0];
> +        let start = Cu.now();
> +        result = context.withPendingBrowser(pendingBrowser,
> +                                            () => fun(...args));
> +        let end = Cu.now();
> +        this.storeExecutionTime(webExtId, data.path, end - start);
> +      } else {
> +        result = context.withPendingBrowser(pendingBrowser,
> -                                              () => fun(...args));
> +                                            () => fun(...args));
> +      }

We typically write things like this as something like:

    let result = this.withTiming(() => ...);

The withTiming function can check whether timing is enabled.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:18
(Diff revision 5)
> +function background() {
> +  browser.test.log("running background script");
> +
> +  browser.test.onMessage.addListener((x, y) => {
> +    browser.test.assertEq(x, 10, "x is 10");
> +    browser.test.assertEq(y, 20, "y is 20");
> +
> +    browser.test.notifyPass("background test passed");
> +  });
> +
> +  browser.test.sendMessage("running", 1);
> +}

We typically define background scripts inline in the extension data object.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:19
(Diff revision 5)
> +"use strict";
> +
> +/* global sendAsyncMessage */
> +
> +function background() {
> +  browser.test.log("running background script");

Please capitalize all log messages.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:28
(Diff revision 5)
> +    browser.test.assertEq(y, 20, "y is 20");
> +
> +    browser.test.notifyPass("background test passed");
> +  });
> +
> +  browser.test.sendMessage("running", 1);

This isn't necessary. startup() doesn't resolve until the background script has finished executing.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:31
(Diff revision 5)
> +let extensionData = {
> +  background,
> +};

We typically define extensionData objects inline in the loadExtension arguments.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:37
(Diff revision 5)
> +  background,
> +};
> +
> +add_task(async function test_background() {
> +  let extension = ExtensionTestUtils.loadExtension(extensionData);
> +  info("load complete");

Not necessary.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:38
(Diff revision 5)
> +};
> +
> +add_task(async function test_background() {
> +  let extension = ExtensionTestUtils.loadExtension(extensionData);
> +  info("load complete");
> +  let [, x] = await Promise.all([extension.startup(), extension.awaitMessage("running")]);

No need for Promise.all. Just await startup() and then await the message.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:39
(Diff revision 5)
> +
> +add_task(async function test_background() {
> +  let extension = ExtensionTestUtils.loadExtension(extensionData);
> +  info("load complete");
> +  let [, x] = await Promise.all([extension.startup(), extension.awaitMessage("running")]);
> +  is(x, 1, "got correct value from extension");

We generally use `equal` rather than `is`.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:43
(Diff revision 5)
> +  let [, x] = await Promise.all([extension.startup(), extension.awaitMessage("running")]);
> +  is(x, 1, "got correct value from extension");
> +  info("startup complete");
> +  extension.sendMessage(10, 20);
> +  await extension.awaitFinish();
> +  info("test complete");

Not necessary. awaitFinish will log something similar.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:45
(Diff revision 5)
> +  info("startup complete");
> +  extension.sendMessage(10, 20);
> +  await extension.awaitFinish();
> +  info("test complete");
> +  await extension.unload();
> +  info("extension unloaded successfully");

Not necessary.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:47
(Diff revision 5)
> +  let chromeScript = SpecialPowers.loadChromeScript(() => {
> +    ChromeUtils.import("resource://gre/modules/ExtensionParent.jsm");
> +    let {
> +      ParentAPIManager,
> +    } = ExtensionParent;
> +    // grab one counter
> +    let counters = ParentAPIManager.performanceCounters;
> +    let counter = counters.get("webcompat@mozilla").get("contentScripts.register");
> +    sendAsyncMessage("done", counter);
> +  });

This whole test should probbly be implemented as an xpcshell test. That would remove the need for the loadChromeScript indirection, and would also run much more quickly.

::: toolkit/components/extensions/test/mochitest/test_ext_performance_counters.html:54
(Diff revision 5)
> +    let {
> +      ParentAPIManager,
> +    } = ExtensionParent;
> +    // grab one counter
> +    let counters = ParentAPIManager.performanceCounters;
> +    let counter = counters.get("webcompat@mozilla").get("contentScripts.register");

This should be checking something done by the test extension, not an arbitrary system extension.
Attachment #8991625 - Flags: review?(kmaglione+bmo)
Blocks: 1477717
Comment on attachment 8991625 [details]
Bug 1468550 - Track webextension activity -

https://reviewboard.mozilla.org/r/256544/#review266858

r=me with nit addressed. Thanks!

::: toolkit/components/extensions/ExtensionParent.jsm:895
(Diff revision 7)
> +    let start = Cu.now();
> +    let result = await callable();
> +    let end = Cu.now();

I think this needs to be something like:

    let start = Cu.now();
    try {
      return callable();
    } finally {
      let end = Cu.now();
      this.storeExecutionTime(webExtId, data.path, end - start);
    }

Otherwise, you're going to wind up storing the time it takes for the promise to resolve, which doesn't have very much to do with the amount of CPU time it uses.
Attachment #8991625 - Flags: review?(kmaglione+bmo) → review+
Pushed by tziade@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9fee865dcbf5
Track webextension activity - r=kmag
Backed out changeset 9fee865dcbf5 (bug 1468550) for xpcshell failures at toolkit/components/extensions/test/xpcshell/test_ext_performance_counters.js

Backout: https://hg.mozilla.org/integration/autoland/rev/1ca6d0d30c038116edceaaada7597543f7c4f536

Failure push: 
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=9fee865dcbf5d3944a6081b982d453ad1230c601

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=190561459&repo=autoland&lineNumber=1355

[task 2018-07-27T16:19:14.096Z] 16:19:14     INFO -  TEST-START | xpcshell.ini:toolkit/components/extensions/test/xpcshell/test_ext_performance_counters.js
[task 2018-07-27T16:24:14.096Z] 16:24:14  WARNING -  TEST-UNEXPECTED-TIMEOUT | xpcshell.ini:toolkit/components/extensions/test/xpcshell/test_ext_performance_counters.js | Test timed out
[task 2018-07-27T16:24:14.097Z] 16:24:14     INFO -  TEST-INFO took 300001ms
[task 2018-07-27T16:24:14.200Z] 16:24:14     INFO -  xpcshell return code: -1
[task 2018-07-27T16:24:29.884Z] 16:24:29  WARNING -  TEST-UNEXPECTED-FAIL | Received SIGINT (control-C), so stopped run. (Use --keep-going to keep running tests after killing one with SIGINT)
[task 2018-07-27T16:24:29.884Z] 16:24:29     INFO -  INFO | Result summary:
[task 2018-07-27T16:24:29.884Z] 16:24:29     INFO -  INFO | Passed: 11
[task 2018-07-27T16:24:29.885Z] 16:24:29  WARNING -  INFO | Failed: 1
[task 2018-07-27T16:24:29.885Z] 16:24:29  WARNING -  One or more unittests failed.
[task 2018-07-27T16:24:29.885Z] 16:24:29     INFO -  INFO | Todo: 0
[task 2018-07-27T16:24:29.886Z] 16:24:29     INFO -  INFO | Retried: 0
[task 2018-07-27T16:24:29.886Z] 16:24:29     INFO -  SUITE-END | took 536s
[task 2018-07-27T16:24:29.904Z] 16:24:29    ERROR - Return code: 1
[task 2018-07-27T16:24:29.904Z] 16:24:29     INFO - TinderboxPrint: xpcshell<br/>11/<em class="testfail">1</em>/0
[task 2018-07-27T16:24:29.905Z] 16:24:29     INFO - ##### xpcshell log ends
[task 2018-07-27T16:24:29.905Z] 16:24:29  WARNING - # TBPL WARNING #
[task 2018-07-27T16:24:29.905Z] 16:24:29  WARNING - setting return code to 1
[task 2018-07-27T16:24:29.905Z] 16:24:29  WARNING - The xpcshell suite: xpcshell ran with return status: WARNING
[task 2018-07-27T16:24:29.906Z] 16:24:29     INFO - Running post-action listener: _package_coverage_data
[task 2018-07-27T16:24:29.906Z] 16:24:29     INFO - Running post-action listener: _resource_record_post_action
[task 2018-07-27T16:24:29.906Z] 16:24:29     INFO - Running post-action listener: stop_emulator
[task 2018-07-27T16:24:29.913Z] 16:24:29     INFO - Killing every process called emulator64-arm
[task 2018-07-27T16:24:29.913Z] 16:24:29     INFO - Killing pid 473.
[task 2018-07-27T16:24:29.914Z] 16:24:29     INFO - [mozharness: 2018-07-27 16:24:29.914219Z] Finished run-tests step (success)
[task 2018-07-27T16:24:29.914Z] 16:24:29     INFO - Running post-run listener: _resource_record_post_run
[task 2018-07-27T16:24:29.988Z] 16:24:29     INFO - Total resource usage - Wall time: 631s; CPU: 13.0%; Read bytes: 42266624; Write bytes: 1257734144; Read time: 1172; Write time: 496476
[task 2018-07-27T16:24:29.988Z] 16:24:29     INFO - TinderboxPrint: CPU usage<br/>12.6%
[task 2018-07-27T16:24:29.988Z] 16:24:29     INFO - TinderboxPrint: I/O read bytes / time<br/>42,266,624 / 1,172
[task 2018-07-27T16:24:29.988Z] 16:24:29     INFO - TinderboxPrint: I/O write bytes / time<br/>1,257,734,144 / 496,476
[task 2018-07-27T16:24:29.989Z] 16:24:29     INFO - TinderboxPrint: CPU idle<br/>2,178.7 (87.1%)
[task 2018-07-27T16:24:29.989Z] 16:24:29     INFO - TinderboxPrint: CPU user<br/>306.5 (12.3%)
[task 2018-07-27T16:24:29.989Z] 16:24:29     INFO - TinderboxPrint: Swap in / out<br/>0 / 0
[task 2018-07-27T16:24:29.990Z] 16:24:29     INFO - verify-emulator - Wall time: 3s; CPU: 4.0%; Read bytes: 8192; Write bytes: 524582912; Read time: 8; Write time: 132492
[task 2018-07-27T16:24:29.990Z] 16:24:29     INFO - install - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2018-07-27T16:24:29.996Z] 16:24:29     INFO - run-tests - Wall time: 628s; CPU: 13.0%; Read bytes: 41439232; Write bytes: 733151232; Read time: 1164; Write time: 363984
[task 2018-07-27T16:24:30.177Z] 16:24:30     INFO - Running post-run listener: copy_logs_to_upload_dir
[task 2018-07-27T16:24:30.177Z] 16:24:30     INFO - Copying logs to upload dir...
[task 2018-07-27T16:24:30.177Z] 16:24:30     INFO - mkdir: /builds/worker/workspace/build/upload/logs
[task 2018-07-27T16:24:30.179Z] 16:24:30     INFO - Copying logs to upload dir...
[task 2018-07-27T16:24:30.181Z] 16:24:30  WARNING - returning nonzero exit status 1
Flags: needinfo?(tarek)
Pushed by tziade@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0350a76cbbc9
Track webextension activity - r=kmag
Flags: needinfo?(tarek)
https://hg.mozilla.org/mozilla-central/rev/0350a76cbbc9
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Can you please some STRs to this issue(and a test webextension if possible) or mark it as "qe-verify-" ?
Flags: needinfo?(tarek)
I don't know what's "STRs" ? also, there's a test here : toolkit/components/extensions/test/xpcshell/test_ext_performance_counters.js
Flags: needinfo?(tarek)
Flags: qe-verify-
(In reply to Tarek Ziadé (:tarek) from comment #24)
> I don't know what's "STRs" ? also, there's a test here :
> toolkit/components/extensions/test/xpcshell/test_ext_performance_counters.js

I am sorry, STR is "steps to reproduce". Thank you for adding the flag if there are tests for this.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: