Closed Bug 1824726 Opened 2 years ago Closed 8 months ago

Throttle Resources events on the server side

Categories

(DevTools :: Framework, enhancement)

enhancement

Tracking

(firefox129 fixed)

RESOLVED FIXED
129 Branch
Tracking Status
firefox129 --- fixed

People

(Reporter: ochameau, Assigned: ochameau)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 1 obsolete file)

Bug 1663614 introduced client side throttling for all resources, but it would be even better to throttle resources directly from the server.

Bug 1824724 proved that it can drastically improve performance by reducing the overhead introduced by DevTools Transport classes and JSWindow Actor.
Emitting less and larger RDP packets is overall more efficient than emitting more and smaller ones.

Also, once we introduce server side throttling, the client side one may become irrelevant and introduce now-unwanted extra delay.

This helps accumulate RDP packets related to resources being available/updated/destroyed
and emit less and larger RDP packets instead or more and smaller.
This appears to reduce the overhead of DevTools transport and JSWindowActor layers.

The issue in browser_markup_events_toggle.js is interesting.
It highlights that a resource notified before a call to WatcherActor.watchResources,
may be emitted during the call to watchResources.
This makes ignoreExistingResources flag a bit brittle as that resource should be
considered as an existing/past one.
We should probably flush past resource and probably introduce a more explicit
way of handling "existing" resources on the server side.

The current revision is green if you ignore intermittent, which may relate to this patch :/

DAMP reports significant improvements, especially in console test in content process:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=95a68c34e391a0f4a7c9a2c82d5b59c806f38949&originalSignature=4081483&newSignature=4081483&framework=12&application=&originalRevision=9921e561972fa235db82d34767231fec9f801feb&page=1&showOnlyConfident=1
But there is also suspicious regressions. This may relate to the quite high throttle delay, set to 250ms in this revision, while the ResourceCommand one used to be set to 100ms.

Using the same throttle delay as in client (100ms) leads to mostly improvements:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=642317d62bf0548acb2f06feef283ff9d07048c3&originalSignature=4081483&newSignature=4081483&framework=12&application=&originalRevision=9921e561972fa235db82d34767231fec9f801feb&page=1&showOnlyConfident=1
Improvements are still very high on all content process recordings done for the console.
But there is also a notable 17% improvement on the very intense custom.webconsole.reload test.

Looking at these two DAMP records, it looks like the inspector is impacted by a throttle delay.
It is still reported as being slower with 100ms delay. But it was really significantly worse with 250ms delay.
It may be worth investigating why inspector is negatively impacted by the throttling. Does it have some sort of redundant throttling done somewhere?

Duplicate of this bug: 1760962
Blocks: 1873065

This helps accumulate RDP packets related to resources being available/updated/destroyed
and emit less and larger RDP packets instead or more and smaller.
This appears to reduce the overhead of DevTools transport and JSWindowActor layers.

The issue in browser_markup_events_toggle.js is interesting.
It highlights that a resource notified before a call to WatcherActor.watchResources,
may be emitted during the call to watchResources.
This makes ignoreExistingResources flag a bit brittle as that resource should be
considered as an existing/past one.
We should probably flush past resource and probably introduce a more explicit
way of handling "existing" resources on the server side.

The fixes in stylesheets-managers.js relates to failures in browser_styleeditor_at_rules_sidebar.js.
StyleSheetManager.startWatching wasn't waiting for async "applicable-stylesheet-added" listeners
set by the StyleSheetWatcher. This is important in order to ensure emitting
all existing stylesheets resources before its watch method resolves.

Similarly to the client side, DOCUMENT_EVENT's will-navigate is special
and has to be emitted synchronously in order to prevent clearing things out of order.
This is also true for NETWORK_EVENT_STACKTRACE which is expected to be received
before related NETWORK_EVENT. NETWORK_EVENT_STACKTRACE is fired from the content
process while NETWORK_EVENT is fired from the parent process.
For now, it is easier to synchronously emit those resources rather than trying
to do cross process coordination.
We may revisit this choice once we start doing throttling from the parent process
and may be once D143409 lands.

About browser_resources_clear_resources.js, it is surprising it wasn't already failing.

Assignee: nobody → poirot.alex
Status: NEW → ASSIGNED
Attachment #9325628 - Attachment is obsolete: true
Pushed by apoirot@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/242972a790a3 [devtools] Throttle all resources from the server side. r=devtools-reviewers,nchevobbe,perftest-reviewers,sparky
Regressions: 1879364

I'm struggling with JS Window Actor being destroyed too abruptly and preventing sending throttled resources when the WindowGlobal is getting destroyed on navigation.

I'll try to get bug 1866814 which should help as we would no longer use JSWindowActor and rather use JS Process actor.

Depends on: 1648499, 1866814
Flags: needinfo?(poirot.alex)

There is an r+ patch which didn't land and no activity in this bug for 2 weeks.
:ochameau, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit BugBot documentation.

Flags: needinfo?(poirot.alex)
Flags: needinfo?(gmierz2)
Flags: needinfo?(gmierz2)
Flags: needinfo?(poirot.alex)
Pushed by apoirot@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6f7a33d823af [devtools] Throttle all resources from the server side. r=devtools-reviewers,nchevobbe,perftest-reviewers,sparky

Updated DAMP results:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=44fec98db6806b274595c0519466f81c0eecd198&originalSignature=4763542&newSignature=4763542&framework=12&application=firefox&originalRevision=3488076c5d10db79d05ade43872aa401c353b10a&page=1&showOnlyConfident=1
Between 14 and 60% improvements on all log-in-loop tests. Meaning, the web page will be significantly more responsive when using console.log while DevTools are opened!

Many other tests are also improved, probably related to less UI/React Updates, but that may also be related to lower GC overhead.
browser-toolbox.debugger-ready is 15% faster and custom.jsdebugger.reload 11% faster.

Regressions: 1899280
Flags: needinfo?(poirot.alex)
Depends on: 1901643
Pushed by apoirot@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/1443c909d1a7 [devtools] Throttle all resources from the server side. r=devtools-reviewers,nchevobbe,perftest-reviewers,sparky
Regressions: 1901674
Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → 129 Branch
Regressions: 1901788
Regressions: 1902701
Regressions: 1902857

Strong evidence (but not yet confirmed) that this lead to large improvements in lots of tests
https://treeherder.mozilla.org/perfherder/alerts?id=699
https://treeherder.mozilla.org/perfherder/alerts?id=687&hideDwnToInv=0

Perfherder has detected a devtools performance change from push b72da580cf2a9dcba1c0e7efe78b67fe471d8547.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
14000% reload-webconsole:parent-process objects-with-stacks linux1804-64-shippable-qr 0.25 -> 35.25
7050% reload-webconsole:parent-process objects-with-stacks linux1804-64-qr 0.50 -> 35.75
30% damp custom.netmonitor.reload.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender 188.97 -> 245.79
29% damp custom.netmonitor.reload.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender-sw 189.13 -> 243.45
19% damp simple.webconsole.close.DAMP linux1804-64-shippable-qr e10s fission stylo webrender-sw 11.17 -> 13.26
18% damp simple.webconsole.close.DAMP linux1804-64-shippable-qr e10s fission stylo webrender 10.22 -> 12.05
13% damp custom.netmonitor.reload.DAMP windows10-64-shippable-qr e10s fission stylo webrender 328.01 -> 371.40
12% damp custom.netmonitor.reload.DAMP windows10-64-shippable-qr e10s fission stylo webrender-sw 328.89 -> 369.76
11% damp custom.inspector.reload.DAMP windows10-64-shippable-qr e10s fission stylo webrender-sw 414.76 -> 461.12
11% damp custom.inspector.reload.DAMP windows10-64-shippable-qr e10s fission stylo webrender 420.19 -> 466.95
... ... ... ... ...
3% damp browser-toolbox.webconsole-ready.DAMP windows10-64-shippable-qr e10s fission stylo webrender-sw 559.81 -> 576.13
3% damp browser-toolbox.webconsole-ready.DAMP windows10-64-shippable-qr e10s fission stylo webrender 566.60 -> 581.80
3% damp browser-toolbox.styleeditor-ready.DAMP windows10-64-shippable-qr e10s fission stylo webrender 456.04 -> 468.13
2% damp custom.jsdebugger.open-large-minified-file.full-selection.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender 4,001.34 -> 4,094.99
2% damp custom.jsdebugger.open-large-minified-file.full-selection.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender-sw 3,998.65 -> 4,089.46

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
75% damp console.log-in-loop-content-process-string windows10-64-shippable-qr e10s fission stylo webrender-sw 66.79 -> 16.65
74% damp simple.inspector.reload.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender-sw 223.22 -> 57.47
72% damp console.log-in-loop-content-process-node windows10-64-shippable-qr e10s fission stylo webrender-sw 37.43 -> 10.38
72% damp simple.inspector.reload.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender 196.51 -> 55.92
71% damp console.log-in-loop-content-process-undefined windows10-64-shippable-qr e10s fission stylo webrender-sw 36.31 -> 10.69
... ... ... ... ...
2% damp custom.jsdebugger.stepOver.DAMP linux1804-64-shippable-qr e10s fission stylo webrender-sw 506.64 -> 495.83

As author of one of the patches included in that push, we need your help to address this regression.
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 687

For more information on performance sheriffing please see our FAQ.

I tried to investigate the reload-webconsole:parent-process objects-with-stacks, but couldn't reproduce locally.
I always get zero leak in the parent process.

Looking at graphs, this leak seems to be intermittent. The value changes.
https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&replicates=0&series=autoland,3987816,1,12&series=mozilla-central,3989074,1,12&timerange=1209600

It's surprising as there is a 1s delay at the end of the test, which should ideally avoid race conditions:
https://searchfox.org/mozilla-central/rev/30e593b57fe1e71f5b4f50d1fa7c7dab3a9cac0b/devtools/client/framework/test/allocations/reload-test.js#25-43

Regressions: 1904154
See Also: → 1879454
Blocks: 1914386
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: