Closed Bug 1917652 Opened 2 months ago Closed 14 days ago

Make X11 GPU Process on-by-default

Categories

(Core :: Graphics: WebRender, enhancement, P3)

Desktop
Linux
enhancement

Tracking

()

RESOLVED FIXED
133 Branch
Tracking Status
firefox133 --- fixed

People

(Reporter: bradwerth, Assigned: bradwerth)

References

(Blocks 1 open bug, Regressed 2 open bugs)

Details

(Keywords: perf-alert)

Attachments

(4 files)

X11 has support for GPU Process, but it's turned off by default. Let's turn this on-by-default at least for Nightly and monitor incoming Bug reports.

Pushed by bwerth@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bc26c89fe116 Turn on GPU Process for X11 Nightly. r=aosmond

Backed out for causing build bustages related to mozbuild.preprocessor.ParseError

[task 2024-09-10T23:16:52.532Z] 23:16:52     INFO -  gmake[3]: Entering directory '/builds/worker/workspace/obj-build'
[task 2024-09-10T23:16:52.532Z] 23:16:52     INFO -  /builds/worker/.mozbuild/srcdirs/gecko-8a5b87fe5d69/_virtualenvs/build/bin/python -m mozbuild.action.file_generate /builds/worker/checkouts/gecko/js/src/GeneratePrefs.py generate_prefs_header js/public/PrefsGenerated.h js/public/.deps/PrefsGenerated.h.pp js/public/.deps/PrefsGenerated.h.stub /builds/worker/checkouts/gecko/modules/libpref/init/StaticPrefList.yaml
[task 2024-09-10T23:16:52.533Z] 23:16:52    ERROR -  Traceback (most recent call last):
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 705, in do_if
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -      e = Expression(args)
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 76, in __init__
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -      self.e = self.__get_logical_or()
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 97, in __get_logical_or
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -      rv.append(self.__get_logical_or())
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 97, in __get_logical_or
[task 2024-09-10T23:16:52.533Z] 23:16:52     INFO -      rv.append(self.__get_logical_or())
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 88, in __get_logical_or
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      rv.append(self.__get_logical_and())
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 109, in __get_logical_and
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      rv.append(self.__get_equality())
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 130, in __get_equality
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      rv.append(self.__get_unary())
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 150, in __get_unary
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      return self.__get_value()
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 179, in __get_value
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      raise Expression.ParseError(self)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -  mozbuild.preprocessor.ParseError: Unexpected content at offset 50, "(de"
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -  During handling of the above exception, another exception occurred:
[task 2024-09-10T23:16:52.535Z] 23:16:52    ERROR -  Traceback (most recent call last):
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/fetches/python/lib/python3.8/runpy.py", line 194, in _run_module_as_main
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      return _run_code(code, main_globals, None,
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/fetches/python/lib/python3.8/runpy.py", line 87, in _run_code
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      exec(code, run_globals)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/action/file_generate.py", line 154, in <module>
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      sys.exit(main(sys.argv[1:]))
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/action/file_generate.py", line 98, in main
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      ret = module.__dict__[method](
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/js/src/GeneratePrefs.py", line 73, in generate_prefs_header
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      prefs = load_yaml(yaml_path)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/js/src/GeneratePrefs.py", line 44, in load_yaml
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      pp.do_include(yaml_path)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 904, in do_include
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      self.handleLine(l)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 654, in handleLine
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      cmd(args)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -    File "/builds/worker/checkouts/gecko/python/mozbuild/mozbuild/preprocessor.py", line 709, in do_if
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -      raise Preprocessor.Error(self, "SYNTAX_ERR", args)
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -  mozbuild.preprocessor.Error: ('$SRCDIR/modules/libpref/init/StaticPrefList.yaml', 8537, 'SYNTAX_ERR', 'defined(XP_WIN) || defined(MOZ_WIDGET_ANDROID) || (defined(MOZ_X11) && defined(NIGHTLY_BUILD))')
[task 2024-09-10T23:16:52.535Z] 23:16:52    ERROR -  gmake[3]: *** [backend.mk:156: js/public/.deps/PrefsGenerated.h.stub] Error 1
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -  gmake[3]: Leaving directory '/builds/worker/workspace/obj-build'
[task 2024-09-10T23:16:52.535Z] 23:16:52     INFO -  gmake[3]: Entering directory '/builds/worker/workspace/obj-build'
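For readers following the log: the offset in the ParseError can be checked against the expression quoted in the final SYNTAX_ERR line. The sketch below is plain Python and does not call mozbuild. It shows that offset 50 is the opening parenthesis of the grouped X11 clause, which suggests the preprocessor's expression grammar does not accept grouping parentheses. The second half rests on an assumption drawn from the traceback, where `__get_logical_or` delegates to `__get_logical_and`: if `&&` binds tighter than `||` there, as `and` and `or` do in Python, the parentheses would be redundant. Python's own operators are used only as a stand-in for that check; the helper names `grouped` and `ungrouped` are hypothetical.

```python
from itertools import product

# Expression copied verbatim from the SYNTAX_ERR line in the log above.
expr = (
    "defined(XP_WIN) || defined(MOZ_WIDGET_ANDROID) || "
    "(defined(MOZ_X11) && defined(NIGHTLY_BUILD))"
)

# The ParseError reports: Unexpected content at offset 50, "(de"
offset = 50
print(repr(expr[offset:offset + 3]))  # prints '(de'


def grouped(win, android, x11, nightly):
    # The condition as written in the failing patch, with explicit grouping.
    return win or android or (x11 and nightly)


def ungrouped(win, android, x11, nightly):
    # The same condition without parentheses, relying on and-before-or precedence.
    return win or android or x11 and nightly


# Compare both forms over all 16 combinations of the four flags.
same = all(
    grouped(*flags) == ungrouped(*flags)
    for flags in product([False, True], repeat=4)
)
print(same)  # prints True
```

This says nothing about how the patch was actually corrected before relanding; it only reproduces the reported offset and checks the precedence argument under the stated assumption.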
Flags: needinfo?(bwerth)
Pushed by bwerth@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3743627d5eff Turn on GPU Process for X11 Nightly. r=aosmond

I've got a solution for this, but I'm going to base it off of Bug 1919114, which adds a testing-variant that this Bug should update.

Attachment #9423641 - Attachment description: Bug 1917652: Turn on GPU Process for X11 Nightly. → Bug 1917652 Part 1: Turn on GPU Process for X11 Nightly.

This change prevents VideoBridgeParent from being incorrectly reported
as leaked memory. Something about the logging used by the previous macro
confuses the leakchecker. Appropriate deallocation was confirmed by
creating manual implementations of AddRef and Release and the various
Actor lifecycle methods (not part of this patch). The change to this new
macro does the same work as the old macro, minus the conditional logging
of the old macro.
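A toy model of the claim in this description, written in Python rather than the C++ the patch touches. It only illustrates the idea that a logging and a non-logging refcount implementation do the same counting and reach destruction at the same point. All names here (`RefCounted`, `LoggingRefCounted`, `add_ref`, `release`, `exercise`) are hypothetical and are not Gecko's macros or classes.

```python
class RefCounted:
    """Plain refcounting: count up, count down, destroy at zero."""

    def __init__(self):
        self.refcnt = 0
        self.destroyed = False

    def add_ref(self):
        self.refcnt += 1
        return self.refcnt

    def release(self):
        assert self.refcnt > 0, "release without matching add_ref"
        self.refcnt -= 1
        if self.refcnt == 0:
            self.destroyed = True  # stands in for running the destructor
        return self.refcnt


class LoggingRefCounted(RefCounted):
    """Same counting, plus an optional log of every count change."""

    def __init__(self, log=None):
        super().__init__()
        self.log = log

    def add_ref(self):
        count = super().add_ref()
        if self.log is not None:
            self.log.append(("AddRef", count))
        return count

    def release(self):
        count = super().release()
        if self.log is not None:
            self.log.append(("Release", count))
        return count


def exercise(obj):
    # Two references taken and both dropped: the object must end up destroyed.
    obj.add_ref()
    obj.add_ref()
    obj.release()
    obj.release()
    return obj.destroyed


log = []
print(exercise(RefCounted()), exercise(LoggingRefCounted(log)))  # prints True True
print(log)
```

In this model the only observable difference between the two classes is the log, which matches the description's statement that the new macro does the same work minus the conditional logging.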

This allows us to compare Linux test results with and without a gpu process.

Attachment #9425711 - Attachment description: Bug 1917652 Part 2: Use non-loggin refcounting macros in VideoBridgeParent. → Bug 1917652 Part 2: Use non-logging refcounting macros in VideoBridgeParent.

I'm struggling to get blocking Bug 1919114 landed. I'll return to this once that is resolved.

Flags: needinfo?(bwerth)
Attachment #9425711 - Attachment description: Bug 1917652 Part 2: Use non-logging refcounting macros in VideoBridgeParent. → Bug 1917652 Part 2: Explicitly define AddRef and Release in VideoBridgeParent.
Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6d6a04bb5979
Part 1: Turn on GPU Process for X11 Nightly. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/4bab31597a04
Part 2: Explicitly define AddRef and Release in VideoBridgeParent. r=aosmond
Blocks: 1924155
Pushed by bwerth@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fd5c7d51e623 Part 3: Enable the no-gpu-process test variant for Linux builds. r=aosmond,taskgraph-reviewers,jmaher

:bradwerth, could you consider nominating this for a release note? (Process info)
We could include it in the nightly only release notes.

Flags: needinfo?(bwerth)

Potential regressions (alerts 2434 and 2435), Linux only, on tart/tsvgx/session-restore/others (waiting for confirmation from sheriffs).

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 133 Branch → ---
Flags: needinfo?(bwerth)
Flags: needinfo?(bwerth)

The message timings that were modified in this test are sent by the
GPUProcessManager, and this test was using WIN and !WIN as a proxy for
whether or not a GPU process was in place. The modifications here make
that more explicit, although they remain approximate. Anything that is
conditioned on GPUPROCESS needs to "ignoreIfUnused" because those
markers will be sent or not sent depending on the presence of the GPU
process, which has complex controls not completely covered by the logic
in this test.
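A minimal sketch of the expectation logic described above, in Python rather than the JavaScript of the actual test. The field names (`condition`, `max_count`, `ignore_if_unused`), the `check_sync_ipc` and `expectations` helpers, and the boolean `gpu_process` flag are assumptions made for illustration; only the marker name comes from this bug's failure log. The point it tries to show: an entry conditioned on the GPU process must tolerate going unused, because the condition is only an approximation of whether a GPU process is really running.

```python
from collections import Counter


def check_sync_ipc(expected, observed):
    """Return a list of failure messages for one startup phase."""
    # Only entries whose condition holds are allowed on this configuration.
    allowed = {e["name"]: e for e in expected if e["condition"]}
    counts = Counter(observed)
    failures = []
    for name, count in counts.items():
        entry = allowed.get(name)
        if entry is None:
            failures.append(f"unexpected {name} sync IPC")
        elif count > entry["max_count"]:
            failures.append(f"{name} seen {count} times, max {entry['max_count']}")
    for name, entry in allowed.items():
        # An allowed entry that never fires is a failure unless it opts out.
        if counts[name] == 0 and not entry["ignore_if_unused"]:
            failures.append(f"unused expectation {name}")
    return failures


def expectations(gpu_process):
    # gpu_process stands in for a GPUPROCESS-style condition (hypothetical).
    return [
        {
            "name": "PCompositorBridge::Msg_NotifyChildCreated",
            "condition": gpu_process,
            "max_count": 1,
            "ignore_if_unused": True,
        },
    ]


marker = "PCompositorBridge::Msg_NotifyChildCreated"
print(check_sync_ipc(expectations(True), [marker]))   # prints []
print(check_sync_ipc(expectations(True), []))         # prints []
print(check_sync_ipc(expectations(False), [marker]))  # one "unexpected" failure
```

The third call mirrors the shape of the failure reported in this bug: the message is observed on a configuration whose expectations do not allow it.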

Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/41ea6ed2f0f6
Part 1: Turn on GPU Process for X11 Nightly. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/8fa91b1a76c3
Part 2: Explicitly define AddRef and Release in VideoBridgeParent. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/d21ca0c9e80a
Part 3: Enable the no-gpu-process test variant for Linux builds. r=aosmond,taskgraph-reviewers,jmaher
https://hg.mozilla.org/integration/autoland/rev/8002391fe2ea
Part 4: Update IPC timing expectations for gpu process platforms. r=mconley

(In reply to Pulsebot from comment #10)

Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6d6a04bb5979
Part 1: Turn on GPU Process for X11 Nightly. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/4bab31597a04
Part 2: Explicitly define AddRef and Release in VideoBridgeParent. r=aosmond

Perfherder has detected a browsertime performance change from push 4bab31597a049cd50392ac9e683deee298ed686e.

Improvements:

Ratio | Test | Platform | Options | Absolute values (old vs new)
15% | motionmark-htmlsuite-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 3,383.83 -> 2,867.83
15% | motionmark-htmlsuite-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 3,436.62 -> 2,922.00
13% | motionmark-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 3,394.00 -> 2,951.50

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2518

For more information on performance sheriffing please see our FAQ.

Keywords: perf-alert

(In reply to Sandor Molnar[:smolnar] from comment #16)

Backed out for causing perma bc failures @ browser_startup_syncIPC.js

Backout link: https://hg.mozilla.org/integration/autoland/rev/f65cc2fb970b08f26b3229a0a62c37576407ff7a

Backout merge to central: https://hg.mozilla.org/mozilla-central/rev/fa1fe4195c4c98ed4541164cae171c52f30c58fe

Push with failures

Failure log -> TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_syncIPC.js

Perfherder has detected a browsertime performance change from push f65cc2fb970b08f26b3229a0a62c37576407ff7a.

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles
21% | motionmark-htmlsuite-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 2,838.76 -> 3,438.08 | Before/After
19% | motionmark-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 2,863.92 -> 3,415.67 |

As author of one of the patches included in that push, we need your help to address this regression.
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2458

For more information on performance sheriffing please see our FAQ.

Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b5bf70fa21e7
Part 1: Turn on GPU Process for X11 Nightly. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/c8e8360fd856
Part 2: Explicitly define AddRef and Release in VideoBridgeParent. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/861c66502227
Part 3: Enable the no-gpu-process test variant for Linux builds. r=aosmond,taskgraph-reviewers,jmaher
https://hg.mozilla.org/integration/autoland/rev/483c42666d63
Part 4: Update IPC timing expectations for gpu process platforms. r=mconley,taskgraph-reviewers,jmaher
See Also: → 1925837
Regressions: 1925896

Leaving a note here: ForkServer-by-default was backed out in https://bugzilla.mozilla.org/show_bug.cgi?id=1874689#c59 because of the handling of MOZ_SANDBOXED and the fact that the GPU process is not sandboxed. Since we enable ForkServer by default in the Debian packages, this change has probably broken the Debian Nightly packages on X11.

Looks like this is going to stick this time.

Flags: needinfo?(bwerth)
See Also: → 1870427
Regressions: 1926547
Regressions: 1927058

(In reply to Pulsebot from comment #19)

Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/41ea6ed2f0f6
Part 1: Turn on GPU Process for X11 Nightly. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/8fa91b1a76c3
Part 2: Explicitly define AddRef and Release in VideoBridgeParent. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/d21ca0c9e80a
Part 3: Enable the no-gpu-process test variant for Linux builds.
r=aosmond,taskgraph-reviewers,jmaher
https://hg.mozilla.org/integration/autoland/rev/8002391fe2ea
Part 4: Update IPC timing expectations for gpu process platforms. r=mconley

Perfherder has detected a browsertime performance change from push 8002391fe2ead2d0bcddd115d0f4ca94b6963759.

Improvements:

Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles
5% | speedometer3 Perf-Dashboard/Render/Async | linux1804-64-shippable-qr | fission webrender | 9.11 -> 8.69 | Before/After
4% | speedometer3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 617.34 -> 591.03 | Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2580

For more information on performance sheriffing please see our FAQ.

(In reply to Sandor Molnar[:smolnar] from comment #20)

Backed out for causing bc failures @ browser_startup_syncIPC.js

Backout link: https://hg.mozilla.org/integration/autoland/rev/a0591f59d36030f1c86f30d4a9968c42230ae18b

Push with failures

Failure log -> TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_syncIPC.js | unexpected PCompositorBridge::Msg_NotifyChildCreated sync IPC before first paint

Perfherder has detected a browsertime performance change from push a0591f59d36030f1c86f30d4a9968c42230ae18b.

Regressions:

Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles
10% | expedia LastVisualChange | macosx1015-64-shippable-qr | fission warm webrender | 2,048.52 -> 2,245.23 | Before/After
7% | speedometer EmberJS-Debug-TodoMVC/CompletingAllItems/Sync | linux1804-64-nightlyasrelease-qr | fission webrender | 221.68 -> 236.19 |
6% | speedometer EmberJS-Debug-TodoMVC/CompletingAllItems | linux1804-64-nightlyasrelease-qr | fission webrender | 224.78 -> 239.28 |
5% | speedometer EmberJS-Debug-TodoMVC/DeletingItems | linux1804-64-nightlyasrelease-qr | fission webrender | 236.11 -> 247.96 |
5% | speedometer EmberJS-Debug-TodoMVC | linux1804-64-nightlyasrelease-qr | fission webrender | 798.60 -> 837.33 |
5% | speedometer EmberJS-Debug-TodoMVC/DeletingItems/Sync | linux1804-64-nightlyasrelease-qr | fission webrender | 235.18 -> 246.01 |
4% | speedometer3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 590.53 -> 616.57 | Before/After

Improvements:

Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles
7% | speedometer3 Charts-observable-plot/Dotted/Sync | windows11-64-nightlyasrelease-qr | fission webrender | 8.16 -> 7.55 | Before/After
5% | speedometer jQuery-TodoMVC | linux1804-64-nightlyasrelease-qr | fission webrender | 371.91 -> 353.32 |
5% | speedometer3 Charts-observable-plot/Dotted/total | windows11-64-nightlyasrelease-qr | fission webrender | 13.11 -> 12.47 | Before/After
4% | speedometer3 TodoMVC-jQuery/total | windows11-64-nightlyasrelease-qr | fission webrender | 152.46 -> 145.85 | Before/After
4% | speedometer3 TodoMVC-jQuery/CompletingAllItems/Sync | windows11-64-nightlyasrelease-qr | fission webrender | 70.99 -> 68.00 | Before/After
... | ... | ... | ... | ... | ...
2% | speedometer3 Charts-observable-plot/total | windows11-64-nightlyasrelease-qr | fission webrender | 55.47 -> 54.35 | Before/After

As author of one of the patches included in that push, we need your help to address this regression.
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2594

For more information on performance sheriffing please see our FAQ.

(In reply to Pulsebot from comment #23)

Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b5bf70fa21e7
Part 1: Turn on GPU Process for X11 Nightly. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/c8e8360fd856
Part 2: Explicitly define AddRef and Release in VideoBridgeParent. r=aosmond
https://hg.mozilla.org/integration/autoland/rev/861c66502227
Part 3: Enable the no-gpu-process test variant for Linux builds.
r=aosmond,taskgraph-reviewers,jmaher
https://hg.mozilla.org/integration/autoland/rev/483c42666d63
Part 4: Update IPC timing expectations for gpu process platforms.
r=mconley,taskgraph-reviewers,jmaher

Perfherder has detected a browsertime performance change from push 483c42666d63d12055535e2cb5e22ca3684a0019.

Improvements:

Ratio | Test | Platform | Options | Absolute values (old vs new)
17% | motionmark-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 3,388.33 -> 2,806.58
16% | motionmark-htmlsuite-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 3,380.25 -> 2,852.33
15% | motionmark-1-3 cpuTimePageload | linux1804-64-shippable-qr | fission webrender | 3,380.88 -> 2,871.50

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2758

For more information on performance sheriffing please see our FAQ.

Hardware: Unspecified → Desktop