Testcase resizing a canvas N times in a loop without allocating is 31x slower in Firefox, spending all the time in sweep/mark.
Categories: Core :: JavaScript: GC, defect, P2
People: Reporter: mayankleoboy1; Assigned: jonco; NeedInfo
References: Blocks 3 open bugs
Details: Keywords: perf-alert
Attachments: 5 files
Open the attached testcase, enter 500000, and click run.
Firefox: https://share.firefox.dev/4m369sA (52s)
Chrome: https://share.firefox.dev/44HQZU4 (1.7s)
Very artificial, but maybe there is something worth improving?
Not sure what the component should be. Tentatively putting under JS, and cc'ing jonco and Lee.
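The attachment itself isn't reproduced in this report, but a minimal sketch of the pattern the title describes (repeatedly resizing a canvas in a loop) might look like the following; the element, sizes, and loop body are assumptions, not the actual testcase:

```js
// Rough sketch of the reported pattern (an assumption, not the actual attachment):
// every resize discards the canvas backing store and a new one is created,
// so a large memory size change is reported to the JS engine each iteration.
const canvas = document.createElement("canvas");
const ctx = canvas.getContext("2d");
const n = 500000; // the "Enter 500000" value from the description

function run() {
  const start = performance.now();
  for (let i = 0; i < n; i++) {
    // Alternate sizes so every iteration is a real resize.
    const size = (i % 2) ? 2048 : 1024;
    canvas.width = size;
    canvas.height = size;
    // Touch the canvas so the backing store is actually materialized
    // (another assumption about what the attachment does).
    ctx.fillRect(0, 0, 1, 1);
  }
  console.log(`Done in ${(performance.now() - start).toFixed(0)} ms`);
}

run();
```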
Comment 1•26 days ago (Reporter)
Comment 2•26 days ago (Reporter)
You can also see this with the attached testcase that creates a 16K×16K canvas.
Comment 3•24 days ago (Assignee)
I think what's happening is that the loop is repeatedly allocating memory (for the canvas) and then causing it to be freed again. The size change from the allocation triggers a GC, but by the time it starts the memory has already been freed. The threshold for the next GC is then set based on almost nothing being allocated, so the next allocation triggers another GC, and so on.
Maybe we should check at the start of an allocation-triggered GC whether it is still necessary and abort if not.
This situation is pretty unlikely to happen in practice, though.
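To make the feedback loop concrete, here is a rough conceptual model in plain JavaScript; the names (zone, gcBytes, threshold, gcRequested) are illustrative stand-ins, not SpiderMonkey's actual API or fields:

```js
// Conceptual model only; numbers and names are illustrative.
const zone = {
  gcBytes: 0,        // memory currently attributed to the zone
  threshold: 32e6,   // heap size that triggers the next GC
  gcRequested: false,
};

// A large allocation (e.g. a canvas backing store) is reported to the engine.
function onMallocReported(nbytes) {
  zone.gcBytes += nbytes;
  if (zone.gcBytes >= zone.threshold) {
    zone.gcRequested = true; // the GC starts later, not immediately
  }
}

// The same memory is freed again when the canvas is resized.
function onFreeReported(nbytes) {
  zone.gcBytes -= nbytes;
}

// The requested GC is about to actually run.
function startAllocationTriggeredGC(runGC) {
  if (!zone.gcRequested) return;
  zone.gcRequested = false;
  // Proposed check: by now the canvas memory may already have been freed,
  // so skip the collection if the zone is back under its threshold.
  if (zone.gcBytes < zone.threshold) return;
  runGC();
  // Without the check above, the GC runs on an almost-empty heap and the new
  // threshold collapses, so the very next resize triggers yet another GC.
  zone.threshold = Math.max(zone.gcBytes * 1.5, 32e6);
}
```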
Comment 4•24 days ago (Reporter)
Not to derail from the original bug and analysis, but I found that if I change the parallel marking parameters, I can get a ~30% improvement on the testcase.
The testcase is the 16K×16K file.
- N=5000
- javascript.options.mem.gc_max_parallel_marking_threads = 4
- javascript.options.mem.gc_parallel_marking_threshold_mb = 1
- Profile: https://share.firefox.dev/46qG5mV (1.5s)
- Default Nightly: https://share.firefox.dev/414Qfps (2.1-2.4s)
Maybe worth rechecking whether to reduce the threshold size? (Or maybe that should wait for the pending allocator changes.)
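For anyone who wants to reproduce the comparison, here is a user.js sketch of the two pref overrides listed above; the values are the ones tried in this experiment, not a recommendation for new defaults:

```js
// user.js sketch: the pref overrides from this comment.
// Values are the ones used in this experiment, not suggested defaults.
user_pref("javascript.options.mem.gc_max_parallel_marking_threads", 4);
user_pref("javascript.options.mem.gc_parallel_marking_threshold_mb", 1);
```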
Comment 5•22 days ago (Assignee)
This splits up malloc/JIT code checks that were conflated and tidies up a bunch of other things.
JS::AddAssociatedMemory doesn't need to call Zone::maybeTriggerGCOnMalloc as addCellMemory already does this.
Comment 6•22 days ago (Assignee)
This re-checks the heap thresholds for zones in allocation-triggered GCs. This improves the test case a lot, although it still triggers a ton of GCs.
Comment 7•22 days ago (Reporter)
First testcase, N=500000:
- Nightly: https://share.firefox.dev/4lG4l99 (13s)
- So we are about 4x faster here.
Using the 16K×16K testcase with N=5000:
- Profile with latest Nightly (with fix from bug 1978751): https://share.firefox.dev/4meypsC (800ms)
- Previous profile: https://share.firefox.dev/414Qfps (2.1-2.4s)
One caveat: on this testcase, profiling with the Gecko Profiler has a lot of overhead. For instance, profiling the first testcase with the Gecko Profiler took 52s and the captured profile had empty threads for half of its duration. The magnitude of the improvement may therefore not be fully accurate, since timings from the Gecko Profiler and Samply are not comparable.
Comment 9•16 days ago (bugherder)
https://hg.mozilla.org/mozilla-central/rev/a9f62ebf3a27
https://hg.mozilla.org/mozilla-central/rev/2b2cdae238a8
https://hg.mozilla.org/mozilla-central/rev/e5fb733c7f96
Comment 10•16 days ago (Reporter)
- For the first testcase: https://share.firefox.dev/3H6l6eL (16s)
- For the second testcase, with the latest Nightly: https://share.firefox.dev/3U4jvJf (430ms)
Comment 11•15 days ago (Pulsebot)
Comment 12•15 days ago
Backed out as requested by @jonco for increasing GC time in the sp3 benchmark: https://hg.mozilla.org/integration/autoland/rev/3bf8f808fadf731c5592792643c5617c4a10137a
Comment 13•3 days ago
(In reply to Pulsebot from comment #11)
> Pushed by sstanca@mozilla.com:
> https://github.com/mozilla-firefox/firefox/commit/31008b7dd31c
> https://hg.mozilla.org/integration/autoland/rev/3bf8f808fadf
> Revert "Bug 1978224: apply code formatting via Lando" as requested by @jonco for increasing GC time in the sp3 benchmark.
Perfherder has detected a browsertime performance change from push 3bf8f808fadf731c5592792643c5617c4a10137a.
If you have any questions, please reach out to a performance sheriff. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.
Improvements:

| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 28% | speedometer3 perfstats-NonIdleMajorGC | android-hw-a55-14-0-aarch64-shippable | webrender | 690.74 -> 500.63 | |
| 25% | speedometer3 perfstats-NonIdleMajorGC | linux1804-64-nightlyasrelease-qr | fission webrender | 271.62 -> 203.85 | Before/After |
| 22% | speedometer3 perfstats-MajorGC | android-hw-a55-14-0-aarch64-shippable | webrender | 2,432.60 -> 1,891.57 | |
| 5% | speedometer3 TodoMVC-JavaScript-ES5/Adding100Items/Sync | linux1804-64-nightlyasrelease-qr | fission webrender | 62.53 -> 59.66 | Before/After |
| 4% | speedometer3 TodoMVC-JavaScript-ES5/Adding100Items/total | linux1804-64-nightlyasrelease-qr | fission webrender | 68.02 -> 65.13 | Before/After |
| ... | ... | ... | ... | ... | ... |
| 3% | speedometer3 TodoMVC-jQuery/CompletingAllItems/Sync | android-hw-a55-14-0-aarch64-shippable | webrender | 196.75 -> 191.46 | |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a performance sheriff to do that for you.
You can run all of these tests on try with ./mach try perf --alert 46111
The following documentation link provides more information about this command.