Closed Bug 1774458 Opened 6 months ago Closed 5 months ago

29.08 - 2.05% cross_origin_pageload / twinopen ext+twinopen:twinopen.html + 26 more (OSX) regression on Tue June 7 2022

Categories

(Core :: Memory Allocator, defect)

defect

Tracking

()

RESOLVED FIXED
104 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox101 --- unaffected
firefox102 --- unaffected
firefox103 --- fixed
firefox104 --- fixed

People

(Reporter: andrej, Assigned: gsvelto, NeedInfo)

References

(Regression)

Details

(4 keywords)

Attachments

(1 file)

Perfherder has detected a talos performance regression from push c8a0acdbee4e63dc3e7076aadd16a2280081ba35. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
29% cross_origin_pageload macosx1015-64-shippable-qr e10s fission stylo webrender 194.91 -> 251.58
27% cross_origin_pageload macosx1015-64-shippable-qr e10s fission stylo webrender-sw 189.45 -> 240.97
22% cross_origin_pageload macosx1015-64-shippable-qr e10s fission stylo webrender-sw 195.53 -> 237.99
19% tp5o_scroll macosx1015-64-shippable-qr e10s fission stylo webrender 1.56 -> 1.85
16% tresize macosx1015-64-shippable-qr e10s fission stylo webrender-sw 5.57 -> 6.47
16% tresize macosx1015-64-shippable-qr e10s fission stylo webrender 5.19 -> 6.00
15% tart macosx1015-64-shippable-qr e10s fission stylo webrender-sw 2.05 -> 2.35
14% tart macosx1015-64-shippable-qr e10s fission stylo webrender 2.14 -> 2.44
13% tp5o_scroll macosx1015-64-shippable-qr e10s fission stylo webrender-sw 1.56 -> 1.76
13% tresize macosx1015-64-shippable-qr e10s fission stylo webrender 5.32 -> 5.99
... ... ... ... ...
3% ts_paint macosx1015-64-shippable-qr e10s fission stylo webrender 1,020.58 -> 1,051.17
3% sessionrestore_many_windows macosx1015-64-shippable-qr e10s fission stylo webrender 1,053.75 -> 1,084.50
2% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender 578.26 -> 591.40
2% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender-sw 576.91 -> 588.99
2% twinopen ext+twinopen:twinopen.html macosx1015-64-shippable-qr e10s fission stylo webrender 103.45 -> 105.56

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(gsvelto)

Set release status flags based on info from the regressing bug 1670885

There are no good solutions here unfortunately. os_unfair_lock is an non-spinning lock (i.e. it goes straight into the kernel upon contention) while OSSpinLock did spin in user-space and manually yield()ed. Given that the latter is deprecated I strongly suspect that Apple prefers the former's behavior both for power and because it's less prone to live-locks. In fact I'd be curious to know how it behaves on ARM hardware.

This aside we need to claw back the performance we have lost because of this change - with the worst case being to back out the change if I can't solve it quickly. There are no non-deprecated spinlocks however there is an undocumented function called os_unfair_lock_with_options() which can be fed options that could be useful. One of these is OS_UNFAIR_LOCK_ADAPTIVE_SPIN which forces spinning in the kernel if the owning thread is on the same (multi-threaded) CPU. I suspect this would help. This is used by POSIX mutexes in darwin-libpthread and it's bound to a kernel interface so I think it would be fairly stable (or at least we'd be able to detect quickly if it gets broken). Another option is OS_UNFAIR_LOCK_DATA_SYNCHRONIZATION and is meant for locks guarding data - such as in our case. This one doesn't affect the lock behavior per se but rather informs Apple's thread pool machinery not to spawn worker threads if one gets stuck waiting for a lock. I'm unsure if this is needed but it might also be worth a try. Mike, Paul, what do you think? The alternative is doing our own spinning in user-space but that makes me grind my teeth for the potential power and soft-deadlocking implications.

Flags: needinfo?(pbone)
Flags: needinfo?(mh+mozilla)

I was curious about which locks are causing these regressions, so I triggered some jobs with profiling, on these pushes:

Before fix: https://treeherder.mozilla.org/jobs?repo=autoland&revision=d99e6d440d02aa9fdef08aa648191dcefa21baca&group_state=expanded&selectedTaskRun=CXXI2FsiTfSpKMSM7Eb9QQ.0
After fix: https://treeherder.mozilla.org/jobs?repo=autoland&revision=c8a0acdbee4e63dc3e7076aadd16a2280081ba35&group_state=expanded&selectedTaskRun=Wu4qBU91TbuUJgZevWKYWA.0

I was most interested in the cross_origin_pageload profiles but those are unfortunately not usable (bug 1774627).

So I looked at tp5o_scroll. Here's an example profile from the most regressed tp5o_scroll subtest (bild.de): https://share.firefox.dev/3xWWtd2
There aren't a lot of samples, but 10% of the samples in the content process are in the lock that's grabbed in TimerThread::FindNextFireTimeForCurrentThread called via MainThreadIdlePeriod::GetIdlePeriodHint.

(In reply to Markus Stange [:mstange] from comment #3)

There aren't a lot of samples, but 10% of the samples in the content process are in the lock that's grabbed in TimerThread::FindNextFireTimeForCurrentThread called via MainThreadIdlePeriod::GetIdlePeriodHint.

Jeff informed me that this is not the type of lock that got changed. It's mostly just the allocator that's different.

I'm a bit confused - a regression of 20% would imply that we spend at least 20% of the test in the allocator. That can't be true, can it?

Using os_unfair_lock_lock_with_options() with the OS_UNFAIR_LOCK_DATA_SYNCHRONIZATION and OS_UNFAIR_LOCK_ADAPTIVE_SPIN options claws back most of the performance loss from this change. In particular I tried the test that regressed the most and I get this when comparing with plain os_unfair_lock_t and no special locking:

https://treeherder.mozilla.org/perfherder/compare?originalProject=mozilla-central&originalRevision=213305a52f15418a036fd6ef8cc12231ad481690&newProject=try&newRevision=2c5db86ea672b7d5b90d3bf79b1f02eb1532568c&framework=1&page=1&showOnlyComparable=1

And this when comparing with OSSpinLock:

https://treeherder.mozilla.org/perfherder/compare?originalProject=mozilla-central&originalRevision=533ad0ead2345d753cfee67be9d044c10ecce710&newProject=try&newRevision=043b2e816c0c0fdcc47e6ade229c4c465015a5d9&framework=1&page=1&showOnlyComparable=1

As you can see the regressions are down to 1-2% tops which I think is perfectly acceptable considering that the new locks will behave a lot better under heavy load compared to OSSpinLock and our tests are running on unloaded machines. Additionally all these performance comparisons are on x86 hardware and we've seen that OSSpinLock tends to perform worse on ARM-based macs. So chances are that this change will benefit ARM more than it degrades x86 (if someone has access to an ARM-based mac and can give my patch a perf spin I'd be grateful).

Paul, Mike, unless you have other concerns I'd ask for review for this change. While the function and options I'm using are non-public and undocumented they're used internally by Apple's POSIX threads library, additionally the flags match two kernel-level corresponding flags so I don't see Apple breaking this particular interface anytime soon.

Flags: needinfo?(gsvelto)
See Also: → 1774712
Assignee: nobody → gsvelto
Attachment #9281763 - Attachment description: WIP: Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS → Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS r=glandium
Status: NEW → ASSIGNED

I have very little to add that's not already been said as I'm still becomming familiar with MacOS.

I did once roll my own spinlocks for another project, and they were probably bad and very wasteful of power. My environment was a bit different because I also had set CPU afinity and was the only CPU-heavy task on the system so I was reasonably certain that the thread holding a lock would be about to let it go very soon.

20% sounds like a lot for "allocator time", but I don't have an informed guess to counter with. I took a look at the profile trying to find locks in the allocator and didn't find them near 20% of the runtime. Am I missing something / looking in the wrong profile?

I think this is down the scheduling issues rather than lock slowness. With my patch applied the problem essentially goes away which is a strong indication that the issue was in the kernel implementation. The flags that I pass there only affect kernel behavior and do nothing in userspace.

(In reply to Gabriele Svelto [:gsvelto] from comment #8)

I think this is down the scheduling issues rather than lock slowness. With my patch applied the problem essentially goes away which is a strong indication that the issue was in the kernel implementation. The flags that I pass there only affect kernel behavior and do nothing in userspace.

Yeah, I was trying to figure this out: why changing this lock which hardly shows up in profiles can make a 20% difference?! I think it's providing some "hint" (a bad one) to the kernel about scheduling. I think we've come to the same conclusion.

Flags: needinfo?(pbone)

Gabriele, I tested your new patch on Mac M1. I tested only the cross_origin_pageload test as it regressed the worst.

Before Bug 1670885: 156.3ms (tested by applying the reverse patch to m-c, the comparison to the other numbers here is fair).
Mozilla Central: 203.18ms
With patch: 170.74ms

So it's still slower than before Bug 1670885 but better than weithout any spinlocks.

(In reply to Paul Bone [:pbone] from comment #10)

Gabriele, I tested your new patch on Mac M1. I tested only the cross_origin_pageload test as it regressed the worst.

Before Bug 1670885: 156.3ms (tested by applying the reverse patch to m-c, the comparison to the other numbers here is fair).
Mozilla Central: 203.18ms
With patch: 170.74ms

So it's still slower than before Bug 1670885 but better than weithout any spinlocks.

Thanks, this is very useful! I'm OK with taking a small regression here because I assume that you tested on an unloaded machine. The new locks should behave a lot better than the old ones when the machine is subjected to some load and I think that's an acceptable trade-off giving the old ones could live-lock with enough activity. In fact I would be very grateful if you could run the test while running something that keeps the machine under load while the test is running.

:gsvelto just a reminder that Monday, June 27th is Merge Day for 103 to Beta. If this will land before then?

Flags: needinfo?(gsvelto)

Sorry, I thought I had already submitted it for landing but apparently I did not! Landing it now.

Flags: needinfo?(gsvelto)
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ecea89cb3d35
Use undocumented, non-public adaptive spinlocks on macOS r=pbone

Backed out changeset ecea89cb3d35 (Bug 1774458) as requested by glandium for crashing on macOS 10.14 and older.
Backout link

Flags: needinfo?(gsvelto)

(In reply to Gabriele Svelto [:gsvelto] from comment #11)

I assume that you tested on an unloaded machine.

Yes, I didn't disable any background services but I also wasn't otherwise using the computer.

The new locks should behave a lot better than the old ones when the machine is subjected to some load and I think that's an acceptable trade-off giving the old ones could live-lock with enough activity. In fact I would be very grateful if you could run the test while running something that keeps the machine under load while the test is running.

I'll see what I can do. I assume some background multicore workload like compiling Firefox would be suitable?

Attachment #9281763 - Attachment description: Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS r=glandium → WIP: Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS r=glandium
Attachment #9281763 - Attachment description: WIP: Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS r=glandium → Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS r=glandium
Attachment #9281763 - Attachment description: Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS r=glandium → Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS 10.15+, revert to user-space spinlocks on older versions r=glandium

(In reply to Gabriele Svelto [:gsvelto] from comment #11

Thanks, this is very useful! I'm OK with taking a small regression here because I assume that you tested on an unloaded machine. The new locks should behave a lot better than the old ones when the machine is subjected to some load and I think that's an acceptable trade-off giving the old ones could live-lock with enough activity. In fact I would be very grateful if you could run the test while running something that keeps the machine under load while the test is running.

No worries, I ran the same test while ./mach build was running on a different workspace. I waited until it got to the compile stage to be resonably sure it'd keep all the cores busy.

Before Bug 1670885: 188.98ms (tested by applying the reverse patch to m-c, the comparison to the other numbers here is fair).
Mozilla Central: 203.78ms
With patch: 197.68ms

So it regresses slightly compared with OSSpinLock, but it's marginly better than the default unfair locks. The first surprising thing is that the unfair locks perform the same whether the system is loaded or not. The spin locks (both kinds) regress but they don't become as bad as the non-spinning kind, at least with this test measuring in this way, who knows, maybe the use more power? I'm not sure what I expected to see.

I also tested without the OS_UNFAIR_LOCK_ADAPTIVE_SPIN option and got 186.86ms. That shouldn't be different if spinning is only done on SMT (aka Hyperthreading) systems...

Okay, ran it again 196.32, so I guess we got lucky or this is just very variable due to the way the firefox build loads the system? Yeah, I ran some more. It's all over the place while loaded. It's possibly more informative to track standard deviation and score the locks by "consistency". I think it'd be unproductive to do any further benchmarking with a load without a really stable load, many more samples and box-and-wisker plots. I don't think there's much we can conclude about loaded systems except that neither OSSpinLock or OS_UNFAIR_LOCK_ADAPTIVE_SPIN are terrible.

Set release status flags based on info from the regressing bug 1670885

Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/910652b53350
Use undocumented, non-public adaptive spinlocks on macOS 10.15+, revert to user-space spinlocks on older versions r=pbone

Backed out causing build bustages on Mutex.cpp

Failure line: /builds/worker/checkouts/gecko/memory/build/Mutex.cpp:10:39: error: __builtin_available does not guard availability here; use if (__builtin_available) instead [-Werror,-Wunsupported-availability-guard]

Push with failures

Failure log

Backout link

[task 2022-06-29T08:24:50.409Z] 08:24:50     INFO -  In file included from /builds/worker/checkouts/gecko/memory/build/Mutex.cpp:5:
[task 2022-06-29T08:24:50.410Z] 08:24:50  WARNING -  /builds/worker/checkouts/gecko/memory/build/Mutex.h:94:7: warning: 'OSSpinLockLock' is deprecated: first deprecated in macOS 10.12 - Use os_unfair_lock_lock() from <os/lock.h> instead [-Wdeprecated-declarations]
[task 2022-06-29T08:24:50.410Z] 08:24:50     INFO -        OSSpinLockLock(&mMutex.mSpinLock);
[task 2022-06-29T08:24:50.412Z] 08:24:50     INFO -        ^
[task 2022-06-29T08:24:50.412Z] 08:24:50     INFO -  /builds/worker/fetches/MacOSX11.0.sdk/usr/include/libkern/OSSpinLockDeprecated.h:99:9: note: 'OSSpinLockLock' has been explicitly marked deprecated here
[task 2022-06-29T08:24:50.412Z] 08:24:50     INFO -  void    OSSpinLockLock( volatile OSSpinLock *__lock );
[task 2022-06-29T08:24:50.412Z] 08:24:50     INFO -          ^
[task 2022-06-29T08:24:50.412Z] 08:24:50     INFO -  In file included from /builds/worker/checkouts/gecko/memory/build/Mutex.cpp:5:
[task 2022-06-29T08:24:50.413Z] 08:24:50  WARNING -  /builds/worker/checkouts/gecko/memory/build/Mutex.h:117:7: warning: 'OSSpinLockUnlock' is deprecated: first deprecated in macOS 10.12 - Use os_unfair_lock_unlock() from <os/lock.h> instead [-Wdeprecated-declarations]
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -        OSSpinLockUnlock(&mMutex.mSpinLock);
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -        ^
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -  /builds/worker/fetches/MacOSX11.0.sdk/usr/include/libkern/OSSpinLockDeprecated.h:105:9: note: 'OSSpinLockUnlock' has been explicitly marked deprecated here
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -  void    OSSpinLockUnlock( volatile OSSpinLock *__lock );
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -          ^
[task 2022-06-29T08:24:50.413Z] 08:24:50    ERROR -  /builds/worker/checkouts/gecko/memory/build/Mutex.cpp:10:39: error: __builtin_available does not guard availability here; use if (__builtin_available) instead [-Werror,-Wunsupported-availability-guard]
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -  bool Mutex::UseUnfairLocks() { return __builtin_available(macOS 10.15, *); }
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -                                        ^
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -  4 warnings and 1 error generated.
[task 2022-06-29T08:24:50.413Z] 08:24:50    ERROR -  gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:668: Mutex.o] Error 1
[task 2022-06-29T08:24:50.413Z] 08:24:50     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/memory/build'
[task 2022-06-29T08:24:50.413Z] 08:24:50    ERROR -  gmake[3]: *** [/builds/worker/checkouts/gecko/config/recurse.mk:72: memory/build/target-objects] Error 2
[task 2022-06-29T08:24:50.414Z] 08:24:50     INFO -  gmake[3]: *** Waiting for unfinished jobs....
[task 2022-06-29T08:24:50.420Z] 08:24:50     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/security/nss/lib/base/base_nssb'
[task 2022-06-29T08:24:50.420Z] 08:24:50     INFO -  security/nss/lib/base/error.o
[task 2022-06-29T08:24:50.423Z] 08:24:50     INFO -  /builds/worker/fetches/sccache/sccache /builds/worker/fetches/clang/bin/clang --sysroot /builds/worker/fetches/MacOSX11.0.sdk -mmacosx-version-min=10.12 -std=gnu99 --target=x86_64-apple-darwin -o error.o -c  -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG -DTRIMMED=1 -DNSS_FIPS_DISABLED -DNSS_NO_INIT_SUPPORT -DNSS_X86_OR_X64 -DNSS_X64 -DNSS_USE_64 -DUSE_UTIL_DIRECTLY -DNO_NSPR_10_SUPPORT -DSSL_DISABLE_DEPRECATED_CIPHER_SUITE_NAMES -DHAVE_BSD_FLOCK -DHAVE_STRERROR -DXP_UNIX -D_REENTRANT -DDARWIN -DNSS_DISABLE_DBM -DNSS_DISABLE_LIBPKIX -I/builds/worker/checkouts/gecko/security/nss/lib/base -I/builds/worker/workspace/obj-build/security/nss/lib/base/base_nssb -I/builds/worker/workspace/obj-build/dist/include/nspr -I/builds/worker/workspace/obj-build/dist/include/private/nss -I/builds/worker/workspace/obj-build/dist/include/nss -I/builds/worker/workspace/obj-build/dist/include -include /builds/worker/workspace/obj-build/mozilla-config.h -DMOZILLA_CLIENT -Qunused-arguments -fcrash-diagnostics-dir=/builds/worker/artifacts -fPIC -ffunction-sections -fdata-sections -fno-math-errno -pthread -pipe -gdwarf-4 -Xclang -load -Xclang /builds/worker/workspace/obj-build/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -O3 -fno-omit-frame-pointer -funwind-tables -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wempty-body -Wformat-type-confusion -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wno-error=tautological-type-limit-compare -Wunreachable-code -Wunreachable-code-return -Wunused-but-set-parameter -Wclass-varargs -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wno-range-loop-analysis -Wenum-compare-conditional -Wduplicate-method-arg -Wduplicate-method-match -Wmissing-method-return-type -Wobjc-signed-char-bool -Wsemicolon-before-method-body -Wsuper-class-method-mismatch -Werror=non-literal-null-conversion -Wstring-conversion -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=free-nonheap-object -Wno-error=return-std-move -Wno-error=atomic-alignment -Wno-error=deprecated-copy -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Werror=implicit-function-declaration -Wno-psabi -Wthread-safety -fno-strict-aliasing -ffp-contract=off  -MD -MP -MF .deps/error.o.pp   /builds/worker/checkouts/gecko/security/nss/lib/base/error.c
[task 2022-06-29T08:24:50.423Z] 08:24:50     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/security/nss/lib/base/base_nssb'
[task 2022-06-29T08:24:50.441Z] 08:24:50     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/memory/mozalloc'
[task 2022-06-29T08:24:50.444Z] 08:24:50     INFO -  /builds/worker/fetches/sccache/sccache /builds/worker/fetches/clang/bin/clang++ --sysroot /builds/worker/fetches/MacOSX11.0.sdk -mmacosx-version-min=10.12 -stdlib=libc++ -std=gnu++17 --target=x86_64-apple-darwin -o Unified_cpp_memory_mozalloc0.o -c  -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -D_GNU_SOURCE -DIMPL_MFBT -DMOZ_HAS_MOZGLUE -I/builds/worker/checkouts/gecko/memory/mozalloc -I/builds/worker/workspace/obj-build/memory/mozalloc -I/builds/worker/workspace/obj-build/xpcom -I/builds/worker/checkouts/gecko/memory/build -I/builds/worker/workspace/obj-build/dist/include -I/builds/worker/workspace/obj-build/dist/include/nspr -I/builds/worker/workspace/obj-build/dist/include/nss -DMOZILLA_CLIENT -include /builds/worker/workspace/obj-build/mozilla-config.h -Qunused-arguments -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wdeprecated-this-capture -Wempty-body -Wformat-type-confusion -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wno-error=tautological-type-limit-compare -Wunreachable-code -Wunreachable-code-return -Wunused-but-set-parameter -Wno-invalid-offsetof -Wclass-varargs -Wempty-init-stmt -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wno-range-loop-analysis -Wc++2a-compat -Wcomma -Wenum-compare-conditional -Wimplicit-fallthrough -Wduplicate-method-arg -Wduplicate-method-match -Wmissing-method-return-type -Wobjc-signed-char-bool -Wsemicolon-before-method-body -Wsuper-class-method-mismatch -Werror=non-literal-null-conversion -Wstring-conversion -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=free-nonheap-object -Wno-error=return-std-move -Wno-error=atomic-alignment -Wno-error=deprecated-copy -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-psabi -Wthread-safety -Wno-unknown-warning-option -fno-sized-deallocation -fno-aligned-new -fcrash-diagnostics-dir=/builds/worker/artifacts -fno-exceptions -fPIC -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread -pipe -gdwarf-4 -Xclang -load -Xclang /builds/worker/workspace/obj-build/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -O3 -fno-omit-frame-pointer -funwind-tables -Werror -fno-strict-aliasing -ffp-contract=off  -MD -MP -MF .deps/Unified_cpp_memory_mozalloc0.o.pp   Unified_cpp_memory_mozalloc0.cpp
[task 2022-06-29T08:24:50.445Z] 08:24:50     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/memory/mozalloc'
[task 2022-06-29T08:24:50.457Z] 08:24:50     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/mozglue/misc'
[task 2022-06-29T08:24:50.460Z] 08:24:50     INFO -  /builds/worker/fetches/sccache/sccache /builds/worker/fetches/clang/bin/clang++ --sysroot /builds/worker/fetches/MacOSX11.0.sdk -mmacosx-version-min=10.12 -stdlib=libc++ -std=gnu++17 --target=x86_64-apple-darwin -o AwakeTimeStamp.o -c  -fvisibility=hidden -fvisibility-inlines-hidden -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 '-DMOZ_APP_BASENAME="Firefox"' '-DMOZ_APP_VENDOR="Mozilla"' -DIMPL_MFBT -DMOZ_HAS_MOZGLUE -I/builds/worker/checkouts/gecko/mozglue/misc -I/builds/worker/workspace/obj-build/mozglue/misc -I/builds/worker/workspace/obj-build/dist/include -I/builds/worker/workspace/obj-build/dist/include/nspr -I/builds/worker/workspace/obj-build/dist/include/nss -DMOZILLA_CLIENT -include /builds/worker/workspace/obj-build/mozilla-config.h -Qunused-arguments -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wdeprecated-this-capture -Wempty-body -Wformat-type-confusion -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wno-error=tautological-type-limit-compare -Wunreachable-code -Wunreachable-code-return -Wunused-but-set-parameter -Wno-invalid-offsetof -Wclass-varargs -Wempty-init-stmt -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wno-range-loop-analysis -Wc++2a-compat -Wcomma -Wenum-compare-conditional -Wimplicit-fallthrough -Wduplicate-method-arg -Wduplicate-method-match -Wmissing-method-return-type -Wobjc-signed-char-bool -Wsemicolon-before-method-body -Wsuper-class-method-mismatch -Werror=non-literal-null-conversion -Wstring-conversion -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=free-nonheap-object -Wno-error=return-std-move -Wno-error=atomic-alignment -Wno-error=deprecated-copy -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-psabi -Wthread-safety -Wno-unknown-warning-option -fno-sized-deallocation -fno-aligned-new -fcrash-diagnostics-dir=/builds/worker/artifacts -fno-exceptions -fPIC -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread -pipe -gdwarf-4 -Xclang -load -Xclang /builds/worker/workspace/obj-build/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -O3 -fno-omit-frame-pointer -funwind-tables -Werror -fno-strict-aliasing -ffp-contract=off  -MD -MP -MF .deps/AwakeTimeStamp.o.pp   /builds/worker/checkouts/gecko/mozglue/misc/AwakeTimeStamp.cpp
[task 2022-06-29T08:24:50.460Z] 08:24:50     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/mozglue/misc'
[task 2022-06-29T08:24:50.461Z] 08:24:50     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/mozglue/misc'
[task 2022-06-29T08:24:50.461Z] 08:24:50     INFO -  mozglue/misc/ConditionVariable_posix.o
[task 2022-06-29T08:24:50.461Z] 08:24:50     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/mozglue/misc'
[task 2022-06-29T08:24:50.465Z] 08:24:50     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/config/external/nspr/pr'
[task 2022-06-29T08:24:50.466Z] 08:24:50     INFO -  config/external/nspr/pr/darwin.o

Meh, I had seen the issue already, but then forgot when addressing the review comments :(

Flags: needinfo?(gsvelto)
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 104 Branch

Now that this landed I'll wait a few days to see if all the other tests that regressed go back to reasonable values. Once I'm confident they have we can uplift this to beta.

(In reply to Cristian Tuns from comment #22)

https://hg.mozilla.org/mozilla-central/rev/042b52ffcc3d

== Change summary for alert #34735 (as of Wed, 06 Jul 2022 14:45:18 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
7% wikia loadtime macosx1015-64-shippable-qr fission warm webrender 213.08 -> 197.88
4% wikia ContentfulSpeedIndex macosx1015-64-shippable-qr fission warm webrender 679.46 -> 653.33
4% wikia LastVisualChange macosx1015-64-shippable-qr fission warm webrender 1,280.00 -> 1,232.00
3% ebay loadtime macosx1015-64-shippable-qr fission warm webrender 185.58 -> 179.75

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=34735

(In reply to Alex Finder from comment #24)

== Change summary for alert #34735 (as of Wed, 06 Jul 2022 14:45:18 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
7% wikia loadtime macosx1015-64-shippable-qr fission warm webrender 213.08 -> 197.88
4% wikia ContentfulSpeedIndex macosx1015-64-shippable-qr fission warm webrender 679.46 -> 653.33
4% wikia LastVisualChange macosx1015-64-shippable-qr fission warm webrender 1,280.00 -> 1,232.00
3% ebay loadtime macosx1015-64-shippable-qr fission warm webrender 185.58 -> 179.75

This is interesting, looking at the graphs all these tests were already improved by the switch to os_unfair_locks (in some cases significantly so, wikia loadtime dropped from ~280ms to ~210ms with significantly lesser variability too) and have further improved with this change. This means that we might have actually improved performance overall.

:gsvelto are you ready to submit an uplfit request, or do you need to more time?
Looks like Bug 1773984 and Bug 1774712 have also been addressed by this patch

Flags: needinfo?(gsvelto)

== Change summary for alert #34746 (as of Thu, 07 Jul 2022 17:18:23 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
9% tart macosx1015-64-shippable-qr e10s fission stylo webrender 2.46 -> 2.24
9% tresize macosx1015-64-shippable-qr e10s fission stylo webrender-sw 6.36 -> 5.81
9% tp5o_webext macosx1015-64-shippable-qr e10s fission stylo webrender 240.31 -> 219.69
9% tart macosx1015-64-shippable-qr e10s fission stylo webrender-sw 2.37 -> 2.17
8% tp5o macosx1015-64-shippable-qr e10s fission stylo webrender 177.52 -> 163.73
... ... ... ... ...
5% tscrollx macosx1015-64-shippable-qr e10s fission stylo webrender-sw 0.81 -> 0.77

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=34746

Yeah, this looks like it largely fixes the regressions introduced by bug 1670885 and in several cases even improves over the previous results.

Flags: needinfo?(gsvelto)

Comment on attachment 9281763 [details]
Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS 10.15+, revert to user-space spinlocks on older versions r=glandium

Beta/Release Uplift Approval Request

  • User impact if declined: A measurable performance degradation in certain tasks
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch touches the memory allocator. Since it affects every aspect of Gecko's functionality if there was something wrong with it we'd know already.
  • String changes made/needed: none
  • Is Android affected?: No
Attachment #9281763 - Flags: approval-mozilla-beta?

Comment on attachment 9281763 [details]
Bug 1774458 - Use undocumented, non-public adaptive spinlocks on macOS 10.15+, revert to user-space spinlocks on older versions r=glandium

Approved for 103.0b7, thanks.

Attachment #9281763 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

DevTools performance improvement also spotted at https://treeherder.mozilla.org/perfherder/alerts?id=34717&hideDwnToInv=0 (the 2 regressions in the alert are from another bug)

== Change summary for alert #34808 (as of Wed, 13 Jul 2022 19:45:30 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
11% tp5o_webext macosx1015-64-shippable-qr e10s fission stylo webrender-sw 236.16 -> 209.56
11% tart macosx1015-64-shippable-qr e10s fission stylo webrender 2.46 -> 2.19
11% tart macosx1015-64-shippable-qr e10s fission stylo webrender-sw 2.36 -> 2.11
10% tp5o_webext macosx1015-64-shippable-qr e10s fission stylo webrender 235.04 -> 211.15
10% tp5o_scroll macosx1015-64-shippable-qr e10s fission stylo webrender 1.77 -> 1.59
... ... ... ... ...
6% tscrollx macosx1015-64-shippable-qr e10s fission stylo webrender-sw 0.71 -> 0.67

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=34808

You need to log in before you can comment on or make changes to this bug.