Closed Bug 1843534 Opened 10 months ago Closed 9 months ago

clock_gettime() etc are very slow on some android devices

Categories

(Core :: Performance, defect)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
118 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox117 --- wontfix
firefox118 --- fixed

People

(Reporter: jnicol, Assigned: jnicol)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [sp3])

Attachments

(8 files, 1 obsolete file)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

https://share.firefox.dev/3PPYuRI

This shows up a lot in a speedometer (around 4%), and skews comparisons with Chrome as both browsers take timestamps in different places, though appear to be similarly affected overall.

This occurs on several devices I have tested including the Samsung A51, Samsung S9, Pixel 5, OnePlus 8 Pro. On other devices (including Pixel 4, PIxel 6, Samsung S10) this barely shows up in profiles if at all. This isn't a exynos vs snapdragon type thing. Additionally, naively checking the kernel version doesn't show a pattern: if this was broken then fixed in specific versions then perhaps vendor specific backports muddy the water.

It appears that the VDSO fast path is failing for some reason and we end up doing a syscall. I'm trying to get kernel symbols included in the profile to see if that sheds any light on why.

We may need to audit our usage of timestamps on the critical path.

Whiteboard: [sp3]

I think I've figured out why this is slow, on the A51 at least.

The code I'm linking to below is from the 4.14 branch of the android common kernel. The actual kernel used on the device will differ but hopefully the relevant bits are still the same.

See ARM64_ERRATUM_858921, which affects Cortex-A73 CPUs (which the Samsung A51 has). Presumably this workaround is therefore enabled.

We can see here this creates a arch_timer_erratum_workaround instance with the read_cntvct_el0 field set.

We can then see here that if that field is set, we set clocksource_counter.archdata.vdso_direct to false.

Then in update_vsyscall(), which is where the kernel writes its time data to the vdso data page, if archdata.vdso_direct is false then it sets vdso_data.use_syscall to true.

Then finally, see the actual clock_gettime() implementation in the vdso. This uses the syscall_check macro, which reads the value of vdso_data.use_syscall and branches to the syscall fallback if it's set.

CLOCK_MONOTONIC_COARSE is unaffected by this issue, it stays in the VDSO fast path. So at the very least we can add an implementation of TimeStamp::NowLowRes() using this.

Additionally we can then audit our existing usage of timestamps, and see if there's any we can either remove or switch to low resolution.

The Pixel 5 family is also affected via the same mechanism, but due to a different CPU erratum, which affects Cortex A76 and Kyro 4G (of which the pixel 5 has the latter core).

While the pixel 6 family has a Cortex A76, it is not affected due to this commit which restricts that workaround to AArch32 mode.

There are other affected devices too with different CPUs, and sometimes 2 devices may have the same CPU and kernel version but one is affected and one is not. But I'm confident enough that this is our cause, and will chaulk that up to vendors cherry-picking different patches on to their kernels.

We have discovered that clock_gettime(CLOCK_MONOTONIC) can be slow on
certain arm64 devices due to kernel workarounds for CPU errata
avoiding the VDSO fast-path. Using CLOCK_MONOTONIC_COARSE is
unnaffected by these issues. This patch adds an implementation of
TimeStamp::NowLoRes() using the coarse clock, meaning that when lower
precision timestamps are adequate we do not pay the penalty of hitting
the slow path.

Assignee: nobody → jnicol
Status: NEW → ASSIGNED

On certain devices we have seen that taking frequent timestamps can be
expensive. Currently we measure the time taken to write protect JIT
code, and as this is a hot path we see these timestamps show up in
profiles. However, since bug 1835876 we no longer write protect code
in content processes. This patch therefore avoids measuring the time
when write protection is disabled.

Depends on D185004

On certain devices we have seen that taking frequent high-resolution
timestamps can be expensive, therefore it makes sense to use
low-resolution timestamps instead where possible.

js::Nursery::isUnderused() and CCGCScheduler::MaybePokeCC() check for
timeouts of > 1 second, therefore the low resolution timer is more
than adequate.

Depends on D185005

On certain devices we have seen that taking frequent timestamps can be
expensive, therefore it makes sense to reduce the number of timestamps
taken in hot paths. Measuring the time taken in refresh driver phases
and in NotifyObservers() shows up in profiles. However, these values
are only used to report to telemetry and we do not actually collect
this telemetry on Android. This patch therefore avoids measuring these
durations on Android.

Depends on D185006

As taking timestamps can be expensive on certain devices we should try
to avoid taking them where possible. This patch fixes a few occasions
where we take a timestamp solely for a profiler marker even when the
current thread is not being profiled.

Depends on D185007

On certain devices we have seen that taking frequent timestamps can be
expensive, therefore it makes sense to reduce the number of timestamps
taken in hot paths. HttpChannelChild::OnStopRequest() currently takes
multiple timestamps (for profiler markers and telemetry measurements)
without performing meaningful work in between. This patch replaces
these with a single timestamp used multiple times.

Depends on D185008

On certain devices we have seen that taking frequent timestamps can be
expensive, therefore it makes sense to reduce the number of timestamps
taken in hot paths. nsTimerImpl::Fire() currently takes a timestamp
which only gets used in logging. This patch therefore makes it so the
timestamp is only taken when logging is enabled.

Depends on D185009

Attachment #9346626 - Attachment description: Bug 1843534 - Avoid measuring durations for telemetry that is not collected on Android. r?#layout-reviewers!,#xpcom-reviewers! → Bug 1843534 - Remove unused telemetry probes. r?#layout-reviewers!,#xpcom-reviewers!
Attachment #9346623 - Attachment description: Bug 1843534 - Implement TimeStamp::NowLoRes() on Linux (including Android). r?glandium → Bug 1843534 - Implement TimeStamp::NowLoRes() when CLOCK_MONOTONIC_COARSE is supported. r?glandium
See Also: → 1847909
Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dfac5b1681a1
Implement TimeStamp::NowLoRes() when CLOCK_MONOTONIC_COARSE is supported. r=glandium
https://hg.mozilla.org/integration/autoland/rev/e1cf0b9086ba
Avoid measuring protect time when write-protection is disabled. r=jandem
https://hg.mozilla.org/integration/autoland/rev/cf8e2772f907
Use low resolution timestamp for GC timeouts. r=jonco,smaug
https://hg.mozilla.org/integration/autoland/rev/65677e6aa05d
Remove unused telemetry probes. r=emilio
https://hg.mozilla.org/integration/autoland/rev/01be17916dd3
Avoid taking unnecessary timestamps when thread is not being profiled for markers. r=julienw
https://hg.mozilla.org/integration/autoland/rev/e03410fcc17d
Use single timestamp in HttpChannelChild::OnStopRequest(). r=necko-reviewers,kershaw,jesup,acreskey
https://hg.mozilla.org/integration/autoland/rev/7b7e075d34e7
Avoid unnecessary timestamp in nsTimerImpl::Fire if logging is disabled. r=xpcom-reviewers,nika
Regressions: 1847928
Attachment #9348145 - Attachment description: Bug 1843534 - Ensure profiler_add_marker arguments are not evaluated when not profiled for markers. r=julienw,jnicol → Bug 1843534 - Ensure profiler_add_marker arguments are not evaluated when not profiled. r=canaltinova,aabh
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ccf1b5218169
Ensure profiler_add_marker arguments are not evaluated when not profiled. r=jnicol,canaltinova

redirecting ni to emilio

Flags: needinfo?(jnicol) → needinfo?(emilio)

Comment on attachment 9348145 [details]
Bug 1843534 - Ensure profiler_add_marker arguments are not evaluated when not profiled. r=canaltinova,aabh

Revision D185800 was moved to bug 1848130. Setting attachment 9348145 [details] to obsolete.

Attachment #9348145 - Attachment is obsolete: true

Let me move that to a separate bug to ease tracking.

Flags: needinfo?(emilio)

Note, I'm going to request ESR115 uplift one particular patch on this bug, https://phabricator.services.mozilla.com/D185007 in particular, since it's just a code-removal and it makes the ESR uplifts in bug 1861259 apply more cleanly.

(I'm using moz-phab to request uplift of just that one patch; I think that'll result in a new ESR-bound phabricator revision here. Apologies if anything goes haywire, as the new uplift-approval process is quite new.)

These telemetry probes have long since expired, so remove them. This
avoids taking unnecessary timestamps in these functions, which we have
seen can be expensive on certain devices.

Original Revision: https://phabricator.services.mozilla.com/D185007

Attachment #9360719 - Flags: approval-mozilla-esr115?

Uplift Approval Request

  • Steps to reproduce for manual QE testing: N/A
  • Is Android affected?: yes
  • String changes made/needed: None
  • Code covered by automated testing: yes
  • User impact if declined: None, but it makes uplift in bug 1861259 more awkward/human-error-prone
  • Explanation of risk level: This is just removing code for an unused telemetry probe.
  • Risk associated with taking this patch: Essentially zero.
  • Fix verified in Nightly: yes
  • Needs manual QE test: no

Comment on attachment 9360719 [details]
Bug 1843534 - Remove unused telemetry probes.

Approved for 115.5esr

Attachment #9360719 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+

I don't think it's correct to say that this bug is fixed in ESR 115. Only one patch was uplifted that doesn't affect behaviour.

Blocks: 1807007
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: