signal 4 on freebsd arm64 due to missing cache synchronisation
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
People
(Reporter: jsm, Assigned: nbp)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(4 files)
1.06 KB,
patch
|
Details | Diff | Splinter Review | |
48 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
|
Details | Review |
1.85 KB,
patch
|
Details | Diff | Splinter Review | |
1.43 KB,
patch
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:120.0) Gecko/20100101 Firefox/120.0
Steps to reproduce:
Run firefox > 115 i.e versions with write protected code off (i.e w+x pages and no call to mprotect between write and exec) for content processes, on freebsd Arm64. Jit code kills firefox with signal 4. https://searchfox.org/mozilla-central/rev/08899071a2c8a573ac47ac632869bb92269b3ec3/js/src/jit/arm64/vixl/MozCpu-vixl.cpp#113 add a defined FreeBSD here to also execute the cache sync code on FreeBSD. see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271081#c12 to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271081#c12#c14
Actual results:
signal 4
Expected results:
jit code should be ready to execute
Comment 1•6 months ago
|
||
The Bugbug bot thinks this bug should belong to the 'Core::JavaScript Engine' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 2•6 months ago
|
||
add a defined FreeBSD here to also execute the cache sync code on FreeBSD
Why is that case limited to specific OSes at all? The #else
block below is:
#else
// If the host isn't AArch64, we must be using the simulator, so this function
// doesn't have to do anything.
USE(address, length);
so this code would be buggy for anything that is __aarch64__
and not one of the explicitly listed OSes.
Comment 3•5 months ago
|
||
Nicolas, thoughts on resolution here?
Assignee | ||
Comment 4•5 months ago
|
||
(In reply to jsm from comment #0)
Run firefox > 115 i.e versions with write protected code off (i.e w+x pages and no call to mprotect between write and exec) for content processes, on freebsd Arm64. Jit code kills firefox with signal 4. https://searchfox.org/mozilla-central/rev/08899071a2c8a573ac47ac632869bb92269b3ec3/js/src/jit/arm64/vixl/MozCpu-vixl.cpp#113 add a defined FreeBSD here to also execute the cache sync code on FreeBSD. see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271081#c12 to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271081#c12#c14
Thanks a lot to the FreeBSD community and you. This would have been harder to investigate otherwise.
Your patch to MozCPU-vixl.cpp sounds right, as well as your analysis on the fact that mprotect was hiding the lack of support.
On ARM the instruction cache has to be invalidated manually when we update the content, as otherwise it might keep a stale cache of what was previously seen as instructions, even if data was properly locked.
Firefox recently moved away from W^X, back to W&X, for performance reasons. After a long debate with the security team, this security mitigation no longer provides what it was meant for given the usage of JIT Spraying.
However, I am not sure for the need of MAP_FIXED in gc/Memory.cpp , I would expect our GC code to support this case correctly without MAP_FIXED.
I will forward the question To Jon, in case another bug has to be opened for the GC case.
Jesper (jsm), do you want to author the patch in Firefox as well? Or should I redo the patch?
(In reply to Ed Maste [:emaste] from comment #2)
add a defined FreeBSD here to also execute the cache sync code on FreeBSD
Why is that case limited to specific OSes at all?
This is because we are not able to test on all existing OSes, and probably a lack of trust that all OSes would behave the same. Especially given that Mac and Windows already require their own implementation.
Comment 5•5 months ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #4)
(In reply to Ed Maste [:emaste] from comment #2)
add a defined FreeBSD here to also execute the cache sync code on FreeBSD
Why is that case limited to specific OSes at all?
This is because we are not able to test on all existing OSes, and probably a lack of trust that all OSes would behave the same. Especially given that Mac and Windows already require their own implementation.
OK. Maybe it could be made explicit though:
#elif defined(__aarch64__)
#if (defined(__linux__) || defined(__android__))
...
#else
#error OS-specific cache invalidation needed!
#endif
#else
// If the host isn't AArch64, we must be using the simulator, so this function
// doesn't have to do anything.
USE(address, length);
#endif
Mainly I'd like to avoid falling into the case with the host isn't AArch64
comment, when the host actually is AArch64.
Firefox recently moved away from W^X, back to W&X, for performance reasons. After a long debate with the security team, this security mitigation no longer provides what it was meant for given the usage of JIT Spraying.
If you have it handy could I trouble you for more info on the change (e.g. is there is a public thread discussing it, wiki page etc.)? FreeBSD has a knob to enforce W^X policy (off by default), and a way to tag binaries as requiring WX mappings (NT_FREEBSD_FEATURE_CTL
ELF note). I'd like to look at a patch (both in our ports collection and for upstream) to handle this, and would like to include a reference.
Assignee | ||
Comment 6•5 months ago
|
||
(In reply to Ed Maste [:emaste] from comment #5)
Firefox recently moved away from W^X, back to W&X, for performance reasons. After a long debate with the security team, this security mitigation no longer provides what it was meant for given the usage of JIT Spraying.
If you have it handy could I trouble you for more info on the change (e.g. is there is a public thread discussing it, wiki page etc.)? FreeBSD has a knob to enforce W^X policy (off by default), and a way to tag binaries as requiring WX mappings (
NT_FREEBSD_FEATURE_CTL
ELF note). I'd like to look at a patch (both in our ports collection and for upstream) to handle this, and would like to include a reference.
The change happened in Bug 1835876 (https://hg.mozilla.org/integration/autoland/rev/9d31a829691c)
And the good news is that the behavior can be restored in about:config
(user.js
) to turn javascript.options.content_process_write_protect_code
back to true if needed.
(In reply to Nicolas B. Pierron [:nbp] from comment #4)
(In reply to jsm from comment #0)
Run firefox > 115 i.e versions with write protected code off (i.e w+x pages and no call to mprotect between write and exec) for content processes, on freebsd Arm64. Jit code kills firefox with signal 4. https://searchfox.org/mozilla-central/rev/08899071a2c8a573ac47ac632869bb92269b3ec3/js/src/jit/arm64/vixl/MozCpu-vixl.cpp#113 add a defined FreeBSD here to also execute the cache sync code on FreeBSD. see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271081#c12 to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271081#c12#c14
Thanks a lot to the FreeBSD community and you. This would have been harder to investigate otherwise.
Your patch to MozCPU-vixl.cpp sounds right, as well as your analysis on the fact that mprotect was hiding the lack of support.On ARM the instruction cache has to be invalidated manually when we update the content, as otherwise it might keep a stale cache of what was previously seen as instructions, even if data was properly locked.
Firefox recently moved away from W^X, back to W&X, for performance reasons. After a long debate with the security team, this security mitigation no longer provides what it was meant for given the usage of JIT Spraying.
However, I am not sure for the need of MAP_FIXED in gc/Memory.cpp , I would expect our GC code to support this case correctly without MAP_FIXED.
Yes the MAP_FIXED solves to run with aslr on FreeBSD and is another story I have to investegate further as mentioned in the downstream bug. As you say it is unrelated to jit crashing with signal 4.
I will forward the question To Jon, in case another bug has to be opened for the GC case.
Jesper (jsm), do you want to author the patch in Firefox as well? Or should I redo the patch?
I would like to be the author, I might be able to attach a patch tomorrow (I am in UTC+1 ) . I like the suggestion of Ed Maste to come up with a compile time error if the OS falls through to not implemented cache invalidation code.
(In reply to Ed Maste [:emaste] from comment #2)
add a defined FreeBSD here to also execute the cache sync code on FreeBSD
Why is that case limited to specific OSes at all?
This is because we are not able to test on all existing OSes, and probably a lack of trust that all OSes would behave the same. Especially given that Mac and Windows already require their own implementation.
Comment 8•5 months ago
•
|
||
(In reply to Ed Maste [:emaste] from comment #5)
Mainly I'd like to avoid falling into the case with the
host isn't AArch64
comment, when the host actually is AArch64.
I think this is a regression from the OpenBSD fix in bug 1672619. The patch there fixed the code to compile on OpenBSD by disabling the cache invalidation code if not Linux or Android.
We should at least add something like:
#elif defined(__aarch64__)
# error ...
#else
...
I have excluded OpenBSD from the error message, because the cache code is neither needed or able to compile there. The patch is for tip[0], but I have only tested on 122 build2 since I do not have easy access to newer nss on FreeBSD.
changeset: 771854:7235102b55e9
bookmark: autoland
tag: tip
user: moz-wptsync-bot <wptsync@mozilla.com>
date: Thu Jan 18 18:58:17 2024 +0000
summary: Bug 1875346 - [wpt-sync] Update web-platform-tests to d46f1aec78700b30ec633165cd51af2fd90bf3b6, a=testonly
Comment 10•5 months ago
|
||
Comment on attachment 9375624 [details] [diff] [review] cacheinvalidation.patch Review of attachment 9375624 [details] [diff] [review]: ----------------------------------------------------------------- This change LGTM
Comment 11•5 months ago
|
||
(In reply to jsm from comment #9)
I have excluded OpenBSD from the error message, because the cache code is neither needed or able to compile there.
Why does OpenBSD not need to invalidate the instruction cache?
Comment 12•5 months ago
|
||
The patch in bug 1672619 changed ifdefs in some other places too. I wonder if we should just revert that patch and fix the original OpenBSD issue in a different way.
Reporter | ||
Comment 13•5 months ago
|
||
OpenBSD write protects always, so I assume that their mprotect behaves the same with regards to cache, otherwise they would have seen the same issue. I do not have a arm64 openbsd atm though..
Assignee | ||
Comment 14•5 months ago
|
||
Updated•5 months ago
|
Assignee | ||
Comment 15•5 months ago
|
||
Hum … phabricator does not seems to keep the authorship set on the patch, and instead uses the one from the person who is pushing the patch to phabricator.
Assignee | ||
Comment 16•5 months ago
|
||
Apparently phabricator confuses the person who submit the patch with the patch author.
Looking at the patch which is pushed to our Try server, the authorship is retained: https://hg.mozilla.org/try/rev/c40f220781e5990dc3ef0cda92cd3660f45b8f7d
So everything is good despite phabricator reporting it wrongly.
Comment 17•5 months ago
|
||
Set release status flags based on info from the regressing bug 1672619
Reporter | ||
Comment 18•5 months ago
|
||
Same strategy for CPU::GetCacheType()
Reporter | ||
Comment 19•5 months ago
|
||
Do not error out in GetCacheType.
Updated•5 months ago
|
Comment 20•4 months ago
|
||
Pushed by npierron@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/1b595199c4de Use instruction cache invalidation on FreeBSD r=jandem
Comment 21•4 months ago
|
||
bugherder |
Comment 22•4 months ago
|
||
:nbp does this need a beta uplift request for Fx123, or should it ride the trains with Fx124?
Assignee | ||
Comment 23•4 months ago
|
||
(In reply to Donal Meehan [:dmeehan] from comment #22)
:nbp does this need a beta uplift request for Fx123, or should it ride the trains with Fx124?
As far as I understand this could ride the train, as there is no urgency in getting these patches out, as patches are already handled by the package manager, with the same outcome.
However, if FreeBSD maintainers prefer to have it upstream to Firefox 123, we can do it.
JSM, is there a need for having these patches as part of Firefox sooner than 124?
Reporter | ||
Comment 24•4 months ago
|
||
It would be really nice to already have it in Firefox 123. Thanks.
Updated•4 months ago
|
Assignee | ||
Comment 25•4 months ago
•
|
||
Comment on attachment 9375952 [details]
Bug 1871969 - Use instruction cache invalidation on FreeBSD
Beta/Release Uplift Approval Request
- User impact if declined: (see comment 24)
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Already tested and deployed in the form of patches to the FreeBSD community.
- String changes made/needed:
- Is Android affected?: No
Comment 26•4 months ago
|
||
Comment on attachment 9375952 [details]
Bug 1871969 - Use instruction cache invalidation on FreeBSD
Approved for 123 beta 8, thanks.
Comment 27•4 months ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/5cb029591fdb
Comment 28•4 months ago
|
||
bugherder uplift |
Description
•