ARM64: Javascript engine incorrectly assumes virtual addresses are 47 bit

RESOLVED FIXED in Firefox 49

Status

()

--
blocker
RESOLVED FIXED
4 years ago
26 days ago

People

(Reporter: steve.capper, Assigned: zheng.xu)

Tracking

Trunk
mozilla49
Other
Linux
Points:
---

Firefox Tracking Flags

(firefox49 fixed)

Details

Attachments

(4 attachments, 3 obsolete attachments)

(Reporter)

Description

4 years ago
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0
Build ID: 20150115144710

Steps to reproduce:

Boot an AArch64 system (in my case a Seattle) with 48 bit virtual addresses enabled (4KB pages, 4-levels of page table).
Build the latest Javascript engine from:
hg clone http://hg.mozilla.org/mozilla-central/
(I had changeset:   233496:9dbb2d41bb2c)
Run jsapi-tests


Actual results:

I got an unhandled translation fault:
unhandled level 0 translation fault (11) at 0x7fff923ffff0, esr 0x92000004
pgd = ffff8000e95ac000
[7fff923ffff0] *pgd=0000000000000000

CPU: 1 PID: 4587 Comm: jsapi-tests Not tainted 3.19.0+ #1181
Hardware name:  /Default string, BIOS ROD0072B 10/23/2014
task: ffff800312ef4200 ti: ffff8000b2178000 task.ti: ffff8000b2178000
PC is at 0x4da6f0
LR is at 0x4da6d0
pc : [<00000000004da6f0>] lr : [<00000000004da6d0>] pstate: 20000000
sp : 0000ffffe76108e0
x29: 0000ffffe7610e70 x28: 0000000000000000 
x27: 0000ffffe7610e68 x26: 0000000000a325b8 
x25: 0000ffffe76109b8 x24: 0000ffffe7610aa8 
x23: 0000000000000001 x22: 000000002219d328 
x21: 000000000000003c x20: 0000ffff92322060 
x19: 00000000221ac120 x18: 00000000000007de 
x17: 0000ffff93d08a58 x16: 00000000015f5660 
x15: 000000000000016e x14: 0000000000000047 
x13: 0000000000000014 x12: 00000000221abaa0 
x11: 0000000000000000 x10: 0000000000000000 
x9 : 0000000000000005 x8 : 0000ffff92321150 
x7 : 000000000000001b x6 : 000000000000001f 
x5 : 0000000000000000 x4 : 0000000000000000 
x3 : fffcffff92323020 x2 : 00007fff923ff000 
x1 : 000000000001fff2 x0 : fffbffffffffffff


Expected results:

The invalid address listed above (0x7fff923ffff0) should have been (0xffff923ffff0).
Inspection of the source code led to: 
# define JSVAL_TAG_SHIFT 47
In js/public/Value.h.
The upper bit of the address is being incorrectly masked out.
(Reporter)

Comment 1

4 years ago
I've flagged this as a blocker as it was an incredibly hard bug to locate and will cause people running 48-bit VAs a world of hurt.

Earlier versions of the Javascript engine are also affected.

On Fedora, this actually prevents the machine from booting up properly due to polkitd crashing.
Severity: normal → blocker
Hardware: x86_64 → ARM
Sounds like bug 910845. Our NaN-boxing format requires those top bits to specify the type of the value. So you have to modify mmap or similar to ensure that those top bits are cleared.
We have a check for this at [1] (and [2]). It sounds like those #ifdefs need to be updated to include 64-bit ARM.

[1] https://dxr.mozilla.org/mozilla-central/source/js/src/gc/Memory.cpp#405
[2] https://dxr.mozilla.org/mozilla-central/source/js/src/gc/Memory.cpp#382
@jolesen, you may run into this when moving to real hardware :)
Hardware: ARM → Other
Summary: Javascript engine incorrectly assumes virtual addresses are 47 bit on 64-bit ARM → ARM64: Javascript engine incorrectly assumes virtual addresses are 47 bit
Setting needinfo so we don't forget about this.
Flags: needinfo?(jolesen)
Removing needinfo? since this is not actionable until I have real hardware.
Flags: needinfo?(jolesen)
Duplicate of this bug: 1250400
(Assignee)

Comment 8

3 years ago
Created attachment 8740353 [details] [diff] [review]
mozjs17-tag.patch

This patch fixes the tag pointer issue mozjs17 on Redhat 7.2 aarch64 with 48bit VA enabled.
(Assignee)

Comment 9

3 years ago
Created attachment 8740354 [details] [diff] [review]
mozjs24-tag.patch

This patch fixes the tag pointer issue mozjs24 on Redhat 7.2 aarch64 with 48bit VA enabled.
(Assignee)

Comment 10

3 years ago
Created attachment 8740355 [details] [diff] [review]
mozjs185-tag.patch

This patch fixes the tag pointer issue mozjs185 on Ubuntu aarch64 with 48bit VA enabled.

Comment 11

3 years ago
Zheng: Can you let us know if this is the final version going into mozjs? Anyone got further comments, please? We can't just have random "patch for distro X". That's not the right way at all.

Comment 12

3 years ago
Also, this (mozjs17 patch) fails to compile on x86_64, meaning it isn't acceptable anyway I assume.
These patches are workarounds for older mosjs releases. The approach taken here doesn't work for tip. It looks like supporting 48-bit VAs on tip will require a compromise somewhere, and probably a much more invasive patch than these.

Comment 14

3 years ago
Wait...what's the status of tip? Is there a proposed patch. RHEL isn't distro X. We don't pull patches that haven't been posted upstream properly without a good reason. Folks told me this problem was fixed. I read the above very differently.
FWIW, without reading the patches proposed for older branches, simply ensuring that all addresses seen by the Javascript engine *are* 47-bit is probably easier than extending it to deal with 48-bit addresses. As I said in comment #3, this probably just needs a couple of #ifdefs to be extended for the GC, and the same adjustments for jemalloc [1] to ensure the same for malloced data.

[1] https://dxr.mozilla.org/mozilla-central/source/memory/mozjemalloc/jemalloc.c#2412
(Reporter)

Comment 16

3 years ago
(In reply to Jon Masters from comment #12)
> Also, this (mozjs17 patch) fails to compile on x86_64, meaning it isn't
> acceptable anyway I assume.

The two asserts in js/src/methodjit/MethodJIT.cpp on lines 241, 242 just need to be changed to match the masks introduced in the patch. Then it appears to compile and pass all the x86_64 unit tests.
(Reporter)

Comment 17

3 years ago
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #15)
> FWIW, without reading the patches proposed for older branches, simply
> ensuring that all addresses seen by the Javascript engine *are* 47-bit is
> probably easier than extending it to deal with 48-bit addresses. As I said
> in comment #3, this probably just needs a couple of #ifdefs to be extended
> for the GC, and the same adjustments for jemalloc [1] to ensure the same for
> malloced data.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/memory/mozjemalloc/jemalloc.
> c#2412

I don't think the logic in the comment in [1] is quite right for Linux.

If I supply mmap with an `addr' parameter and don't specify MAP_FIXED, then the Linux kernel will do the following:
1) Attempt to create a VMA at `addr' with the size requested,
if that fails,
2) Pick an address as normal as if `addr' was never specified in the first place.

(I double checked this with a test program on arm64 and x64)

The Linux kernel will make no attempt to map addresses near to `addr' as mentioned in the comment; thus I think the code in [1] needs a test/review (as it may not work for ia64 properly at the moment), then it could be expanded to include arm64.
(Assignee)

Comment 18

3 years ago
(In reply to Jon Masters from comment #11)
> Zheng: Can you let us know if this is the final version going into mozjs?
> Anyone got further comments, please? We can't just have random "patch for
> distro X". That's not the right way at all.

Sorry for replying you late. As Jacob mentioned, the patches workarounds for older mosjs releases and they don't work for tip. Also as Steve mentioned, there are asserts need to be changed to pass x86_64 tests. Sorry for not testing the patches on x86
(Assignee)

Comment 19

3 years ago
(In reply to Steve Capper from comment #17)
> (In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #15)
> > FWIW, without reading the patches proposed for older branches, simply
> > ensuring that all addresses seen by the Javascript engine *are* 47-bit is
> > probably easier than extending it to deal with 48-bit addresses. As I said
> > in comment #3, this probably just needs a couple of #ifdefs to be extended
> > for the GC, and the same adjustments for jemalloc [1] to ensure the same for
> > malloced data.
> > 
> > [1]
> > https://dxr.mozilla.org/mozilla-central/source/memory/mozjemalloc/jemalloc.
> > c#2412
> 
> I don't think the logic in the comment in [1] is quite right for Linux.
> 
> If I supply mmap with an `addr' parameter and don't specify MAP_FIXED, then
> the Linux kernel will do the following:
> 1) Attempt to create a VMA at `addr' with the size requested,
> if that fails,
> 2) Pick an address as normal as if `addr' was never specified in the first
> place.
> 
> (I double checked this with a test program on arm64 and x64)
> 
> The Linux kernel will make no attempt to map addresses near to `addr' as
> mentioned in the comment; thus I think the code in [1] needs a test/review
> (as it may not work for ia64 properly at the moment), then it could be
> expanded to include arm64.

It looks like no good solutions here ... 

Even we specify MAP_FIXED, the behavior is still not our expectation. Description from mmap manual :
  Don't interpret addr as a hint: place the mapping at exactly that address.  addr must be a multiple of the page size.  If the memory region specified by addr and len overlaps  pages of  any  existing mapping(s), then the overlapped part of the existing mapping(s) will be discarded.

If we set addr with an address which is already mapped. We still get the addr we set, which means we might pollute the memory which is being used by someone else.
(Assignee)

Comment 20

3 years ago
Not sure if MAP_32BIT is the right flag for this case.
(Assignee)

Comment 21

3 years ago
Created attachment 8755770 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch

This patch has been tested with on arm64 with 48-bit VA kernel configuration:
./tests/jstests.py ${BUILDDIR}/dist/bin/js
./jit-test/jit_test.py ${BUILDDIR}/dist/bin/js

Comparing to the 39-bit VA kernel configuration, there should be no regression.
Attachment #8755770 - Flags: review?(terrence)
Attachment #8755770 - Flags: review?(luke)
Comment on attachment 8755770 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch

Review of attachment 8755770 [details] [diff] [review]:
-----------------------------------------------------------------

(deferring to Terrence)
Attachment #8755770 - Flags: review?(luke)
Comment on attachment 8755770 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch

Review of attachment 8755770 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like it will work to me. Emanuel owns Memory.cpp these days, so I'd like to have him take the review.
Attachment #8755770 - Flags: review?(terrence) → review?(emanuel.hoogeveen)
Assignee: nobody → zheng.xu
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 8755770 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch

Review of attachment 8755770 [details] [diff] [review]:
-----------------------------------------------------------------

Oof, nasty. In theory this could loop up to 8 million times if a lot of address space is in use - and up to 128 million times if the step size is changed to ChunkSize. That seems unlikely at first blush, but OdinMonkey reserves 4GiB regions of address space for asm.js IIRC. I'm wondering if we should just have the GC reserve 4GiB "super chunks" on 64-bit and hand out addresses from those - that would at least make this into a one-time thing. That would be a significantly more involved change though.

r+ with the changes below, but if this ends up looping a lot in practice we'll need something more clever. If you're building Firefox with this, you'll likely need similar changes for (moz)jemalloc to cover non-GC memory.

::: js/src/gc/Memory.cpp
@@ +490,5 @@
> +    * There might be similar virtual address issue on arm64 which depends on hardware and
> +    * kernel configurations. But the work around is slightly different due to the different
> +    * mmap behavior.
> +    *
> +    * TODO: Merge with the above code block if this implementation works for ia64 and sparc64.

Nit: Comments should fit within 80 columns (code can use up to 99). There are some pre-existing violations of that in this file, but it's probably best to not add more.

@@ +494,5 @@
> +    * TODO: Merge with the above code block if this implementation works for ia64 and sparc64.
> +    */
> +    constexpr uintptr_t start = 0x0000070000000000;
> +    constexpr uintptr_t end   = 0x0000800000000000;
> +    constexpr uintptr_t step  = 0x0000000001000000;

Is there a particular reason you chose a step size of 16MiB here? If not, this should probably be set to ChunkSize (1MiB on desktop, 256KiB on mobile).

@@ +503,5 @@
> +    uintptr_t hint;
> +    void* region = MAP_FAILED;
> +    for (hint = start; region == MAP_FAILED && hint + length <= end; hint += step) {
> +        region = mmap((void*)hint, length, prot, flags, fd, offset);
> +        if (region != (void*)hint) {

From comment #17, this will just choose the first available address if |hint| was already mapped, which might be valid for our purposes. Could you change this to

if ((uintptr_t(region) + (length - 1)) & 0xffff800000000000) {

(and add a check for MAP_FAILED above) so we'll accept valid addresses?

::: js/src/jsapi-tests/testGCAllocator.cpp
@@ +330,4 @@
>  mapMemory(size_t length)
>  {
>      void* hint = nullptr;
> +#if defined(__ia64__) || (defined(__sparc64__) && defined(__NetBSD__)) || defined(__aarch64__)

If we want these tests to work, we need the same logic here as in gc/Memory.cpp (feel free to omit the comments though).
Attachment #8755770 - Flags: review?(emanuel.hoogeveen) → review+
I'm just looking in from the sidelines here, but this makes me relatively nervous since there's no way for us to detect it if/when the performance becomes a problem - there will just be inexplicable browser "hangs", and in the worst case those hangs will happen frequently, once for every region allocation.

Could we add some kind of telemetry to count (or bucket) the number of probes required, or the time required to do the probing?  I confess I don't know what that telemetry should look like, but it seems less invasive than just crashing if we exceed some set limit on the number of probes.
(Assignee)

Comment 26

3 years ago
(In reply to Lars T Hansen [:lth] from comment #25)
> I'm just looking in from the sidelines here, but this makes me relatively
> nervous since there's no way for us to detect it if/when the performance
> becomes a problem - there will just be inexplicable browser "hangs", and in
> the worst case those hangs will happen frequently, once for every region
> allocation.
> 
> Could we add some kind of telemetry to count (or bucket) the number of
> probes required, or the time required to do the probing?  I confess I don't
> know what that telemetry should look like, but it seems less invasive than
> just crashing if we exceed some set limit on the number of probes.

I'm thinking on this problem. Now I plan to modify the code to just loop several times.
The hint address will be calculated by last successfully mmaped address(a global variable with lock).

BTW, how should I update the patch? Just submit a patch with the same name?
(Assignee)

Comment 27

3 years ago
Created attachment 8756224 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch
Attachment #8755770 - Attachment is obsolete: true
Attachment #8756224 - Flags: review?(emanuel.hoogeveen)
(Assignee)

Comment 28

3 years ago
Comment on attachment 8756224 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch

Review of attachment 8756224 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Memory.cpp
@@ +502,5 @@
> +    * 1. Exam /proc/self/maps to find an available address. But this file is
> +    *    not always available. Even we exam /proc/self/maps, we still need to
> +    *    retry several times. Because the free space we found might be occupied
> +    *    by some other threads.
> +    * 2. Use a global/static variable with lock to track the addresses we have

Personally, I prefer the 2nd choice if there are too many retries in practice.
(Assignee)

Comment 29

3 years ago
Hi Emanuel,

Ping. Any comments on the updated patch?
Comment on attachment 8756224 [details] [diff] [review]
0001-Manually-mmap-on-arm64-to-ensure-high-17-bits-are-cl.patch

Review of attachment 8756224 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, I was expecting this to be a larger change - it doesn't look like you made the change you were considering in comment #26? It sounds like there are plans to support 48-bit addresses in the engine though, so I think we can accept this as a temporary fix.

I just noticed this, but it looks like you're using a non-standard patch format. Could you use the format from [1] please?

[1] https://developer.mozilla.org/en-US/docs/Mercurial/Using_Mercurial#How_can_I_generate_a_patch_for_somebody_else_to_check-in_for_me.3F

::: js/src/gc/Memory.cpp
@@ +497,5 @@
> +    constexpr uintptr_t start = 0x0000070000000000;
> +    constexpr uintptr_t end   = 0x0000800000000000;
> +    constexpr uintptr_t step  = ChunkSize;
> +   /*
> +    * Optimization options if there are too many retries in practise:

Nit: s/practise/practice/.

@@ +501,5 @@
> +    * Optimization options if there are too many retries in practise:
> +    * 1. Exam /proc/self/maps to find an available address. But this file is
> +    *    not always available. Even we exam /proc/self/maps, we still need to
> +    *    retry several times. Because the free space we found might be occupied
> +    *    by some other threads.

I suggest the following wording (slightly tweaked):

    * 1. Examine /proc/self/maps to find an available address. This file is
    *    not always available, however. In addition, even if we examine
    *    /proc/self/maps, we may still need to retry several times due to
    *    racing with other threads.

::: js/src/jsapi-tests/testGCAllocator.cpp
@@ +330,4 @@
>  mapMemory(size_t length)
>  {
>      void* hint = nullptr;
> +#if defined(__ia64__) || (defined(__sparc64__) && defined(__NetBSD__)) || defined(__aarch64__)

Please add the same logic here as what you added to Memory.cpp.
Attachment #8756224 - Flags: review?(emanuel.hoogeveen)
(Assignee)

Comment 31

3 years ago
Created attachment 8757868 [details] [diff] [review]
aarch64-48-bit-VA-fix.patch
Attachment #8756224 - Attachment is obsolete: true
Attachment #8757868 - Flags: review?(emanuel.hoogeveen)
Comment on attachment 8757868 [details] [diff] [review]
aarch64-48-bit-VA-fix.patch

Review of attachment 8757868 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks! I had a final few comments that I should have noticed before, r+ with those fixed.

Feel free to "carry forward" r+ by the way; just set r+ on the updated patch yourself. Please use my Bugzilla handle in your patch comment instead of my e-mail address (so r=ehoogeveen) :)

::: js/src/gc/Memory.cpp
@@ +493,5 @@
> +    *
> +    * TODO: Merge with the above code block if this implementation works for
> +    * ia64 and sparc64.
> +    */
> +    constexpr uintptr_t start = 0x0000070000000000;

Nit: To avoid potential warnings, please wrap this 64-bit constant and the other ones you're adding in UINT64_C (so UINT64_C(0x0000070000000000) in this case). Don't worry about pre-existing uses of the ULL suffix, though you can change those too if you want (e.g. 0x70000000000ULL -> UINT64_C(0x70000000000)).

@@ +495,5 @@
> +    * ia64 and sparc64.
> +    */
> +    constexpr uintptr_t start = 0x0000070000000000;
> +    constexpr uintptr_t end   = 0x0000800000000000;
> +    constexpr uintptr_t step  = ChunkSize;

Nit: We're about to drop support for MSVC2013, but I don't think we have yet. Please use MOZ_CONSTEXPR_VAR or just const for these.

::: js/src/jsapi-tests/testGCAllocator.cpp
@@ +346,5 @@
> +    return region;
> +#elif defined(__aarch64__)
> +    constexpr uintptr_t start = 0x0000070000000000;
> +    constexpr uintptr_t end   = 0x0000800000000000;
> +    constexpr uintptr_t step  = ChunkSize;

Same nits here.
Attachment #8757868 - Flags: review?(emanuel.hoogeveen) → review+
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #32)
> just set r+ on the updated patch

Hmm, might need editbug privileges for that. If you can't, just let me know and I'll set it.
(Assignee)

Comment 34

3 years ago
Created attachment 8758187 [details] [diff] [review]
aarch64-48-bit-VA-fix.patch

Hi Emanuel,

I do not have the permission to set review+ .

Now I am struggling on how to make the mmap faster. I prefer the proposal 2 I mentioned in the comment. But there are still different choices. 1) hint = last_hint + ChunkSize. Might still be slow 2) hint = some_radom_mechanism(last_hint). Might result VA fragments. Can this patch be merged first? And I will fix the potential slowness issue later when I have time.

Another question is : What are the branches for mozjs17/185/24 which was used by RHEL and Ubuntu/Debian? Are we still releasing these old versions?
Attachment #8757868 - Attachment is obsolete: true
Attachment #8758187 - Flags: review?(emanuel.hoogeveen)
Attachment #8758187 - Flags: review?(emanuel.hoogeveen) → review+
To sheriffs: as far as I'm aware this is not a configuration we (can) test in automation, so a try run wouldn't do much good.

(In reply to zheng.xu from comment #34)
> Now I am struggling on how to make the mmap faster. I prefer the proposal 2
> I mentioned in the comment. But there are still different choices. 1) hint =
> last_hint + ChunkSize. Might still be slow 2) hint =
> some_radom_mechanism(last_hint). Might result VA fragments. Can this patch
> be merged first? And I will fix the potential slowness issue later when I
> have time.

That's fine; I think getting it working at all is a good first step. Let's do further improvements in a new bug.

Aside from improvements to this approach, there are two potential alternative fixes: 1) supporting 48-bit addresses in the engine (I don't think there's a bug for this yet, but there was some discussion about it on IRC), or 2) reserving large (4GB) chunks of address space in advance on 64-bit architectures (this is an idea I've been mulling over).

The latter would do something similar to your patch under the hood, but far less frequently and with less potential to loop a crazy amount of times. But it would require some bigger changes to the memory backend.

> Another question is : What are the branches for mozjs17/185/24 which was
> used by RHEL and Ubuntu/Debian? Are we still releasing these old versions?

SpiderMonkey 38 is the latest standalone release; the next one will be SpiderMonkey 45, based on the engine in Firefox 45 ESR. We can consider uplifting this to ESR 45, and perhaps rolling up a release of 38 with this fix; Steve, what's your opinion on this?

If distributions still ship with those (much) older versions, they'll have to apply the patch here locally.
Flags: needinfo?(sphink)
Keywords: checkin-needed
(Assignee)

Comment 36

3 years ago
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #35)
> To sheriffs: as far as I'm aware this is not a configuration we (can) test
> in automation, so a try run wouldn't do much good.
I've tested the patch on an ARM64 platform with 48-bit VA kernel configuration(./tests/jstest.py and ./jit-test/jit-test.py). Before the patch, all the tests fail. After the patch, there are totally 3 failures which is the same with 39-bit VA configuration.
> Aside from improvements to this approach, there are two potential
> alternative fixes: 1) supporting 48-bit addresses in the engine (I don't
> think there's a bug for this yet, but there was some discussion about it on
> IRC), or 2) reserving large (4GB) chunks of address space in advance on
> 64-bit architectures (this is an idea I've been mulling over).
Not sure if supporting 48-bit addresses is a good alternative, there might be CPUs with more VA bits in the future.
> The latter would do something similar to your patch under the hood, but far
> less frequently and with less potential to loop a crazy amount of times. But
> it would require some bigger changes to the memory backend.
That will be a lot of code. :)

Comment 37

3 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dfaafbaaa291
Manually mmap on arm64 to ensure high 17 bits are clear. r=ehoogeveen
Keywords: checkin-needed
> SpiderMonkey 38 is the latest standalone release; the next one will be
> SpiderMonkey 45, based on the engine in Firefox 45 ESR. We can consider
> uplifting this to ESR 45, and perhaps rolling up a release of 38 with this
> fix; Steve, what's your opinion on this?

I'd probably lean towards backporting it to SM45, but not bothering with SM38. SM45 is sort of half-released already (as in, it's listed as the latest release and is downloadable), though there are some other things that should really be backported too.

I'd be ok with it going into SM38, except that then I'd really want to backport a few other things, and I'm too lazy to track those down and do them.
Flags: needinfo?(sphink)

Comment 39

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/dfaafbaaa291
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox49: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Depends on: 1277742
Hi!

From the commit the above:

> * TODO: Merge with the above code block if this implementation works for
> * ia64 and sparc64.

Yes, this can be used to fix the issue on sparc64 [1]. However, on sparc64, you need to free up 52 bits for VA, not just 48. I will have a go and test a modified version of this patch on sparc64.

Two additional changes are required on sparc64 to make the build past the initial stages [2].

I would be happy to help get these issues on sparc64 resolved. I'm also happy to provide access to a fast sparc64 machine for anyone interested.

Adrian

> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1275204
> [2] https://bugzilla.mozilla.org/show_bug.cgi?id=1273764
Oh, and if you merge this with the sparc64 code, please remove the part "(defined(__sparc64__) && defined(__NetBSD__))" and use just (defined(__sparc__) && defined(__arch64__)), so that the fix is no longer specific to NetBSD but is used on all sparc64 targets. The replacement of __sparc64__ comes from [1].

> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1273764
Ah, I got the patch wrong. You are actually using mmap to make sure the upper 17 bits are free. Ok, this means that the patch should actually work on sparc64 without any additional modification except for the proper platform defines.

I'll test it.
I tried the following patch on sparc64 [1] - no success, unfortunately. Still crashes.

> [1] https://bug1275204.bmoattachments.org/attachment.cgi?id=8817844
Could you confirm that the address it's crashing on is indeed a greater than 47-bit address? I've had an idea of how to fix this in a more palatable way for a while, but it would be good to know if and why the current approach doesn't work.
I can't check that right now since it's actually 4AM here, but I will try to test it the next days. Maybe the backtrace from [1] helps in the meantime? It doesn't have the arm64 allocator though.

> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1275204
Oh, also note that the GC allocator is only half the story. We also use malloc to allocate things like slots, and jemalloc still uses the old logic (I was expecting some sort of follow-up, but I don't think it was ever filed). The same approach might work there, but getting fixes into mozjemalloc tends to be an uphill struggle.
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #46)
> Oh, also note that the GC allocator is only half the story. We also use
> malloc to allocate things like slots, and jemalloc still uses the old logic

Right. I was actually also patching jemalloc back in May when playing with this issue the first time, see [1]. Plus, I copied the xulrunner stub code over from NetBSD [2].

> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=824449#27
> [2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=824449#17
Enabling the same code from ia64 also for sparc64 (__sparc__ && __arch64__) results to allocation failures:

> Executing /build/firefox-IqR6jJ/firefox-50.0.2/build-browser/dist/bin/xpcshell -g /build/firefox-IqR6jJ/firefox-50.0.2/build-browser/dist/bin/ -a /build/firefox-IqR6jJ/firefox-50.0.2/build-browser/dist/bin/ -f /build/firefox-IqR6jJ/firefox-50.0.2/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
> GLib (gthread-posix.c): Unexpected error from C library during 'malloc': Cannot allocate memory.
> Aborting.

This happens because the kernel always maps the memory from the top, ignoring the hint, when calling mmap() and therefore the allocator will never get a block of memory in the desired region. The jemalloc code will most likely also require the custom allocator.

Btw, isn't the performance of the allocator in this fix rather bad?
Ok, I finally managed to get Firefox build on Linux/sparc64. Will post my patch to #1275204 [1].

> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1275204
Duplicate of this bug: 1250400
You need to log in before you can comment on or make changes to this bug.