1439570 - Run JS test suite on ARM64 hardware

Reporter

Description

•

7 years ago

Currently we run the JS test suite on the ARM64 simulator. The simulator is pretty good, but there are aspects of the hardware it does not simulate accurately (atomics; instruction cache non-coherence; fault handling; address space layout). The first run of the test suite on hardware found bugs (bug 1430743), and other bugs have been observed on hardware as well. An ARM64 VM running on top of x64 hardware is not likely to be a substantial improvement over the simulator.

We should therefore run tests on actual ARM64 hardware; just running the JS shell tests (js/src/tests; js/src/jit-tests; jsapi-tests) would be a good start. We should run multiple configurations, but there's no Ion JIT, so the configs will be nonstandard (--no-baseline; --baseline-eager; defaults). We should run debug and release builds.

The platform doesn't need to be Android; Linux should be OK for now. A Raspberry Pi 3 might be OK and is cheap, but it is slow; I use a SoftIron Overdrive 1000 dev system, which is faster but more expensive and also runs OpenSUSE, which is somewhat painful.
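To make the scope of that request concrete, here is a rough sketch that enumerates the job matrix implied by the description: two build types, three suites, and three JIT flag sets. The suite paths and flags come from the description; the function name and the dict layout are made up for illustration, and whether every flag set applies to every suite (jsapi-tests in particular) is an assumption.

```python
from itertools import product

# Build types and suites named in the description.
BUILDS = ["debug", "release"]
SUITES = ["js/src/tests", "js/src/jit-tests", "jsapi-tests"]

# JIT flag sets named in the description; the empty list stands for "defaults".
JIT_CONFIGS = [[], ["--no-baseline"], ["--baseline-eager"]]


def test_matrix():
    """Return every (build, suite, flags) combination as a list of dicts.

    Hypothetical helper: this only enumerates jobs, it does not run them.
    """
    return [
        {"build": build, "suite": suite, "flags": list(flags)}
        for build, suite, flags in product(BUILDS, SUITES, JIT_CONFIGS)
    ]


if __name__ == "__main__":
    for job in test_matrix():
        flags = " ".join(job["flags"]) or "(defaults)"
        print(f'{job["build"]:8} {job["suite"]:18} {flags}')
```

Even this small matrix multiplies out quickly, which matters on slow hardware such as a Raspberry Pi 3.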

(not currently active) Ted Mielczarek

Comment 1

•

7 years ago

This doesn't belong in the Build Config component, but I'm not sure where the right place to put it is, so I'm going to stick it in RelEng for now.

Any solution involving physical hardware is probably not going to be tractable--we had Pandaboards (TI ARM dev boards) racked in a datacenter years ago and they were quite the headache, and the useful lifespan of physical hardware just isn't that great. Even if we could acquire the several hundred to 1000+ machines we'd need to run tests that keep up with our CI volume, it would take us months to a year to get them installed and supported, and they'd almost certainly be obsolete within 2 years.

That being said, there are several providers offering ARMv8 cloud computing. These are two I've seen before:
https://www.scaleway.com/armv8-cloud-servers/
https://www.packet.net/bare-metal/servers/type-2a/

Those both use Cavium ThunderX CPUs:
https://cavium.com/product-thunderx-arm-processors.html

If that would work for JS testing purposes, then I think that's a thing we could do.

Component: Build Config → Platform Support

Product: Core → Release Engineering

QA Contact: catlee

Lars T Hansen [:lth]

Reporter

Comment 2

•

7 years ago

Anything that runs on real hardware is fine; virtualization should not in itself be a hindrance, only emulation on top of another architecture.

Chris Peterson [:cpeterson]

Updated

•

7 years ago

Whiteboard: [geckoview:crow]

Gregory Szorc [:gps]

Comment 3

•

7 years ago

This feels like it will need Taskcluster platform support. Over to Coop for triage.

Flags: needinfo?(coop)

Chris Peterson [:cpeterson]

Comment 4

•

7 years ago

I'm talking to jmaher about our options for aarch64 cloud testing or real devices.

Chris Cooper [:coop] (he/him)

Comment 5

•

7 years ago

We've had two new requests in the past week for packet.net capacity. cc-ing Jonas, who has been spearheading that effort.

I want to be clear that our work with packet.net is still very much at the prototype stage. We still need to design and implement docker-engine support for the tc-worker in order to run real workloads. We also don't have provisioner support for packet.net yet, so any worker pool would need to be provisioned statically.

As Joel (already cc-ed) will tell you, switching to a new machine or instance type is only one step in the process. If someone on the JS team is available to perform validation of the tests, the Taskcluster team (Jonas or someone else) can help you get set up with an instance or two to verify that your tests will, in fact, run in packet.net. From there, you can start getting a baseline of results for fixing/disabling specific failing test cases.

Flags: needinfo?(coop)

Lars T Hansen [:lth]

Reporter

Comment 6

•

7 years ago

I can certainly help out with anything from the JS team side, or corral suitable help for ditto.

Geoff Brown [:gbrown]

Updated

•

7 years ago

Comment 7

•

7 years ago

This sounds like you want per-task docker containers on ARM hardware. We don't have a docker-engine for tc-worker quite ready yet, but ckousik (awesome contributor) has been working on one, and wcosta and I plan to talk to him Friday and see if we can figure something out.

It's also possible that we can deploy docker-worker on packet, or that we simply use a tc-worker configuration without any task isolation. This assumes that you guys are happy with running a command within a docker image, having the task fail/succeed depending on exit code, and having logs uploaded, but otherwise with no or very limited support for artifacts.

We have no dynamic provisioning, but that might be okay, depending on the load. Similarly, we have no caching of artifacts in packet yet either, which might incur notable bandwidth cost if test tasks are heavily chunked, as we would be paying 0.12 USD/GB for download. (Note: when we first went multi-region in EC2, cross-region transfer at 0.02 USD/GB dominated our EC2 bill pretty quickly -- so this might be worth back-of-envelope math, just to be sure.)
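For what it's worth, the back-of-envelope math could look something like the sketch below. Only the two per-GB rates come from this comment; the task count and download size per task are hypothetical placeholders that would need to be replaced with real numbers from the JS test jobs.

```python
def monthly_transfer_cost_usd(tasks_per_day, gb_per_task, usd_per_gb, days=30):
    """Estimate monthly download cost when nothing is cached between tasks.

    Assumes every task downloads its full set of artifacts from scratch.
    """
    return tasks_per_day * gb_per_task * usd_per_gb * days


if __name__ == "__main__":
    # Hypothetical workload: NOT measured numbers, placeholders only.
    tasks_per_day = 500
    gb_per_task = 0.5

    # Rates quoted in this comment.
    packet_rate = 0.12      # USD/GB download on packet
    ec2_xregion_rate = 0.02  # USD/GB EC2 cross-region transfer

    for label, rate in [("packet", packet_rate), ("EC2 cross-region", ec2_xregion_rate)]:
        cost = monthly_transfer_cost_usd(tasks_per_day, gb_per_task, rate)
        print(f"{label}: about {cost:.0f} USD/month")
```

Since the packet rate is six times the EC2 cross-region rate, the same traffic costs six times as much, so heavy chunking without artifact caching is the thing to watch.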

Lars T Hansen [:lth]

Reporter

Updated

•

7 years ago

Depends on: 1440330

Lars T Hansen [:lth]

Reporter

Updated

•

7 years ago

No longer blocks: Rabaldr-ARM64

Nobody; OK to take it and work on it

Assignee

Updated

•

7 years ago

Component: Platform Support → Buildduty

Product: Release Engineering → Infrastructure & Operations

Chris Peterson [:cpeterson]

Comment 8

•

6 years ago

Lars, we are standing up jittests (bug 1475648) to run on ARM64 builds on Google Pixel 2 devices in Bitbar's device farm. Will that be adequate to address this request for testing on real ARM64 hardware?

Setting [geckoview:fxr:p2] because Firefox Reality 1.0 will not include ARM64 support.

Flags: needinfo?(lhansen)

Whiteboard: [geckoview:crow] → [geckoview:fxr:p2]

Chris Peterson [:cpeterson]

Updated

•

6 years ago

Comment 9

•

6 years ago

Testing on Pixel2 should be a major improvement over the current situation and it's a mainstream platform, so yes, I think that should satisfy the request for testing on real ARM64 hardware.

Flags: needinfo?(lhansen)

Bob Clary [:bc] (inactive)

Comment 10

•

6 years ago

The jit tests have been running on Android 8.0 Pixel2 AArch64 for mozilla-central opt as a tier-3 job for some time.

<https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&superseded=&tier=1,2,3&searchStr=Android,8.0,Pixel2,AArch64,opt,jit>

We have long standing failures in jit5,jit6,jit10 that haven't been addressed.

Can we call this resolved and move on?

Flags: needinfo?(lhansen)

Lars T Hansen [:lth]

Reporter

Comment 11

•

6 years ago

(In reply to Bob Clary [:bc:] from comment #10)
> Can we call this resolved and move on?

Works for me. Chris?

Flags: needinfo?(lhansen) → needinfo?(cpeterson)

Chris Peterson [:cpeterson]

Comment 12

•

6 years ago

(In reply to Bob Clary [:bc:] from comment #10)
> We have long standing failures in jit5,jit6,jit10 that haven't been
> addressed.
>
> Can we call this resolved and move on?

OK, since we have bug 1475648 on file for those jit5/6/10 test failures.

Status: NEW → RESOLVED

Closed: 6 years ago

Flags: needinfo?(cpeterson)

Resolution: --- → FIXED

Chris Peterson [:cpeterson]

Updated

•

6 years ago

Blocks: arm64-baseline-tests

Chris Peterson [:cpeterson]

Updated

•

6 years ago

Updated

•

5 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

Tracking

(Not tracked)

People

(Reporter: lth, Unassigned)

References

Details

(Whiteboard: [geckoview:fxr:p2])

Crash Data

Security

(public)
