Test Firefox with Rust beta and nightly toolchains

NEW
Unassigned

Status

()

Core
Build Config
6 months ago
4 days ago

People

(Reporter: rillian, Unassigned)

Tracking

(Depends on: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

6 months ago
Several times we've found a portability problem with one of our tier-1 platforms when we update to a new Rust stable release. Examples include rustbuild dropping -fPIC on i686-linux (bug 1336155, https://github.com/rust-lang/rust/pull/39523) and armv7-linux-androideabi requiring NEON (bug 1323773).

We should run some kind of periodic integration test against forthcoming releases (Rust nightly builds, and especially each beta release) to detect these problems sooner.
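
As a rough illustration of how such a test could tell which kind of toolchain it is running, here is a minimal sketch that classifies the first line of `rustc --version` output by release channel. The `Channel` enum and `channel_of` function are hypothetical names, not part of any existing Gecko or Rust tooling, and the parsing assumes the `rustc <version>[-beta.N|-nightly] (...)` shape seen in the version strings quoted in this bug.

```rust
/// Release channel of a toolchain (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum Channel {
    Stable,
    Beta,
    Nightly,
}

/// Classify a `rustc --version` line such as "rustc 1.17.0-beta.3".
/// Returns None when the line does not look like a rustc version.
fn channel_of(version_line: &str) -> Option<Channel> {
    // The second whitespace-separated token is the version itself.
    let version = version_line.split_whitespace().nth(1)?;
    let mut parts = version.splitn(2, '-');
    let release = parts.next()?;
    // The release part must be dotted decimal numbers, e.g. "1.17.0".
    let numeric = release
        .split('.')
        .all(|n| !n.is_empty() && n.chars().all(|c| c.is_ascii_digit()));
    if !numeric {
        return None;
    }
    match parts.next() {
        None => Some(Channel::Stable),
        Some(pre) if pre.starts_with("beta") => Some(Channel::Beta),
        Some(pre) if pre.starts_with("nightly") => Some(Channel::Nightly),
        Some(_) => None,
    }
}

fn main() {
    let line = "rustc 1.18.0-nightly (91ae22a01 2017-04-05)";
    println!("{:?}", channel_of(line));
}
```

A scheduled job could use a check like this to decide whether a failure should be reported against a beta (more urgent) or a nightly.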
(Reporter)

Updated

6 months ago
Blocks: 1135640
glandium's working on bug 1313111, which will make things like this a lot more tractable.
Comment hidden (mozreview-request)
(Reporter)

Comment 3

6 months ago
Linux32 builds fail on rust 1.16.0-beta.1 with the same -fPIC issue 1.15.0 had. This should be fixed in the next beta release.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ab06245bb10c
FWIW rust 1.16.0-beta.2 is out now, which should have the -fPIC issue fixed.
Comment hidden (mozreview-request)
(Reporter)

Comment 6

6 months ago
Thanks Alex. Confirmed 1.16.0-beta.2 passes our integration test. https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d7f94486456
Comment hidden (mozreview-request)
(Reporter)

Comment 8

6 months ago
1.16.0-beta.3 is also fine. https://treeherder.mozilla.org/#/jobs?repo=try&revision=c195c1feb15db85ac37abcf68646830f8059bfc0
Thanks for testing Ralph!
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Reporter)

Comment 14

5 months ago
Build remains green with 1.16.0-beta.4
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d399ba37eef6e29392e0297fad600ba4d8cd559e

Comment 15

5 months ago
Thanks for testing rillian!
(Reporter)

Comment 16

5 months ago
Otoh, we seem to have a problem with 1.17-nightly on MacOS:

> 15:26:55     INFO -  checking rustc version...
> 15:26:55     INFO -  DEBUG: Executing: `/builds/slave/try-m64-0000000000000000000000/build/src/rustc/bin/rustc --version --verbose`
> 15:26:55     INFO -  DEBUG: The command returned non-zero exit status -11.
> 15:26:55     INFO -  ERROR: Command `/builds/slave/try-m64-0000000000000000000000/build/src/rustc/bin/rustc --version --verbose` failed with exit status -11.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8de49f284c13a6d702ef60906752eb76b3505fa4

I see this problem with the March 12 build, and with a March 3 build from last week. Did anything change in the build environment between 1.16 and 1.17 which would prevent the binaries from running on macOS 10.7? I know you've been moving things from buildbot to docker and travis. Maybe a newer MACOSX_DEPLOYMENT_TARGET?
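
For reference, the exit status -11 in the log above most likely means the process was killed by signal 11 (SIGSEGV), assuming the configure step follows the Python subprocess convention of reporting signal deaths as negative return codes. The following is a minimal sketch of the same probe written in Rust, distinguishing a normal exit from death by signal. The `describe` helper and the `RUSTC` environment variable lookup are illustrative assumptions, not part of the Gecko build system, and the code is Unix-only.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::{Command, ExitStatus};

/// Describe how a child process ended (hypothetical helper).
fn describe(status: ExitStatus) -> String {
    match (status.code(), status.signal()) {
        (Some(0), _) => "ok".to_string(),
        (Some(code), _) => format!("exited with status {}", code),
        // No exit code but a signal number: the process was killed.
        (None, Some(sig)) => format!("killed by signal {}", sig),
        (None, None) => "unknown termination".to_string(),
    }
}

fn main() {
    // Assumption: the compiler to probe is named by RUSTC, else "rustc" on PATH.
    let rustc = std::env::var("RUSTC").unwrap_or_else(|_| "rustc".to_string());
    match Command::new(&rustc).args(&["--version", "--verbose"]).status() {
        Ok(status) => println!("{}: {}", rustc, describe(status)),
        Err(err) => println!("{}: could not be started: {}", rustc, err),
    }
}
```

A binary that runs but rejects its arguments would show up as a non-zero exit code here, whereas a binary that cannot run on the host OS at all would typically show up as a signal.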
Flags: needinfo?(acrichton)
Ah yes, I believe that in the migration to the new build system we forgot that env var. Here's an issue for it: https://github.com/rust-lang/rust/issues/40481

I'll try to have it fixed soon (and we'll need to backport this to the soon-to-be-released 1.17.0 beta).
Flags: needinfo?(acrichton)
(Reporter)

Comment 18

5 months ago
Thanks, Alex! I'll keep an eye on it.
Comment hidden (mozreview-request)
(Reporter)

Updated

5 months ago
Attachment #8846740 - Attachment is obsolete: true
(Reporter)

Updated

5 months ago
Attachment #8846741 - Attachment is obsolete: true
(Reporter)

Comment 20

5 months ago
rust 1.17.0-beta.2 is still failing like #16 on macOS, even though that build should have the fix from rust-lang/rust#40600.

Alex, any further ideas here? I thought I tested 1.17 nightly, but it's not written down here, so maybe not. Perhaps the change to `MACOSX_DEPLOYMENT_TARGET=10.8` (rust-lang/rust#40482) is insufficient? The failing builds are on macOS 10.7. I'll try 1.18-nightly to confirm.
Flags: needinfo?(acrichton)
Oh right I think Gecko *builds* on 10.7, right? As opposed to building on a newer platform and targeting 10.7?

If that's the case then we may be running out of luck unfortunately. LLVM no longer builds when using a newer toolchain (like the one we're using on Travis) with MACOSX_DEPLOYMENT_TARGET=10.7. I haven't looked into what would be necessary to rectify the situation, but can you confirm that you're attempting to run rustc on a 10.7 mac? (that at least I'm pretty sure is likely to fail)
Flags: needinfo?(acrichton)
That's correct - our OSX build machines are running 10.7.
(Reporter)

Comment 23

5 months ago
Thanks for confirming, Ryan. Our build machines are running macOS 10.7, but we *target* 10.9 since Firefox 49. So we need a toolchain which can run in the 10.7 build environment.

It may be we can race a gecko requirement for rust >= 1.17.0 with a ci upgrade. IIRC the build machines are on 10.7 because:

 - We needed to maintain a pool of 10.7 machines for the Firefox extended support releases, and there were concerns about splitting the available hardware pool.

 - We couldn't obtain resources to re-image them.

 - We want to transition to linux-hosted cross-compile builds for macOS targets.

The last extended support release targeting macOS 10.6 is Firefox 45.9 esr, scheduled for April 18th. It may be after that we can upgrade the build environment. I don't know the status of the taskcluster mac build promotion to tier-1. Amy, could you please comment on relative timelines for the three paths? It's been about a year since bug 1269798, and this will become a blocker for quantum in a few months.
Flags: needinfo?(arich)

Comment 24

5 months ago
>  - We want to transition to linux-hosted cross-compile builds for macOS
> targets.
> 

Currently we are investigating a performance regression using the TaskCluster cross compiled builds.  This investigation will help us decide if we can continue with our use of cross compiled builds in the near future or if we need to continue building on mac hardware.  ted and wcosta are digging into this.  By the end of this week we hope to have some better answers if we are going to be investing more effort into these builds or do we need to plan an interim solution of using mac builds.  I can update this bug once we have some more information (EOW).
Flags: needinfo?(arich)
See Also: → bug 1338651
(Reporter)

Comment 25

5 months ago
Great, thanks.
(Reporter)

Comment 26

5 months ago
Confirming that today's rustc 1.17.0-nightly (ccce2c6eb 2017-03-27) fails the same way, consistent with #21. https://treeherder.mozilla.org/#/jobs?repo=try&revision=3ff004ab7603ba623237dfaefa25836251c52006
I will resume investigation into why we compile rustc for 10.8 soon. Hopefully there's some escape hatch to use to avoid blocking on upgrading infrastructure.
Ok, I've done some more testing on our end. The state of play I've found is that we specifically cannot compile LLVM for the 10.7 target from the current OSX image that we're using. We are apparently not the first to have run into this issue either (https://github.com/JuliaLang/julia/issues/19762).

This is a regression on our end. We haven't changed LLVM versions in a long time, but we've changed infrastructure. It turns out that the Xcode version we're using generates this error, but *older* Xcode versions do not generate the error. Travis, where we build our releases, has the ability to switch Xcode versions so I tested out a few:

* Xcode 6.4 - CMake was too old to compile LLVM
* Xcode 7 - compiled LLVM successfully
* Xcode 8+ - failed to compile LLVM

So from our end we could fix this regression by switching to compiling the Rust compiler with Xcode 7. Unfortunately though this version comes with lldb 350 which means that we can't run any of our debuginfo/lldb tests. These would otherwise regress fairly often and are sometimes difficult to fix, so that's not something we'd like to do lightly just yet.

So from the rust-lang/rust end we have a few options to fix this regression:

* Switch to xcode 7, turn off our LLDB tests
* Compile releases with xcode 7, leave nightly on xcode 8
* Continue compiling releases with buildbot instead of Travis

To test the waters, what's the sense of urgency with fixing this regression? It sounds like it will specifically prevent Gecko from upgrading rustc version until it is fixed. We could fix it on our end with one of the above strategies (none of which are great unfortunately). If Gecko is soon to be cross-compiled however then this should become a non-issue anyway. 

Does that all make sense? Are there perhaps other opinions about how to best fix this?
I just talked with Brian and we believe we have a solution for this, I will send PRs to rust-lang/rust shortly
I'm hoping that https://github.com/rust-lang/rust/pull/40967 will solve this regression. If that lands I'll backport it to beta and we'll get a new beta out soon.
That change is now being backported to beta as well in https://github.com/rust-lang/rust/issues/40995. That doesn't bump the beta version just yet, but I suspect we will do so soon.
Ok 1.17.0 beta 3 is out, Ralph mind testing it to see if it works?
(Reporter)

Comment 33

5 months ago
Looks like 1.17.0-beta.3 restored macOS 10.7 support. Thanks!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=28bd7587a1bfd350b8d13718eb7f1d6057b88738
See Also: → bug 1321847
(Reporter)

Comment 34

5 months ago
rustc 1.18.0-nightly (91ae22a01 2017-04-05) also works on try.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=93d2c11642d6df87cae96f2362fc1d7962dd0780

Last week we had an issue with a unit test failure on 1.18 opt builds; we were checking for a null fn() passed over ffi, which the rust compiler now optimizes away. This was discussed in rust-lang/rust#40913 and bug 1351497. Lang-team consensus was that this was intended behaviour, and :kinetik fixed our code and binding generator to use `Option<fn>` with the nullptr-optimization instead. Thanks also to :kinetik for explaining to me why we want fn() to be non-null even in unsafe rust. :) Those fixes have landed in gecko now.
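
To illustrate the pattern described above, here is a minimal sketch of a nullable callback declared as an `Option` around an `extern "C"` function pointer. The `Callback` alias, `double` and `call_or` are hypothetical names for illustration only; this is not the actual Gecko code or the output of the binding generator.

```rust
use std::os::raw::c_int;

/// A nullable C callback. A bare `extern "C" fn` is assumed non-null by the
/// compiler, so nullability has to be expressed with `Option`.
type Callback = Option<extern "C" fn(c_int) -> c_int>;

extern "C" fn double(x: c_int) -> c_int {
    x * 2
}

/// Call the callback if one was supplied, otherwise return `fallback`.
fn call_or(cb: Callback, x: c_int, fallback: c_int) -> c_int {
    match cb {
        Some(f) => f(x),
        None => fallback,
    }
}

fn main() {
    // Null-pointer optimization: the Option adds no space, and None is
    // represented as a null pointer across the FFI boundary.
    assert_eq!(
        std::mem::size_of::<Callback>(),
        std::mem::size_of::<extern "C" fn(c_int) -> c_int>()
    );
    println!("{}", call_or(Some(double), 21, -1));
    println!("{}", call_or(None, 21, -1));
}
```

Because the `None` case is part of the type, a null check written this way cannot be optimized away, unlike a comparison of a plain `fn()` against null.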
See Also: → bug 1354994

Comment 35

4 months ago
(In reply to Greg Arndt [:garndt] from comment #24)
> >  - We want to transition to linux-hosted cross-compile builds for macOS
> > targets.
> > 
> 
> Currently we are investigating a performance regression using the
> TaskCluster cross compiled builds.  This investigation will help us decide
> if we can continue with our use of cross compiled builds in the near future
> or if we need to continue building on mac hardware.  ted and wcosta are
> digging into this.  By the end of this week we hope to have some better
> answers if we are going to be investing more effort into these builds or do
> we need to plan an interim solution of using mac builds.  I can update this
> bug once we have some more information (EOW).

We have suspended efforts of trying to improve the timings with the cross compiled build to stand up the buildbot builds and have them scheduled by taskcluster.  Once that's complete, we will spend some time trying to investigate the regressions because in the long term we want to use these builds.  I do not have an ETA of when that will be resolved.  We've investigated most things that we have thought of so far without any luck.  If you have any ideas, you can contact wcosta in #taskcluster.
(Reporter)

Comment 36

4 months ago
Thanks for the update Greg.

This week's rust nightly still green. https://treeherder.mozilla.org/#/jobs?repo=try&revision=febfddd1ebbb021bddd05e6437cdc313cf965556
Comment hidden (mozreview-request)
(Reporter)

Comment 38

4 months ago
1.18.0-beta.1 looks like it's ready to go. https://treeherder.mozilla.org/#/jobs?repo=try&revision=8f4f72d66442
Comment hidden (mozreview-request)
(Reporter)

Updated

3 months ago
Blocks: 1365300
(Reporter)

Comment 40

3 months ago
1.18.0-beta.2 fails building stylo's gecko bindings. I've filed rust-lang/rust#42042 for the ICE and bug 1365300 for stylo tracking and possible work-arounds. https://treeherder.mozilla.org/#/jobs?repo=try&revision=86ebd4aa3836&selectedJob=99289427
(Reporter)

Comment 41

3 months ago
rustc 1.19.0-nightly (75b056812 2017-05-15) hits rust-lang/rust#41620 in the stylo code (new warning deprecating float literals in match patterns) but it looks like Simon is aware of the issue, so I think this will be resolved without action on the gecko side (either 1.19 will drop the warning or servo will change their code).
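
For context, the warning concerns float literals used directly as `match` patterns. The following is a minimal sketch of one way to write such a match with guards instead, so that no float literal appears in pattern position. The function `describe_opacity` and its values are hypothetical; this is not the stylo code in question, nor necessarily the change servo would make.

```rust
/// Classify an opacity value using guards rather than float literal patterns.
fn describe_opacity(alpha: f32) -> &'static str {
    match alpha {
        // A pattern such as `0.0 => ...` would trigger the warning;
        // binding the value and comparing in a guard does not.
        a if a == 0.0 => "transparent",
        a if a == 1.0 => "opaque",
        _ => "translucent",
    }
}

fn main() {
    println!("{}", describe_opacity(0.5));
}
```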
(Reporter)

Updated

3 months ago
Depends on: 1367932
(Reporter)

Updated

3 months ago
Depends on: 1367934
(Reporter)

Comment 42

3 months ago
No issues with 1.18.0-beta.4.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1d298f8ecb132014fb181f0fea2540573733a6c2
(Reporter)

Comment 43

2 months ago
No issues with 1.19.0-beta.2.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=930b19545e03acca9fc0ba07d969d1d18ced815e
(Reporter)

Updated

2 months ago
Depends on: 1376010
Comment hidden (mozreview-request)
Also no issues with 1.19.0-beta.3 and beta.4.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7491ff82a99799c393d515d77f4037f7fc4409a8
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c32b7fa6ec659e7ba4ac7ff898b3095dae9e5d84
With 1.20.0-beta.1, we get a llvm-dsymutil crash:
https://public-artifacts.taskcluster.net/DPiAQgT5SNqfF2tlmkG5Lw/0/public/logs/live_backing.log
(In reply to Mike Hommey [:glandium] from comment #46)
> With 1.20.0-beta.1, we get a llvm-dsymutil crash:
> https://public-artifacts.taskcluster.net/DPiAQgT5SNqfF2tlmkG5Lw/0/public/
> logs/live_backing.log

This /could/ be bug 1381043
(In reply to Mike Hommey [:glandium] from comment #47)
> (In reply to Mike Hommey [:glandium] from comment #46)
> > With 1.20.0-beta.1, we get a llvm-dsymutil crash:
> > https://public-artifacts.taskcluster.net/DPiAQgT5SNqfF2tlmkG5Lw/0/public/
> > logs/live_backing.log
> 
> This /could/ be bug 1381043

It is.
(Reporter)

Updated

18 days ago
Depends on: 1386414
Comment hidden (mozreview-request)