Closed
Bug 1336155
Opened 8 years ago
Closed 8 years ago
Update builders to rust 1.15
Categories
(Firefox Build System :: General, defect)
Firefox Build System
General
Tracking
(firefox54 fixed)
RESOLVED
FIXED
mozilla54
Tracking | Status | |
---|---|---|
firefox54 | --- | fixed |
People
(Reporter: rillian, Assigned: rillian)
References
Details
Attachments
(3 files)
Rust 1.15.0-stable is released today. Update our integration build config to use it for Firefox 54.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 3•8 years ago
|
||
mozreview-review |
Comment on attachment 8833017 [details]
Bug 1336155 - Update rust repack script for 1.15.0 stable.
https://reviewboard.mozilla.org/r/109254/#review110400
Attachment #8833017 -
Flags: review?(mshal) → review+
Comment 4•8 years ago
|
||
mozreview-review |
Comment on attachment 8833018 [details]
Bug 1336155 - Update builders to rust 1.15.0 stable.
https://reviewboard.mozilla.org/r/109256/#review110402
These look fine to me, but it seems the linux32 builds don't agree. Any idea what's up with those?
Assignee | ||
Comment 5•8 years ago
|
||
The link is failing:
> /home/worker/workspace/build/src/gcc/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.4/../../../../x86_64-unknown-linux-gnu/bin/ld: read-only segment has dynamic relocations.
> collect2: error: ld returned 1 exit status
I have no insight beyond that. I can reproduce locally (with gold from fedora 25) so I don't think it's an issue with our packaged linker. Seems to be a regression from 1.14. Someday I'll get around to testing all platforms before release...
I suggest we just keep linux32 on rust 1.14 for now, removing it from the patch.
Comment 6•8 years ago
|
||
I suggest we try to find what's wrong first. In all likeliness, something is wrong in one of rust's static libraries. Push another try with -Wl,-z,text removed from LDFLAGS (it's added in old-configure.in and js/src/old-configure.in). It's going to be green, and will allow to look at the resulting libxul.so to see what ld is complaining about when -Wl,-z,text is passed. (LD sucks for not having more useful errors in that case)
Comment 7•8 years ago
|
||
Specifically, we're not in a rush to use this new rust release on the very day it's released, especially if we end up not using it on all platforms.
Comment 8•8 years ago
|
||
1.15 has stable procedural macros, which are the new and only way to derive impls for serde in its 0.9 version. My pull requests are pretty much ready to go for the serde bump. But serde_codegen doesn't exist anymore for serde 0.9, so to not block people working on WR with Gecko, we need Rust 1.15 here.
So it would be great if that bump could be done fast, I don't like serde bumps to linger for long.
Comment 9•8 years ago
|
||
(In reply to Anthony Ramine [:nox] from comment #8)
> 1.15 has stable procedural macros, which are the new and only way to derive
> impls for serde in its 0.9 version. My pull requests are pretty much ready
> to go for the serde bump. But serde_codegen doesn't exist anymore for serde
> 0.9, so to not block people working on WR with Gecko, we need Rust 1.15 here.
The suggestion in comment 5 wouldn't help you in any way.
Comment 10•8 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #9)
> The suggestion in comment 5 wouldn't help you in any way.
I'm pretty sure I didn't imply that.
Comment 11•8 years ago
|
||
FWIW I've never seen that before, much less anywhere in in the rust-lang/rust issue tracker. I'd be quite interested to learn about the fix though!
The biggest change I can think of is that we're compiling compilers with a new build system, but other than that no change leaps to mind in 1.15.0. Even that change itself I wouldn't know why would cause a bug such as this...
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 15•8 years ago
|
||
I've not sure what glandium was looking for in comment 6, but I pushed to try with the -Wl,-z,text ("Treat DT_TEXTREL in shared object as error.") option removed. The link succeeds, but the build subsequently fails with
> TEST-UNEXPECTED-FAIL | check_textrel | We do not want text relocations in libraries and programs
Which confirms it's a text segment relocation, but I don't see any information about which symbol has the relocation. I'll try running that check locally.
There's also an xpcshell crash which might tell us something. How do I put symbols to the stack dump in the log?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=563f566da23d&selectedJob=74379135
Comment 16•8 years ago
|
||
Gah forgot about that check. Remove it so that the build doesn't fail.
The xpcshell crash is a red herring. It's a normal occurrence, and the log viewer is just confused by it.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 19•8 years ago
|
||
For posterity, here's how you check those things:
- Get the address of the end of the executable segment(s)
e.g. readelf -lW libxul.so | awk '$1 == "LOAD" && $7 != "RW" {a=$3;b=$6}END{print strtonum(a)+strtonum(b)}'
in the build I looked at, it's 61803364 (decimal, 0x3af0b64 hex)
- Find all relocations in the executable segment(s)
e.g. readelf -r libxul.so | awk 'strtonum("0x" $1) <= 61803364 && NF >= 3 && NF < 6'
in the build I looked at, that returns the following:
02af13cf 00000008 R_386_RELATIVE
02af1505 00000008 R_386_RELATIVE
02af150f 00000008 R_386_RELATIVE
02af18cb 00000008 R_386_RELATIVE
02af19f9 00000008 R_386_RELATIVE
02af1a03 00000008 R_386_RELATIVE
02af1d0c 00000008 R_386_RELATIVE
02af2164 00000008 R_386_RELATIVE
02af112b 00035002 R_386_PC32 00000000 fmax@GLIBC_2.1
02af1133 00024602 R_386_PC32 00000000 logb@GLIBC_2.0
02af11a3 000d2102 R_386_PC32 00000000 scalbn@GLIBC_2.0
02af11b7 000d2102 R_386_PC32 00000000 scalbn@GLIBC_2.0
02af11f9 000d2102 R_386_PC32 00000000 scalbn@GLIBC_2.0
02af1225 000d2102 R_386_PC32 00000000 scalbn@GLIBC_2.0
02af165e 0004eb02 R_386_PC32 00000000 fmaxf@GLIBC_2.1
02af1666 00048002 R_386_PC32 00000000 logbf@GLIBC_2.0
02af16c8 00021002 R_386_PC32 00000000 scalbnf@GLIBC_2.0
02af16dc 00021002 R_386_PC32 00000000 scalbnf@GLIBC_2.0
02af171a 00021002 R_386_PC32 00000000 scalbnf@GLIBC_2.0
02af1742 00021002 R_386_PC32 00000000 scalbnf@GLIBC_2.0
The important part is really the first column. For each of those, we can use addr2line -e libxul.so 0xaddr to see what symbol the relocation is happening in. (That requires downloading the crashreporter-symbols-full.zip and getting libxul.so.dbg from there)
Unfortunately, that only gave ??:? results, which suggests we don't have debuginfo for whatever this is happening in.
Next step is to look at the symbols table:
e.g. readelf -sW firefox/libxul.so.dbg | sort -k 2
After correlating with the addresses from the relocations, the culprits are:
118343: 02af1f90 1105 FUNC LOCAL DEFAULT 12 __mulsc3
151843: 02af1640 986 FUNC LOCAL DEFAULT 12 __divsc3
240176: 02af1b30 1113 FUNC LOCAL DEFAULT 12 __muldc3
240995: 02af10e0 1117 FUNC LOCAL DEFAULT 12 __divdc3
Those are things that come from libgcc.a. If I look at current linux32 nightly, those symbols are also there too. But I'm pretty sure before we were using rust, we were pulling them from libgcc.so.
Looking at the rust release:
readelf -sW rust-1.15.0-i686-unknown-linux-gnu/rust-std-i686-unknown-linux-gnu/lib/rustlib/i686-unknown-linux-gnu/lib/libcompiler_builtins-b4ef8c9d93bc879a.rlib | grep '__\(div\|mul\)[ds]c3'
10: 00000000 1117 FUNC GLOBAL HIDDEN 4 __divdc3
10: 00000000 986 FUNC GLOBAL HIDDEN 4 __divsc3
10: 00000000 1113 FUNC GLOBAL HIDDEN 4 __muldc3
10: 00000000 1105 FUNC GLOBAL HIDDEN 4 __mulsc3
They were there in 1.14:
readelf -sW rust-1.14.0-i686-unknown-linux-gnu/rust-std-i686-unknown-linux-gnu/lib/rustlib/i686-unknown-linux-gnu/lib/libcompiler_builtins-f5a209a9.rlib | grep '__\(div\|mul\)[ds]c3'
21: 00000000 1189 FUNC GLOBAL HIDDEN 2 __divdc3
21: 00000000 1077 FUNC GLOBAL HIDDEN 2 __divsc3
20: 00000000 1129 FUNC GLOBAL HIDDEN 2 __muldc3
20: 00000000 1121 FUNC GLOBAL HIDDEN 2 __mulsc3
Note the size differences.
So, to me, there are two problems here:
- In all likeliness, the entire libcompiler_builtins library is not built with -fPIC in 1.15.
- It looks like libcompiler_builtins is essentially libgcc, and it feels like we should be using libgcc instead.
Comment 20•8 years ago
|
||
> But I'm pretty sure before we were using rust, we were pulling them from libgcc.so.
Confirmed with 45.7.0esr:
$ readelf --dyn-sym firefox/libxul.so | grep '__\(div\|mul\)[ds]c3'
383: 00000000 0 FUNC GLOBAL DEFAULT UND __muldc3@GCC_4.0.0 (23)
1545: 00000000 0 FUNC GLOBAL DEFAULT UND __divsc3@GCC_4.0.0 (23)
3588: 00000000 0 FUNC GLOBAL DEFAULT UND __divdc3@GCC_4.0.0 (23)
3622: 00000000 0 FUNC GLOBAL DEFAULT UND __mulsc3@GCC_4.0.0 (23)
Comment 21•8 years ago
|
||
Thanks for the investigation! Apparently me from a long time ago [1] thought that 32-bit didn't need -fPIC, but I can't remember why at this point. I've re-enabled this [2], published a new version, and send a PR upstream [3]. Once that lands we can backport to beta as well. We've also already got a 1.15.1 point release coming out soon, so if urgent we can put this in there as well.
[1]: https://github.com/alexcrichton/gcc-rs/commit/362bdf20
[2]: https://github.com/alexcrichton/gcc-rs/commit/7d37a9a6
[3]: https://github.com/rust-lang/rust/pull/39523
Comment 22•8 years ago
|
||
(In reply to Alex Crichton [:acrichto] from comment #21)
> We've also already got a 1.15.1 point release coming out soon,
> so if urgent we can put this in there as well.
Not exactly urgent but we’d like to upgrade Servo/Stylo/WebRender to serde 0.9 with Macros 1.1, and doing so without waiting for 1.16 (and of course without breaking Firefox) would be nice.
Comment 23•8 years ago
|
||
Can we do something on Firefox's end to use the system libgcc on Linux, instead of that rlib?
Flags: needinfo?(acrichton)
Comment 24•8 years ago
|
||
Unfortunately I don't think there's a way to say "don't link compiler-rt". In general that's not expected to work in all situations anyway, but it's likely to work for "standard" situations.
Flags: needinfo?(acrichton)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8833428 -
Flags: review?(mshal)
Assignee | ||
Comment 28•8 years ago
|
||
I'm doing a custom build of 1.15.0 with the -fPIC change applied. This version doesn't work because I forgot I can't use my normal build environment because of GLIBC version conflicts with the build machines. I'll redo in the docker container and update the patch again.
Comment 29•8 years ago
|
||
(In reply to Ralph Giles (:rillian) | needinfo me from comment #28)
> I'm doing a custom build of 1.15.0 with the -fPIC change applied. This
> version doesn't work because I forgot I can't use my normal build
> environment because of GLIBC version conflicts with the build machines. I'll
> redo in the docker container and update the patch again.
We should wait for 1.15.1, assuming it gets the fix.
Comment 30•8 years ago
|
||
1.15.1 will not have that fix. We learned about this right now, and the release is going out right now.
Comment 31•8 years ago
|
||
Note that currently we're on track to release 1.15.1 today and the PR to fix this change (https://github.com/rust-lang/rust/pull/39523) hasn't landed yet on master. We can likely consider a 1.15.2 release, but if Gecko's got custom builds locally anyway perhaps the patch could be applied locally for now? We can be sure to get this fix in for 1.16.0 regardless, though.
Comment 32•8 years ago
|
||
We're holding of on 1.15.1 to gather more information about this bug from glandium.
Comment 33•8 years ago
|
||
Mike, do you know why Firefox's build hits this but we haven't seen it elsewhere? Is there some linker setting we don't have coverage for?
Comment 34•8 years ago
|
||
I ask partially because I don't understand the likely impact on this bug for non-Firefoxes.
Assignee | ||
Comment 35•8 years ago
|
||
I'm not glandium, but per above, we pass -Wl,-z,text to the linker, which is "Treat DT_TEXTREL in shared object as error." Adding that to a test build would give you coverage. We also do a separate pass to check the linker output for the same issue. I don't understand the impact though.
Comment 36•8 years ago
|
||
(In reply to Brian Anderson from comment #33)
> Mike, do you know why Firefox's build hits this but we haven't seen it
> elsewhere? Is there some linker setting we don't have coverage for?
Firefox builds a staticlib with the Rust runtime and then links that staticlib into a dynamic library with `-Wl,-z,text`, which forbids text relocations. Are you not building something with `-Wl,-z,-text`?
Assignee | ||
Comment 37•8 years ago
|
||
Bug 388971 mentions problems with selinux. The Gentoo wiki cites load time. I guess it's both in that (a) doing the relocations every time a dynamic object is loaded takes time and (b) if the loader can't mark the pages read-only after applying the relocation table. FWIW. https://wiki.gentoo.org/wiki/Hardened/Position_Independent_Code_internals#Relocations_in_the_TEXT_segment_of_shared_libraries_used_by_dynamically_linked_executables
Assignee | ||
Comment 38•8 years ago
|
||
According to Ted, we originally added these checks because building without -fPIC on x86_64 caused breakage, and we didn't have test coverage on that platform at the time. Unlike x86_64, it's optional on i686-linux, in the sense that the loader will make it work, which explains how Rust's test coverage didn't find any issue. We kept the requirement in Firefox for the security, startup time, and run-time memory usage advantages.
Comment 39•8 years ago
|
||
Thanks for the clarifications rillian.
Comment 40•8 years ago
|
||
Comment on attachment 8833018 [details]
Bug 1336155 - Update builders to rust 1.15.0 stable.
Clearing review for now, since it sounds like we'll need 1.15.1 or later. Feel free to flag me again when it's ready.
Attachment #8833018 -
Flags: review?(mshal)
Comment 41•8 years ago
|
||
Actually, the linker won't create text relocations on x86_64, so those checks weren't added for that. They were added explicitly with x86 in mind.
For the rest, Ralph's comments are pretty much covering it. Another concern is memory usage, where text pages are dirtied by being relocated, and can't be shared across processes.
Comment 42•8 years ago
|
||
We're going to release a fix for this with 1.15.1, aiming for release tomorrow. Thanks for helping us sort it out, all.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8833018 -
Flags: review?(mshal)
Assignee | ||
Updated•8 years ago
|
Attachment #8833428 -
Flags: review?(mshal)
Assignee | ||
Comment 46•8 years ago
|
||
Comment on attachment 8833018 [details]
Bug 1336155 - Update builders to rust 1.15.0 stable.
Custom build works, so I'd like to go ahead and land it. Will bump to 1.15.1 when it's available.
Attachment #8833018 -
Flags: review?(mshal)
Comment 47•8 years ago
|
||
mozreview-review |
Comment on attachment 8833018 [details]
Bug 1336155 - Update builders to rust 1.15.0 stable.
https://reviewboard.mozilla.org/r/109256/#review112578
Seems reasonable to me.
Attachment #8833018 -
Flags: review?(mshal) → review+
Comment 48•8 years ago
|
||
mozreview-review |
Comment on attachment 8833428 [details]
Bug 1336155 - Update linux32 rust builders to use -fPIC.
https://reviewboard.mozilla.org/r/109658/#review112580
Attachment #8833428 -
Flags: review?(mshal) → review+
Comment 49•8 years ago
|
||
Rust 1.15.1 is out.
Comment 50•8 years ago
|
||
Pushed by rgiles@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/668b30061518
Update rust repack script for 1.15.0 stable. r=mshal
https://hg.mozilla.org/integration/autoland/rev/8f30dbd7f51c
Update builders to rust 1.15.0 stable. r=mshal
https://hg.mozilla.org/integration/autoland/rev/62adf4084bae
Update linux32 rust builders to use -fPIC. r=mshal
Comment 51•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/668b30061518
https://hg.mozilla.org/mozilla-central/rev/8f30dbd7f51c
https://hg.mozilla.org/mozilla-central/rev/62adf4084bae
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Updated•7 years ago
|
Product: Core → Firefox Build System
You need to log in
before you can comment on or make changes to this bug.
Description
•