Build Stylo for 32-bit Linux on automation

NEW
Assigned to

Status

()

Core
CSS Parsing and Computation
P2
normal
2 months ago
15 hours ago

People

(Reporter: froydnj, Assigned: shinglyu)

Tracking

(Blocks: 1 bug)

Trunk
All
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

(Reporter)

Description

2 months ago
Stylo for 32-bit Linux should Just Work in a normal cross-compile-from-linux64 situation, but with our automation, something is preventing clang from getting the correct headers in a cross-compile situation.  We should fix that if possible.
Blocks: 1356991
No longer blocks: 1243581
Priority: -- → P2
We aren't going to block Stylo built-but-disabled on linux32 support.
Blocks: 1330412
No longer blocks: 1356991
Blocks: 1345321, 1356991
No longer blocks: 1330412
The things using bindgen probably just need to call like `clang_arg("-m32")`:
https://docs.rs/bindgen/0.25.1/bindgen/struct.Builder.html#method.clang_arg

I poked around at bindgen/libclang recently. It would be nice if this was something bindgen could handle automatically, like if you could pass it the target and it could detect when you're targeting x86 but building on x86-64.
(Reporter)

Comment 3

2 months ago
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)
> The things using bindgen probably just need to call like `clang_arg("-m32")`:
> https://docs.rs/bindgen/0.25.1/bindgen/struct.Builder.html#method.clang_arg

We do this, and it still does not fix the problem.  clang is still picking up the 64-bit headers, rather than the 32-bit ones.
(Reporter)

Comment 4

2 months ago
(In reply to Nathan Froyd [:froydnj] from comment #3)
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #2)
> > The things using bindgen probably just need to call like `clang_arg("-m32")`:
> > https://docs.rs/bindgen/0.25.1/bindgen/struct.Builder.html#method.clang_arg
> 
> We do this, and it still does not fix the problem.  clang is still picking
> up the 64-bit headers, rather than the 32-bit ones.

I should note here that clang itself includes the necessary 64-bit headers, but it's only our gcc package that includes the 32-bit headers.  clang has some logic to figure out where the gcc include paths are if it's installed alongside gcc, but it's not finding the gcc include directories and/or not understanding that it needs to look in the 32-bit directories as well.
So, I bet this is because this line[1] doesn't pass our custom clang flags.

Shouldn't be easy to add support for it, should I go for it and see if it fixes the problem? A different alternative is telling bindgen to not do the path fixup and pass the paths manually, but that can be cumbersome.

[1]: https://github.com/KyleMayes/clang-sys/blob/master/src/support.rs#L183
Err, shouldn't be hard, I mean, ofc
(In reply to Emilio Cobos Álvarez [:emilio] from comment #5)
> So, I bet this is because this line[1] doesn't pass our custom clang flags.
> 
> Shouldn't be easy to add support for it, should I go for it and see if it
> fixes the problem? A different alternative is telling bindgen to not do the
> path fixup and pass the paths manually, but that can be cumbersome.

In the meantime, Shing is also looking into this issue.
That would be great if you could quickly verify your fix or Shing is able to do that as well.
Assigning to Shing because Astley said he would look into this bug.
Assignee: nobody → shing.lyu
(Reporter)

Comment 9

a month ago
(In reply to Emilio Cobos Álvarez [:emilio] from comment #5)
> So, I bet this is because this line[1] doesn't pass our custom clang flags.
>
> Shouldn't be easy to add support for it, should I go for it and see if it
> fixes the problem? A different alternative is telling bindgen to not do the
> path fixup and pass the paths manually, but that can be cumbersome.

Why do we even need to fixup include paths in the first place?  That seems like a bad idea at first glance.  And we pass -m32 to the clang instance anyway, which should find the appropriate include paths too...
(Assignee)

Comment 10

a month ago
Emilio, so you are suggesting we pass our custom clang argument at [1]?
And could you also point me to the include path fixup code in bindgen? Thank you.

[1]: https://github.com/KyleMayes/clang-sys/blob/master/src/support.rs#L183
Flags: needinfo?(emilio+bugs)
(In reply to Nathan Froyd [:froydnj] from comment #9)
> Why do we even need to fixup include paths in the first place?  That seems
> like a bad idea at first glance.

Well, yeah, I agree a priori it's not a great idea, but long story short, libclang doesn't properly find system headers without it. See [1] for example, where other alternatives are discussed.

> And we pass -m32 to the clang instance
> anyway, which should find the appropriate include paths too...

We pass -m32 to bindgen's libclang instantiation, but not to the clang-sys code. That'd be the missing piece to make this work AFAICT.

(In reply to Shing Lyu [:shinglyu] from comment #10)
> Emilio, so you are suggesting we pass our custom clang argument at [1]?

Yes, we need to pass the BindgenContext's clang_argument options to that line of code.

> And could you also point me to the include path fixup code in bindgen? Thank
> you.

The fixup lives pretty much here[2], but now I see the code, I see we avoid it for builds that use `--target`. Presumably we should do the same for other arch-dependent flags? Or just adding an option to bindgen to not mess with the include path may be enough? We already have in ServoBindings.toml a line that reads:

    # To disable the fixup bindgen applies which adds search
    # paths from clang command line in order to avoid potential
    # conflict with -stdlib=libc++.
    "--target=x86_64-apple-darwin",

For the Mac builds.

[1]: http://lists.llvm.org/pipermail/cfe-dev/2017-June/054088.html
[2]: https://github.com/servo/rust-bindgen/blob/cabfac71fc454196656ff63e5cc761294896831d/src/lib.rs#L1041
Flags: needinfo?(emilio+bugs)
Summary: enable Stylo for 32-bit Linux by default on automation → Build Stylo for 32-bit Linux on automation
No longer blocks: 1356991
Blocks: 1356991
No longer blocks: 1345321
(Assignee)

Comment 12

25 days ago
I passed the clang_args down but got this error: 

https://treeherder.mozilla.org/logviewer.html#?job_id=110482069&repo=try&lineNumber=8485

Looks like it's the problem with Servo's style code, is this a known issue?
Flags: needinfo?(emilio+bugs)
Looks like a bindgen bug (we should be generating nsTArray<nsString> instead.

Could you try to get a reduced version of it? It's trying to generate this function: http://searchfox.org/mozilla-central/rev/fdb811340ac4e6b93703d72ee99217f3b1d250ac/layout/style/ServoBindings.cpp#1517

Also, if you have a build locally and can attach the bindings, that'd be awesome.

Also, does it happen if you remove this line? http://searchfox.org/mozilla-central/rev/fdb811340ac4e6b93703d72ee99217f3b1d250ac/layout/style/ServoBindings.toml#482
Flags: needinfo?(emilio+bugs)
(Assignee)

Comment 14

25 days ago
Removing the line in ServoBindings.toml doesn't seem to fix it: https://treeherder.mozilla.org/logviewer.html#?job_id=110840620&repo=try&lineNumber=8815

Building locally is a little bit tricky because I can't faithfully reproduce the linux32 cross compiling environment easily.

How would you suggest we reduce the issue? Are there ways we can test it without compiling everything?

Thanks :)
Flags: needinfo?(emilio+bugs)
(Assignee)

Comment 15

25 days ago
More errors after I rebase to bindgen 0.26.1 https://treeherder.mozilla.org/logviewer.html#?job_id=110881911&repo=try&lineNumber=6981
(Assignee)

Comment 16

21 days ago
Update: 

Since my patch seems to fix the header issue, I landed the clang-sys patch [0]. KyleMayes kindly bump the version number to 0.19.0, so we are ready to land the bindgen change [1]. Then after we update bindgen on m-c everyone can jump in and debug the bindgen bug mentioned in Comment 15.

[0]: https://github.com/KyleMayes/clang-sys/pull/57
[1]: https://github.com/servo/rust-bindgen/pull/781
Emilio bumped rust-bindgen to version 0.26.3 [1] to include Shing's Linux32 fix. (btw, neither v0.26.3 nor v0.26.2 are listed in rust-bindgen's releases page [2].)

Emilio or Shing, can you please update Servo's crate dependencies to use the new rust-bindgen v0.26.3 version? Servo is currently using v0.26.1.

[1] https://github.com/servo/rust-bindgen/commit/1bf0b8a40bbca6ffdfb1cdd116a3f3fa0cdc8f10
[2] https://github.com/servo/rust-bindgen/releases
[3] https://github.com/servo/servo/commit/e8db09766dd70ec0562da3e954f65fb4311962b9
(Assignee)

Comment 18

15 days ago
Looks like is not released yet? 

error: no matching version `^0.26.3` found for package `bindgen` (required by `style`)
location searched: registry https://github.com/rust-lang/crates.io-index
versions found: 0.26.2, 0.26.1, 0.26.0, ...
(In reply to Shing Lyu [:shinglyu] from comment #18)
> Looks like is not released yet? 
> 
> error: no matching version `^0.26.3` found for package `bindgen` (required
> by `style`)
> location searched: registry https://github.com/rust-lang/crates.io-index
> versions found: 0.26.2, 0.26.1, 0.26.0, ...

Seems to now be present, looks like it was published 6 hours after your comment here.
(Assignee)

Comment 20

14 days ago
Hmm, we can't merge it yet, initially I believe the version bump will unblock the 32-bit header issue, and enable us to fix another bindgen issue on Linux32, keeping Linux64 unharmed. But it seems this version bump will also break Linux64 build in a different way: 
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e4c621093543f6a97fe2bad6ad97906797c36648

I'm investigating.
Flags: needinfo?(emilio+bugs)
(Assignee)

Comment 21

14 days ago
Created attachment 8885101 [details] [diff] [review]
bindings.rs.patch

This is a diff of the generated bindings.rs from bindgen 0.26.1 v.s. 0.26.3, It seems to generate a few functions but didn't include the `use` for them. For example there are many functions that takes `nsStringRepr_*` as parameters. 

Any idea on why 0.26.1 didn't include the function but 0.26.3 does?
Flags: needinfo?(emilio+bugs)
So, there aren't many commits in that version range:

 * https://github.com/servo/rust-bindgen/compare/v0.26.1...v0.26.3

The new functions generated seem C++ methods that end up being marked as children of some module and whitelisted. I suspect a lot from:

 * https://github.com/servo/rust-bindgen/commit/979f5aae9a078a54aec18d4a6cfe38f34322f534

Could you try reverting that and see if that stops generating those functions?

I'm less sure about the mishandling of nsTArray, but could either be that or https://github.com/servo/rust-bindgen/commit/d9ede218489e0bf5330d0005676f7687025e44c8, I guess... Any chance you could bisect it?

Thanks!
Flags: needinfo?(emilio+bugs)
(Assignee)

Comment 23

13 days ago
I reverted back to 0.26.1 the error goes away, everything compiles OK.

I'll try to bisect the rust-bindgen code, it might take a while. :)
(Assignee)

Comment 24

13 days ago
first BAD  979f5aa Ensure that every item is in some module's children list
last GOOD  26094ea Allow path separators in test-one.sh

Looks like this is the one causing the problem: 

commit 979f5aae9a078a54aec18d4a6cfe38f34322f534
Author: Nick Fitzgerald <fitzgen@gmail.com>
Date:   Mon Jun 19 14:01:46 2017 -0700

    Ensure that every item is in some module's children list
Flags: needinfo?(emilio+bugs)
Thanks for looking Shing!

Nick, could you investigate this one if you have the time?

Let me know if not, I can try to take a closer look, but if it was blocking SM the trivial way to fix it by backout may not be the optimal :P
Flags: needinfo?(emilio+bugs) → needinfo?(nfitzgerald)
To be clear, the issue is that we're getting compilation errors in the emitted bindings like

> error[E0412]: cannot find type `nsPresContext` in this scope
>      --> /home/worker/workspace/build/src/obj-firefox/toolkit/library/x86_64-unknown-linux-gnu/debug/build/style-a3e69c1fe576a835/out/gecko/bindings.rs:20750:54
>       |
> 20750 |                                     aContext: *const nsPresContext);
>       |                                                      ^^^^^^^^^^^^^ not found in this scope
>       |
>       = help: possible candidate is found in another module, you can import it into scope:
>                 `use gecko_bindings::structs::root::nsPresContext;`

after updating to the latest bindgen release?

A little bit of an aside, but it would be really helpful if Stylo generated self-contained, standalone bindings, instead of generating structs and functions separately with one depending on the other. Then, we could have the compile-stylo-bindings smoke test in bindgen's test suite not just test that bindings can be emitted, but also that the emitted bindings compile and pass the generated layout tests. If we could do all that, then we wouldn't regress Stylo bindings generation with new bindgen releases in the first place. It would also make tracking down and fixing Stylo + bindgen bugunderstands easier because we could properly use C-Reduce, which is difficult to do with the Stylo bindings right now. I'm able to diagnose and fix SpiderMonkey bindings issues *much* faster than Stylo bindings issues, because the SpiderMonkey bindings are standalone and I can easily use C-Reduce on them.

I empathize with the desire to only emit bindings for precisely the types and functions required, but that needs to be weighed against the above drag on productivity. Finally, we can and should also add more options/features to bindgen to cut down on emitting bindings to things that weren't intended, so that this isn't a good excuse not to have standalone bindings for Stylo.

(In reply to Emilio Cobos Álvarez [:emilio] from comment #25)
> if it was blocking SM the trivial way to fix it by backout may not be
> the optimal :P

That commit fixed bogus codegen that resulted in failure to compile the emitted bindings. We were running into this with SM. Definitely not something that would be wise to back out.
Flags: needinfo?(nfitzgerald)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #26)
> A little bit of an aside, but it would be really helpful if Stylo generated
> self-contained, standalone bindings, instead of generating structs and
> functions separately with one depending on the other. Then, we could have
> the compile-stylo-bindings smoke test in bindgen's test suite not just test
> that bindings can be emitted, but also that the emitted bindings compile and
> pass the generated layout tests. If we could do all that, then we wouldn't
> regress Stylo bindings generation with new bindgen releases in the first
> place. It would also make tracking down and fixing Stylo + bindgen
> bugunderstands easier because we could properly use C-Reduce, which is
> difficult to do with the Stylo bindings right now. I'm able to diagnose and
> fix SpiderMonkey bindings issues *much* faster than Stylo bindings issues,
> because the SpiderMonkey bindings are standalone and I can easily use
> C-Reduce on them.

Yeah, this way updates to the generated bindings are less noisy though... I'd be super-happy to get rid of the bindings.rs file when we don't have to upstream them to servo.

> I empathize with the desire to only emit bindings for precisely the types
> and functions required, but that needs to be weighed against the above drag
> on productivity. Finally, we can and should also add more options/features
> to bindgen to cut down on emitting bindings to things that weren't intended,
> so that this isn't a good excuse not to have standalone bindings for Stylo.

So there are two problems here, first is the extra methods generated, which is probably workaroundable/acceptable.

But there's also the template parameter of nsTArray issue, which seems way more serious. In particular, that commit apparently switched us from generating nsTArray<Foo> to nsTArray in the functions, which I'm still not quite sure why, and requires debugging.

> (In reply to Emilio Cobos Álvarez [:emilio] from comment #25)
> > if it was blocking SM the trivial way to fix it by backout may not be
> > the optimal :P
> 
> That commit fixed bogus codegen that resulted in failure to compile the
> emitted bindings. We were running into this with SM. Definitely not
> something that would be wise to back out.

Sure, I agree, but the nsTArray regression also seems scary.
(In reply to Shing Lyu [:shinglyu] from comment #24)
> last GOOD  26094ea Allow path separators in test-one.sh

I'm seeing `nsTArray` without template parameters even with this commit.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #28)
> (In reply to Shing Lyu [:shinglyu] from comment #24)
> > last GOOD  26094ea Allow path separators in test-one.sh
> 
> I'm seeing `nsTArray` without template parameters even with this commit.

Nevermind, it turns out that `clang++ -save-temps` has two issues that just bit me:

1. It doesn't include headers that have been included on the command line like `-include` into the preprocessed .ii file. I've filed https://github.com/servo/rust-bindgen/issues/811 as a place to figure out how to improve the "dump the preprocessed input that bindgen is working with" workflow

2. It strips comments, so any `<div rust-bindgen replaces="..."/>` annotations get lost. nsTArray_Simple replaces nsTArray in this fashion, so I just wasn't performing an equivalent bindgen run.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #29)
> (In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #28)
> > (In reply to Shing Lyu [:shinglyu] from comment #24)
> > > last GOOD  26094ea Allow path separators in test-one.sh
> > 
> > I'm seeing `nsTArray` without template parameters even with this commit.
> 
> Nevermind, it turns out that `clang++ -save-temps` has two issues that just
> bit me:
> 
> 1. It doesn't include headers that have been included on the command line
> like `-include` into the preprocessed .ii file. I've filed
> https://github.com/servo/rust-bindgen/issues/811 as a place to figure out
> how to improve the "dump the preprocessed input that bindgen is working
> with" workflow

This is now fixed.

> 2. It strips comments, so any `<div rust-bindgen replaces="..."/>`
> annotations get lost. nsTArray_Simple replaces nsTArray in this fashion, so
> I just wasn't performing an equivalent bindgen run.

This is fixed in https://github.com/servo/rust-bindgen/pull/817

We can now call `bindgen::Builder::dump_preprocessed_input` to get a single C++ file that contains everything we are generating bindings for, with the source annotations, and without any ifdefs or includes.

It might make sense to call `bindgen::Builder::dump_preprocessed_input` in taskcluster stylo builds so that if/when we hit a bindgen bug, we already have a standalone test case to investigate with, feed into C-Reduce, etc.
(Assignee)

Comment 31

7 days ago
I'll give it a try as soon as it lands. 

> It might make sense to call `bindgen::Builder::dump_preprocessed_input` in taskcluster stylo builds so that if/when we hit a bindgen bug, we already have a standalone test case to investigate with, feed into C-Reduce, etc.

Can you elaborate on this? So the function will dump into a C++ file, and we need to add it to the artifact list? Or does it trigger some form of unit test?
It dumps a `__bindgen.i` or `__bindgen.ii` file (for C or C++ respectively) that contains the complete input that bindgen is processing.

We would add it to the artifacts list.

Then, when we get bindgen bugs in CI, anyone should be able to simply download that file and run bindgen on it with the correct flags to reproduce the bug.

Perhaps we also want to add `println!("{:?}", builder.command_line_flags());` to stylo's build script, which dumps the equivalent command line flags for a bindgen builder, if we're going to go down this route.
(Assignee)

Comment 33

5 days ago
I still got the extra methods and nsTArray errors while building stylo, what else do we need to do to fix this?
Flags: needinfo?(nfitzgerald)
There is ongoing work in https://github.com/servo/rust-bindgen/pull/827 and https://github.com/servo/rust-bindgen/issues/826 for the the unwanted methods.

The nsTarray errors are regarding self_type, right? That has a PR that is r+ with comments (https://github.com/servo/rust-bindgen/pull/823) and which I am landing in a minute.
Flags: needinfo?(nfitzgerald)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #34)
> There is ongoing work in https://github.com/servo/rust-bindgen/pull/827 and
> https://github.com/servo/rust-bindgen/issues/826 for the the unwanted
> methods.
> 
> The nsTarray errors are regarding self_type, right? That has a PR that is r+
> with comments (https://github.com/servo/rust-bindgen/pull/823) and which I
> am landing in a minute.

All of these things have merged.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #34)
> There is ongoing work in https://github.com/servo/rust-bindgen/pull/827 and
> https://github.com/servo/rust-bindgen/issues/826 for the the unwanted
> methods.
> 
> The nsTarray errors are regarding self_type, right? That has a PR that is r+
> with comments (https://github.com/servo/rust-bindgen/pull/823) and which I
> am landing in a minute.

FWIW, the nsTArray errors weren't re. self_type. We were just failing to generate the template specialization in function arguments, for some reason... But I guess it's worth a try with the new fixes, thanks for working on them btw!
(Assignee)

Comment 37

15 hours ago
Created attachment 8889250 [details]
2017-07-24-build-error.log

I built stylo with the latest bindgen, but got the attached error. There seems to be some struct/trait missing in gecko_bindings. And the nsTArray issue is still there. 

(There are some clang path config issue on try, so no try build link for now.)
You need to log in before you can comment on or make changes to this bug.