Closed Bug 1588710 Opened 6 years ago Closed 5 years ago

Enable Stack clash protection on supported OS / arch

Tracking

(relnote-firefox -, firefox86 fixed)

Status:

RESOLVED FIXED

Milestone:

86 Branch

Tracking Flags:

Tracking

Status

relnote-firefox

---

firefox86

---

fixed

People

(Reporter: Sylvestre, Assigned: Sylvestre)

References

Details

(Keywords: sec-want, Whiteboard: [adv-main86-])

Attachments

(6 files, 1 obsolete file)

image.png 6 years ago Sylvestre Ledru [:Sylvestre] 352.65 KB, image/png		Details
Bug 1588710 - Linux/Android: enable -fstack-clash-protection r?#build,tjr 5 years ago Sylvestre Ledru [:Sylvestre] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1588710 - clang stack clash - Cherry-pick an upstream fix r?#build,tjr 5 years ago Sylvestre Ledru [:Sylvestre] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1588710 - Do not fail on stack protector on some asm chromium & breakpad sandboxing code r?#build 5 years ago Sylvestre Ledru [:Sylvestre] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1588710 - Try to use clang 11 for stackwalk r?#build 5 years ago Sylvestre Ledru [:Sylvestre] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1588710 - Add missing upstream llvm patch to fix alloc(0) r?#build 5 years ago Sylvestre Ledru [:Sylvestre] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1588710 - stack-clash-protection: enable when clang 11.0.1 is used 5 years ago Sylvestre Ledru [:Sylvestre] 47 bytes, text/x-phabricator-request		Details \| Review

Sylvestre Ledru [:Sylvestre]

Assignee

Description

•

6 years ago

This is a feature provided by gcc for a few years now.
clang doesn't have it and it has been marked as a blocker for redhat/fedora to use clang to build Firefox.

Serge started the work on the clang side to implement it:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/135786.html
https://reviews.llvm.org/D68720

He reached out to me to evaluate the change for Firefox performances.

Opening a bug for transparency and sharing results

If there isn't performance regression, we might want to add this to our official builds.

Tom Ritter [:tjr]

Comment 1

•

6 years ago

Is it possible for him to make a backport of the patch to the clang 8 branch? That would be easiest for us to apply and test.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 2

•

6 years ago

We moved to clang 9 and Serge shared with me a backported patch
https://hg.mozilla.org/try/rev/37fe420914ca37a20a75dd8ba42089579fa72b5a
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1db6f0e4ceca73e401ca60d0e93ad29cbe1c3949&selectedJob=271266815

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 3

•

6 years ago

The linux pgo build with the option:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3850d59d0d33e2ca0c746c54f059304306368972

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 4

•

6 years ago

With the perf results:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=3850d59d0d33e2ca0c746c54f059304306368972&framework=1&selectedTimeRange=172800

I do see an improvement from 3 to 4% on Linux.
I am a bit confused to see Windows improvement as I didn't touch it (the stack clash patch is only applied on clang Linux) but the confidence is smaller.
Anyway, it matches what Serge told me. He saw improvements on some projects (and no regression yet).
I would however appreciate a confirmation by an actual perf expert.

So, if this is the case, once landed, we could even take the patch to clang 9 to avoid having to wait a few more months (clang 10 release)

Tom Ritter [:tjr]

Comment 5

•

6 years ago

:jmaher is who I usually go to for such requests, but he is on leave.

Would Serge be open to improving his patch to include Windows? Or at least documenting what would be needed to do so?

Keywords: sec-want

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 6

•

6 years ago

Comparing try with try isn't giving the same kind of results (slower)
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&newProject=try&newRevision=3850d59d0d33e2ca0c746c54f059304306368972&framework=1&selectedTimeRange=172800

I am a bit lost on how to understand these results.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 7

•

6 years ago

On dmajor suggestion, I reused 1db6f0e4ceca73e401ca60d0e93ad29cbe1c3949 as base line (the clang patch but not enabled in our builds) and retriggered the talos jobs

Here is the comparison:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1db6f0e4ceca73e401ca60d0e93ad29cbe1c3949&newProject=try&newRevision=3850d59d0d33e2ca0c746c54f059304306368972&framework=1

Seems that there isn't any significant improvement or regression after all. Except maybe a regression in "about_preferences_basic opt e10s stylo" on Linux opt (not pgo)

(Away)

Comment 8

•

6 years ago

Please also run raptor tests, at least speedometer. Should we also test this on Windows builds?

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 9

•

6 years ago

Serge told me that it should not have impact on Windows (and not needed there)

sguelton

Comment 10

•

6 years ago

@dmajor, @tjr : The windows ABI already requires a touch on each stack page, so Windows shouldn't be affected, (not the case of the patch you're testing with, but should be the case in the near future). Basically it means that clang-cl will support -fstack-clash-protection, but it should be a no-op for Windows.

@sylvestre: thanks for the feedback. Having firefox passing its validation with this patch is very good news. I've made several significant changes to the original patch, so an extra round may be needed, keep in touch.

@all: once the patch lands in llvm's master, I'll provide a backport if you want.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 11

•

6 years ago

No diff with raptor/speedometer
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1db6f0e4ceca73e401ca60d0e93ad29cbe1c3949&newProject=try&newRevision=3850d59d0d33e2ca0c746c54f059304306368972&framework=10

(thanks again david ;)

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 12

•

6 years ago

Serge updated the patch and provided me a clang-9 patch.
Trying it here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e38b4343b7b64bdea67a6adca12844631357280
https://hg.mozilla.org/try/rev/bec46adab294c7b1547d447ac5f8d15b34b87fd0

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 13

•

6 years ago

Seems that there isn't any performance change (but I might be wrong as I am no expert in perfherder)

Sylvestre Ledru [:Sylvestre]

Assignee

Updated

•

6 years ago

Assignee: nobody → sledru

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 14

•

6 years ago

•

Edited

Dave Hunt told me the following:

If so, this suggests that your patch has caused a high confidence regression to the kraken results across all platforms. The kraken benchmark measures JavaScript performance, details can be found at https://wiki.mozilla.org/TestEngineering/Performance/Talos/Tests#kraken. There also appears to be a high confidence improvement to tabpaint across all platforms, which is documented here: https://wiki.mozilla.org/TestEngineering/Performance/Talos/Tests#tabpaint. There are a few other medium to high confidence improvements and regressions.

I see references to linux64, macos64, and win64 in your patch here https://hg.mozilla.org/try/rev/408e7270af623961e8cb466aa31b18248aa3e512 could these explain the cross-platform impact? Were the tasks executed around the same time? The reason I ask is that if the base tasks were triggered first, and the new triggered much later, perhaps an infrastructure change could have been introduced in the meantime.

Serge told me that he will have a look why Windows performances are impacted

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 15

•

6 years ago

After discussing with dmajor, he told me that I have been probably impacted by another unrelated issue.

So, comparing with and without the option activated, I saw that:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=11d4b305e91184937dc2a2f3db7974653faabf87&newProject=try&newRevision=90d5778298672bc8e1b7f961606225ca9a867330&framework=1

Seems that we are only regressing on Windows 7:
tp5n main_startup_fileio opt e10s stylo
tp5n time_to_session_store_window_restored_ms opt e10s stylo

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 16

•

6 years ago

Attached image image.png — Details

for legacy

(Away)

Comment 17

•

6 years ago

Seems that we are only regressing on Windows 7:
tp5n main_startup_fileio opt e10s stylo
tp5n time_to_session_store_window_restored_ms opt e10s stylo

I'm not worried about these two, historically they haven't been super reliable for me.

sguelton

Comment 18

•

6 years ago

@sylvestre: I've explicitly disabled the flag for non-Linux targets, users receive a warning if they try to use it for non-Linux, non-x86:x64 platform. I've also setup an extra validation step for the review, it should be fine : https://github.com/serge-sans-paille/llvm-project/pull/3/checks

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 19

•

6 years ago

Serge provided me with a new patch to skip the execution of this code on Windows & Mac.

I just started some talos job;

without the option enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b67d19f2d016cfd19e2854fdd2334c7540a96b31
(just clang + a normal fx build)
https://hg.mozilla.org/try/rev/41b41afc9bc18557a86db05becdb0f6b6f154a10
with the option activated:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a405edf572eb19ba33b829bb691a3a3095e68cc6
https://hg.mozilla.org/try/rev/cc533026241361425a0991c01e4bff3bb4239605

I kept the Windows & Mac part to make sure that it behaves correctly

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 20

•

6 years ago

Looking at the results ( https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=a405edf572eb19ba33b829bb691a3a3095e68cc6&newProject=try&newRevision=b67d19f2d016cfd19e2854fdd2334c7540a96b31&framework=1 ) . I don't seen any regression on Windows or Mac anymore

Just a small regression with displaylist_mutate opt e10s stylo on Linux 64 (-1.2%)

(Away)

Comment 21

•

6 years ago

(In reply to Sylvestre Ledru [:Sylvestre] from comment #20)

Looking at the results ( https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=a405edf572eb19ba33b829bb691a3a3095e68cc6&newProject=try&newRevision=b67d19f2d016cfd19e2854fdd2334c7540a96b31&framework=1 ) . I don't seen any regression on Windows or Mac anymore

Just a small regression with displaylist_mutate opt e10s stylo on Linux 64 (-1.2%)

It looks like there are only five runs before and after. At that level, we can't draw a conclusion because the tests have too much variance. Also there aren't any raptor runs in there. Depending on how intense you want to be about this experiment, I would say we need more data. But if I understand correctly this is sort of just a casual run, so maybe this combined with the previous testing is sufficient.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 22

•

5 years ago

Attached file Bug 1588710 - Linux/Android: enable -fstack-clash-protection r?#build,tjr — Details

Not sure we want to enable it as it might have perf impact

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 23

•

5 years ago

Serge told me that we would need:
https://github.com/llvm/llvm-project/commit/0f60bcc36c34522618bd1425a45f8c6006568fb6

sguelton

Comment 24

•

5 years ago

(In reply to Sylvestre Ledru [:Sylvestre] from comment #23)

Serge told me that we would need:
https://github.com/llvm/llvm-project/commit/0f60bcc36c34522618bd1425a45f8c6006568fb6

For the record: that patch fixes probing for dynamic alloca. If firefox is not using VLA or explicit call to alloca (or any other construct that generates dynamic alloca generation (?)), this patch should have no impact.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 25

•

5 years ago

Attached file Bug 1588710 - clang stack clash - Cherry-pick an upstream fix r?#build,tjr — Details

See https://bugs.llvm.org/show_bug.cgi?id=48007#c6

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 26

•

5 years ago

Attached file Bug 1588710 - Do not fail on stack protector on some asm chromium & breakpad sandboxing code r?#build — Details

caused:
[task 2020-11-05T10:14:26.012Z] 10:14:26 INFO - In file included from Unified_cpp_sandbox_linux2.cpp:137:
[task 2020-11-05T10:14:26.012Z] 10:14:26 ERROR - /builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/linux/seccomp-bpf/syscall.cc:369:3: error: Unable to protect inline asm that clobbers stack pointer against stack clash [-Werror,-Wstack-protector]
[task 2020-11-05T10:14:26.012Z] 10:14:26 INFO - asm volatile(
[task 2020-11-05T10:14:26.013Z] 10:14:26 INFO - ^
[task 2020-11-05T10:14:26.013Z] 10:14:26 INFO - 1 error generated.

Depends on D97566

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 27

•

5 years ago

Attached file Bug 1588710 - Try to use clang 11 for stackwalk r?#build — Details

Depends on D97567

Phabricator Automation

Updated

•

5 years ago

Attachment #9185941 - Attachment description: Bug 1588710 - Linux: enable -fstack-clash-protection → Bug 1588710 - Linux: enable -fstack-clash-protection r?#build,tjr

Phabricator Automation

Updated

•

5 years ago

Attachment #9188818 - Attachment description: Bug 1588710 - clang stack clash - Cherry-pick an upstream fix #build,tjr,sguelton → Bug 1588710 - clang stack clash - Cherry-pick an upstream fix

Phabricator Automation

Updated

•

5 years ago

Attachment #9188818 - Attachment description: Bug 1588710 - clang stack clash - Cherry-pick an upstream fix → Bug 1588710 - clang stack clash - Cherry-pick an upstream fix r?#build,tjr

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 28

•

5 years ago

AFAIK, no perf regression:
https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=a47c98b909b61035dae2e1e00883f2ade0fef129&newProject=try&newRevision=62108fa48bd15fe01f1a0f1ffab133af9b4207cc&framework=13

For now, enabling it when built with clang >= 11.0.0 on x86, x86_64, s390x & ppc64 on non Windows.
According to the clang author (Serge), it is built-in on Windows

Summary: Evaluate the performance impact of the Stack clash protection → Enable Stack clash protection on support OS / arch

Sylvestre Ledru [:Sylvestre]

Assignee

Updated

•

5 years ago

Summary: Enable Stack clash protection on support OS / arch → Enable Stack clash protection on supported OS / arch

(Away)

Comment 29

•

5 years ago

(In reply to Sylvestre Ledru [:Sylvestre] from comment #28)

AFAIK, no perf regression:

I added a few more test suites that usually give me trouble.

The webaudio test, which is very sensitive to c++ changes, does take a couple-percent hit in vanilla builds: https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=62108fa48bd15fe01f1a0f1ffab133af9b4207cc&originalSignature=1726480&newSignature=1726480&framework=10&originalRevision=a47c98b909b61035dae2e1e00883f2ade0fef129

But in PGO builds, which are what really matter, it's fine: https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=62108fa48bd15fe01f1a0f1ffab133af9b4207cc&originalSignature=1904137&newSignature=1904137&framework=10&originalRevision=a47c98b909b61035dae2e1e00883f2ade0fef129

Phabricator Automation

Updated

•

5 years ago

Attachment #9188820 - Attachment description: Bug 1588710 - Disable stack protector on some asm chromium & breakpad sandboxing code r?#build → Bug 1588710 - Do not fail on stack protector on some asm chromium & breakpad sandboxing code r?#build

Pulsebot

Comment 30

•

5 years ago

Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/aaa9af4e5eba Do not fail on stack protector on some asm chromium & breakpad sandboxing code r=mhentges

Sylvestre Ledru [:Sylvestre]

Assignee

Updated

•

5 years ago

Keywords: leave-open

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 31

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/aaa9af4e5eba

Pulsebot

Comment 32

•

5 years ago

Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/31f2b92312ab Try to use clang 11 for stackwalk r=firefox-build-system-reviewers,andi,mhentges

Razvan Maries

Comment 33

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/31f2b92312ab

Phabricator Automation

Updated

•

5 years ago

Attachment #9188818 - Attachment description: Bug 1588710 - clang stack clash - Cherry-pick an upstream fix r?#build,tjr → Bug 1588710 - clang stack clash - Cherry-pick an upstream fixes r?#build,tjr

Phabricator Automation

Updated

•

5 years ago

Attachment #9185941 - Attachment description: Bug 1588710 - Linux: enable -fstack-clash-protection r?#build,tjr → Bug 1588710 - Linux/Android: enable -fstack-clash-protection r?#build,tjr

Phabricator Automation

Updated

•

5 years ago

Attachment #9188818 - Attachment description: Bug 1588710 - clang stack clash - Cherry-pick an upstream fixes r?#build,tjr → Bug 1588710 - clang stack clash - Cherry-pick an upstream fix r?#build,tjr

Pulsebot

Comment 34

•

5 years ago

Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5f196172bb22 clang stack clash - Cherry-pick an upstream fix r=firefox-build-system-reviewers,mhentges,dmajor

Cosmin Sabou [:CosminS]

Comment 35

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/5f196172bb22

Pulsebot

Comment 36

•

5 years ago

Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/abafe6c923eb Linux/Android: enable -fstack-clash-protection r=tjr,firefox-build-system-reviewers,glandium

Andreea Pavel [:apavel]

Comment 37

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/abafe6c923eb

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 38

•

5 years ago

The currently nightly has it \o/
Tom, As I was saying, we are going to write a blog post with Serge about it.

Do you think it should be mentioned in the release notes? I know it isn't much but some press like this kind security improvements.

Status: NEW → RESOLVED

Closed: 5 years ago

Flags: needinfo?(tom)

Resolution: --- → FIXED

Sylvestre Ledru [:Sylvestre]

Assignee

Updated

•

5 years ago

Depends on: 1679985

BugBot [:suhaib / :marco/ :calixte]

Updated

•

5 years ago

Keywords: leave-open

Tom Ritter [:tjr]

Comment 39

•

5 years ago

I'm not sure about the release notes; I suppose add it and see what the people who manage that say.

Flags: needinfo?(tom)

Jan de Mooij [:jandem]

Comment 40

•

5 years ago

•

Edited

We've seen some test_jsctypes.js test failures on Try and locally. I tracked it down to alloca(0) with -fstack-clash-protection. It generates buggy machine code, a fix for that landed in LLVM:

https://github.com/llvm/llvm-project/commit/9573c9f2a363da71b2c07a3add4e52721e6028a0
https://www.mail-archive.com/llvm-branch-commits@lists.llvm.org/msg05187.html

I can reproduce this locally on central tip (hg revision f4001dfef5bc). I don't know why this is not perma-orange though.

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Updated

•

5 years ago

Regressions: 1679994

Sylvestre Ledru [:Sylvestre]

Assignee

Updated

•

5 years ago

Depends on: 1680276

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 41

•

5 years ago

Jandem, I opened bug 1680276 for this

Cosmin Sabou [:CosminS]

Comment 42

•

5 years ago

Backed out from central for causing frequently test failures on try. (eg: https://treeherder.mozilla.org/jobs?repo=try&resultStatus=testfailed%2Cbusted%2Cexception&revision=754947316770e3b297b5b18a99ad39ab98cd77a5&selectedTaskRun=As7HGID9TNqBbHJ5eLaJhQ.0)

Backout link: https://hg.mozilla.org/mozilla-central/rev/4f4c8ad9952a6a3e74de5c4419e121c2f4ccd831

Status: RESOLVED → REOPENED

Flags: needinfo?(sledru)

Resolution: FIXED → ---

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 43

•

5 years ago

ok
I will rebase upstream fixes or wait for 11.0.1

Flags: needinfo?(sledru)

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 44

•

5 years ago

Attached file Bug 1588710 - Add missing upstream llvm patch to fix alloc(0) r?#build (obsolete) — Details

From upstream repo:
a1e0363c7402f7aa58e24e0e6dfa447ebabc1910 to bbe6cbbed8c7460a7e8477373b9250543362e771

Pulsebot

Comment 45

•

5 years ago

Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e45001cddaf4 Add missing upstream llvm patch to fix alloc(0) r=firefox-build-system-reviewers,dmajor https://hg.mozilla.org/integration/autoland/rev/f0b02db01894 Linux/Android: enable -fstack-clash-protection r=tjr,firefox-build-system-reviewers,glandium

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 46

•

5 years ago

Attached file Bug 1588710 - stack-clash-protection: enable when clang 11.0.1 is used — Details

For now, it needs a patch version of clang 11.0.0 to work correctly.
I will remove it once 11.0.1 is released.

Phabricator Automation

Updated

•

5 years ago

Attachment #9193198 - Attachment description: Bug 1588710 - stack-clash-protection: Only enable in automation for now r?#build → Bug 1588710 - stack-clash-protection: enable when clang 11.0.1 is used r?#build

Cosmin Sabou [:CosminS]

Comment 47

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/e45001cddaf4
https://hg.mozilla.org/mozilla-central/rev/f0b02db01894

Status: REOPENED → RESOLVED

Closed: 5 years ago → 5 years ago

status-firefox86: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 86 Branch

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 48

•

5 years ago

This might be too technical but it shows some security improvement on Linux and I often see the press mentioning that.
Feel free to ignore it!

Release Note Request (optional, but appreciated)
[Why is this notable]: Prevent some classes of security issues
[Affects Firefox for Android]: Yes
[Suggested wording]: On Linux and Android, the protection to mitigate the stack clash attack has been activated.
[Links (documentation, blog post, etc)]: I have been writing a draft with Red Hat & Rust for the llvm blog.

A few more links for the curious:
https://blog.qualys.com/vulnerabilities-research/2017/06/19/the-stack-clash
https://pagure.io/fesco/issue/2020
https://reviews.llvm.org/D68720
https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Stack-Clash-Protection-20

relnote-firefox: --- → ?

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 49

•

5 years ago

This had to be backed out for frequent crashes on Linux x64 debug affecting Try pushes (Bug 1679994):

https://hg.mozilla.org/mozilla-central/rev/5e25722bcc7c838934d95ca5b9ba11da0b16d1ab
https://hg.mozilla.org/integration/autoland/rev/5e25722bcc7c838934d95ca5b9ba11da0b16d1ab

Status: RESOLVED → REOPENED

status-firefox86: fixed → ---

Flags: needinfo?(sledru)

Resolution: FIXED → ---

Target Milestone: 86 Branch → ---

(Away)

Comment 50

•

5 years ago

•

Edited

(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #49)

This had to be backed out for frequent crashes on Linux x64 debug affecting Try pushes (Bug 1679994):

The issue in bug 1679994 should have been fixed by the earlier backout here in comment 42, and we should not be seeing those issues anymore on the relanding in comment 47. Was there a different set of failures that prompted the second backout? Are there any links to the logs available?

Flags: needinfo?(aryx.bugmail)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 51

•

5 years ago

Focus on the Linux x64 debug tasks

Flags: needinfo?(aryx.bugmail)

(Away)

Comment 52

•

5 years ago

Thanks Aryx.

The minidump from this GTest task has

00007f63`d80b1fab 4883c00f        add     rax,0Fh
00007f63`d80b1faf 4883e0f0        and     rax,0FFFFFFFFFFFFFFF0h
00007f63`d80b1fb3 4989e5          mov     r13,rsp
00007f63`d80b1fb6 4929c5          sub     r13,rax
00007f63`d80b1fb9 4c89a5c8feffff  mov     qword ptr [rbp-138h],r12
00007f63`d80b1fc0 4939e5          cmp     r13,rsp
00007f63`d80b1fc3 7c11            jl      libxul+0x870ffd6 (00007f63`d80b1fd6)
00007f63`d80b1fc5 48c7042400000000 mov     qword ptr [rsp],0
00007f63`d80b1fcd 4881ec00100000  sub     rsp,1000h
00007f63`d80b1fd4 ebea            jmp     libxul+0x870ffc0 (00007f63`d80b1fc0)
00007f63`d80b1fd6 4c89ec          mov     rsp,r13

I don't think this does the right thing when rax is initially 0.

The compiler for this build had this squashed patch on top of llvmorg-11.0.0.

@sguelton: Does the above code look as expected to you?

Flags: needinfo?(sguelton)

(Away)

Comment 53

•

5 years ago

Well, the story gets worse: I pulled the preprocessed quant_bands.c from CI, and downloaded the same compiler artifact that the builds are using, and on my machine I get different output (jge).

Could this be a caching problem? It looks very strange that https://treeherder.mozilla.org/jobs?repo=autoland&revision=f0b02db0189473f12beb4b483dab17708ffde615&selectedTaskRun=eSFOksz8Rd67FeVxcky7Vw.0 built a brand-new clang, yet the build task claims sccache hit rate debug was 0.6!

Flags: needinfo?(sguelton)

Mike Hommey [:glandium]

Comment 54

•

5 years ago

It's not strange, actually. That's the power of dynamic libraries. sccache uses the compiler binary (clang) for hashing, but the patch only touches libllvm... so from sccache's perspective, there was no change to the compiler. I don't think this is the first time this happens.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 55

•

5 years ago

So, clearing the cache before landing this patch would fix the issue?
If so, how can we clear the cache ? :)
merci!

Flags: needinfo?(sledru)

Mike Hommey [:glandium]

Comment 56

•

5 years ago

Clearing the cache would mean that builds using the older clang may get objects from a build with the new clang. It would be better to find a way to make the clang binary different.

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 57

•

5 years ago

Would it be possible to include libllvm in the hashing? (and/or all llvm/clang key files)
This kind of issue is pretty tricky to debug

Mike Hommey [:glandium]

Comment 58

•

5 years ago

That would need sccache to be aware of such details of the compiler...

Sylvestre Ledru [:Sylvestre]

Assignee

Updated

•

5 years ago

Depends on: 1685335

Phabricator Automation

Updated

•

5 years ago

Attachment #9193040 - Attachment is obsolete: true

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 59

•

5 years ago

Let's try to land it again now that we are using clang 11.0.1 (which includes the fixes), so that we don't have the cache issue one more time :)

Pulsebot

Comment 60

•

5 years ago

Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cb69fe3e4c46 Linux/Android: enable -fstack-clash-protection r=tjr,firefox-build-system-reviewers,glandium

Phabricator Automation

Updated

•

5 years ago

Attachment #9193198 - Attachment description: Bug 1588710 - stack-clash-protection: enable when clang 11.0.1 is used r?#build → Bug 1588710 - stack-clash-protection: enable when clang 11.0.1 is used

Pulsebot

Comment 61

•

5 years ago

Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/04b7cd1b0d65 stack-clash-protection: enable when clang 11.0.1 is used r=firefox-build-system-reviewers,glandium

Atila Butkovits

Comment 62

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/cb69fe3e4c46

Status: REOPENED → RESOLVED

Closed: 5 years ago → 5 years ago

status-firefox86: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 86 Branch

Sandor Molnar[:smolnar]

Comment 63

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/04b7cd1b0d65

Sylvestre Ledru [:Sylvestre]

Assignee

Comment 64

•

5 years ago

If release managers want to relnote it, we (redhat, rust and firefox) wrote:
https://blog.llvm.org/posts/2021-01-05-stack-clash-protection/

Pascal Chevrel:pascalc

Comment 65

•

5 years ago

I think it's too niche for our beta/release note, thanks.

relnote-firefox: ? → 87+

Pascal Chevrel:pascalc

Updated

•

5 years ago

relnote-firefox: 87+ → -

Tom Ritter [:tjr]

Updated

•

5 years ago

Whiteboard: [adv-main86-]

You need to log in before you can comment on or make changes to this bug.