Closed Bug 1660340 Opened 4 years ago Closed 4 years ago

Update to clang 11.0.0 rc2

Categories

(Firefox Build System :: Toolchains, task)

Tracking

(firefox82 fixed)

RESOLVED FIXED
82 Branch
Tracking Status
firefox82 --- fixed

People

(Reporter: away, Assigned: away)

References

Details

(Keywords: perf-alert)

Attachments

(1 file)

Next week is the beginning of Nightly 82, and clang 11.0.0 rc2 was tagged today. After the multiple backouts of bug 1616692, we plan to skip clang 10 and go straight to clang 11. In order to maximize the amount of testing time on Nightly, we'll start with rc2 and pick up the final tag of clang 11 when it becomes available.

Blocks: 1657913
Depends on: 1660341
Blocks: 1644624
Depends on: 1660828
Depends on: 1660896
Depends on: 1661126
Depends on: 1661129

This changes most of our automation builds to clang 11.0.0 rc2.

Not included:

  • code coverage builds, per bug 1660341
  • mingw builds, which have traditionally been on their own update cadence, and in this case are blocked anyway by bug 1658632

This will leave some unused clang-9 task definitions. I intend to clean them up, but at a later date. For now I want to focus on making sure this update sticks, since patches like this have a tendency to bounce.

Assignee: nobody → dmajor
Status: NEW → ASSIGNED
Pushed by dmajor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/edb1d37f48f4
Switch builds to clang 11.0.0 rc2 r=froydnj
Pushed by rmaries@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d445d03a6ca0
Switch builds to clang 11.0.0 rc2 r=froydnj

There's a crash in the logcat output:

08-28 08:37:08.009  8236  8259 W google-breakpad: ExceptionHandler::GenerateDump cloned child 
08-28 08:37:08.009  8236  8259 W google-breakpad: 8419
08-28 08:37:08.009  8236  8259 W google-breakpad: 
08-28 08:37:08.009  8236  8259 W google-breakpad: ExceptionHandler::SendContinueSignalToChild sent continue signal to child
08-28 08:37:08.010  8419  8259 W google-breakpad: ExceptionHandler::WaitForContinueSignal waiting for continue signal...

But we don't get more than that. No minidump, no stack in the logs.

There's no crash reporting available on browsertime yet because of some issues in geckodriver.

I ran geckoview_example locally on the Moto G5, and it crashes as soon as we try to load a page. I also tested the app on its own, without a harness driving it: it crashes as soon as I open it, before I can even enter a URL.

I received a notification on the phone telling me to report the crash to Mozilla. Is there somewhere we can find these crash reports after reporting?

Flags: needinfo?(gmierz2)
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch

Unfortunately I don't even know how to get started investigating that failure since I have very little experience with Android builds. As far as I can tell, we don't even have symbols for that crash report? Whatever Mike and Greg say, I'll trust them.

Flags: needinfo?(dmajor)

:dmajor, do you know anyone that can debug that crash report or debug geckoview_example locally? The problem isn't related to the harness because I can trigger the startup crash by manually opening geckoview.

This is a serious issue and all perf tests are busted on G5 right now because of this: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&group_state=expanded&resultStatus=pending%2Crunning%2Csuccess%2Ctestfailed%2Cbusted%2Cexception&searchStr=browsertime%2Candroid%2C7.0%2Cmotog5%2Cshippable%2Copt&revision=56166cae2e26429f4786ad1013adae78189a12e8&selectedTaskRun=QSU3V3n8T9K5Q-jHONBmJQ.0

Flags: needinfo?(dmajor)

(In reply to Greg Mierzwinski [:sparky] from comment #11)

froydnj, nalexander: is this something that one of you has experience with?

Flags: needinfo?(nfroyd)
Flags: needinfo?(nalexander)
Flags: needinfo?(dmajor)

Hi :snorp, can you help us out here? There's a startup crash in geckoview that's causing all of our perf tests to fail.

Flags: needinfo?(snorp)

Greg, GVE will report crashes if you click the notification that appears when the crash occurs. It will then print the crash ID to logcat. If you grep for "sent crash report" you should find it. Then we can look up the crash-stats link and see what's going on.
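For anyone who prefers a script to a manual grep, the step above can be sketched in Python. This is a minimal sketch, assuming the logcat output has already been captured as text (for example with `adb logcat -d`). The function name `find_crash_report_lines` and the sample lines are hypothetical, and the exact text GVE prints around the crash ID is not shown in this bug, so the code only filters for the phrase and does not try to parse the ID.

```python
def find_crash_report_lines(logcat_text, phrase="sent crash report"):
    """Return the logcat lines that contain `phrase`, ignoring case."""
    needle = phrase.lower()
    return [line for line in logcat_text.splitlines() if needle in line.lower()]


# Hypothetical sample input: the first line is copied from the logcat excerpt
# in this bug, the second is a made-up placeholder for the line GVE would print.
sample = (
    "08-28 08:37:08.009  8236  8259 W google-breakpad: "
    "ExceptionHandler::SendContinueSignalToChild sent continue signal to child\n"
    "PLACEHOLDER-TAG: Sent crash report <crash-id-placeholder>\n"
)

for line in find_crash_report_lines(sample):
    print(line)
```

The match is case-insensitive because the capitalization of the real message is not known here.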

Flags: needinfo?(snorp) → needinfo?(gmierz2)
Flags: needinfo?(gmierz2) → needinfo?(snorp)

It looks like it's a problem only on 32-bit ARM. I installed such a build on my Pixel 4 and immediately reproduced the crash: https://crash-stats.mozilla.org/report/index/ffcd0667-12af-4ade-8915-486220200828

This appears to be some NEON going wrong in NSS somehow.

Flags: needinfo?(snorp)

Given that we get SIGBUS, I would guess there is some kind of alignment problem. Perhaps we were just getting lucky with the prior toolchain?

:m_kato you've done a bunch of ARM stuff in NSS, any ideas?

Flags: needinfo?(m_kato)
Depends on: 1661810
Flags: needinfo?(nfroyd)
Flags: needinfo?(nalexander)
Flags: needinfo?(m_kato)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 82 Branch → ---
Depends on: 1661749
Depends on: 1660509
Flags: needinfo?(dmajor)
Pushed by dmajor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/57b1de9d90dc
Switch builds to clang 11.0.0 rc2 r=froydnj
Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch

(In reply to Pulsebot from comment #4)

Pushed by rmaries@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d445d03a6ca0
Switch builds to clang 11.0.0 rc2 r=froydnj

== Change summary for alert #26815 (as of Sun, 30 Aug 2020 17:15:54 GMT) ==

Regressions:

9% build times windows2012-64-shippable opt nightly taskcluster-c5d.4xlarge 2,084.19 -> 2,270.97
8% build times windows2012-64-shippable opt nightly taskcluster-c5.4xlarge 2,126.13 -> 2,290.79
0.46% installer size osx-shippable opt nightly 75,996,904.81 -> 76,345,195.00

Improvements:

1% installer size osx-cross opt 75,296,820.08 -> 74,567,956.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26815
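As a sanity check on the figures in this alert, here is a minimal sketch that recomputes the percentages, assuming they are the plain relative change between the two reported values. How Perfherder actually derives and rounds them is not stated in this bug. The helper name `percent_change` and the shortened row labels are hypothetical. The numbers are copied from alert #26815 above.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (label, before, after) copied from alert #26815; labels are shortened.
rows = [
    ("build times windows2012-64-shippable c5d.4xlarge", 2084.19, 2270.97),
    ("build times windows2012-64-shippable c5.4xlarge", 2126.13, 2290.79),
    ("installer size osx-shippable", 75996904.81, 76345195.00),
    ("installer size osx-cross", 75296820.08, 74567956.00),
]

for label, before, after in rows:
    # A positive value means the metric grew, which is a regression for both
    # build time and installer size.
    print(f"{label}: {percent_change(before, after):+.2f}%")
```

Under that assumption the signs and magnitudes agree with the alert: the three regressions come out positive and the osx-cross installer size comes out negative.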

(In reply to Narcis Beleuzu [:NarcisB] from comment #19)

Backed out for causing Btime failures on Android 7.0

Backout link: https://hg.mozilla.org/integration/autoland/rev/592e556af3ecee3700c80dac778585e93cbe755f
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=314317947&repo=autoland&lineNumber=1864

== Change summary for alert #26822 (as of Mon, 31 Aug 2020 05:21:17 GMT) ==

Regressions:

1% installer size osx-cross opt 74,600,928.67 -> 75,361,047.17

Improvements:

9% build times windows2012-64-shippable opt nightly taskcluster-c5.4xlarge 2,290.79 -> 2,090.83

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26822

I did a backfill, and I'm 99% sure that the regression is caused by this bug. I am waiting for the test to trigger the alert so I can file the regression bug, but please be aware.
https://treeherder.mozilla.org/perf.html#/graphs?highlightAlerts=1&series=autoland,2614662,1,2&timerange=1209600&zoom=1598808958916,1598967830474,74301150.08049797,75650004.23706341

Flags: needinfo?(dmajor)

I'm having trouble understanding the last couple of comments. Comment 27 seems to be highlighting an improvement in installer size. Comment 28 is pointing to a push range that doesn't show this landing. If there is something that you would like me to look at, please let me know what regression you had in mind and why the link demonstrates it. Thanks!

Flags: needinfo?(dmajor)
Regressions: 1663138
Regressions: 1663139

The improvement was caused by the backout of bug 1661786, as you can see from this graph:
https://treeherder.mozilla.org/perf.html#/graphs?highlightAlerts=1&series=autoland,1921077,1,2&timerange=1209600&zoom=1598257383000,1599465765369,1735.8558003107707,2689.278694709142

There are other regressions in the bugs attached to the regression list: 1663138, 1663139

Flags: needinfo?(fstrugariu)
No longer depends on: 1660509
Blocks: 1574989
Regressions: 1665351
Blocks: 1666525
Regressions: 1665486
Keywords: perf-alert
Regressions: 1680837