Open Bug 1731142 Opened 4 years ago Updated 2 years ago

Crash in [@ style::stylist::CascadeData::add_rule_list<T>]

Tracking

()

Status:

NEW

Tracking Flags:

Tracking

Status

firefox-esr78

---

unaffected

firefox-esr91

---

unaffected

firefox92

---

unaffected

firefox93

---

wontfix

firefox94

---

wontfix

firefox95

---

wontfix

People

(Reporter: aryx, Assigned: emilio)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

Bug 1731142 - Print lock address on assert. r=#style,#layout-reviewers 4 years ago Emilio Cobos Álvarez [:emilio] 48 bytes, text/x-phabricator-request		Details \| Review

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Reporter

Description

•

4 years ago

8 crashes from 7 different installations, all Windows

Maybe Fission related. (DOMFissionEnabled=1)

Crash report: https://crash-stats.mozilla.org/report/index/fb9f038d-8903-4e13-9e70-6cd180210916

MOZ_CRASH Reason: Locked::read_with called with a guard from an unrelated SharedRwLock

Top 10 frames of crashing thread:

0 xul.dll RustMozCrash mozglue/static/rust/wrappers.cpp:17
1 xul.dll mozglue_static::panic_hook mozglue/static/rust/lib.rs:91
2 xul.dll core::ops::function::Fn::call<fn ../a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:227
3 xul.dll std::panicking::rust_panic_with_hook ../a178d0322ce20e33eac124758e837cbd80a6f633//library/std/src/panicking.rs:626
4 xul.dll std::panicking::begin_panic::{{closure}}<str> ../a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:542
5 xul.dll std::sys_common::backtrace::__rust_end_short_backtrace<closure-0, !> ../a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:141
6 xul.dll std::panicking::begin_panic<str> ../a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:541
7 xul.dll style::stylist::CascadeData::add_rule_list<style::gecko::data::GeckoStyleSheet> servo/components/style/stylist.rs:2394
8 xul.dll style::stylist::CascadeData::add_stylesheet<style::gecko::data::GeckoStyleSheet> servo/components/style/stylist.rs:2453
9 xul.dll style::stylist::CascadeData::rebuild<style::gecko::data::GeckoStyleSheet> servo/components/style/stylist.rs:2042

Flags: needinfo?(emilio)

Emilio Cobos Álvarez [:emilio]

Assignee

Comment 1

•

4 years ago

I think this probably just changed signatures with those changes, so shouldn't be a regression per se... But I'll add some more diagnostic information.

Flags: needinfo?(emilio)

Emilio Cobos Álvarez [:emilio]

Assignee

Comment 2

•

4 years ago

Attached file Bug 1731142 - Print lock address on assert. r=#style,#layout-reviewers — Details

Note that the crash reason is sanitized so we're not exposing anything
sensitive.

I think my patch just changed the signature of the stack, as it didn't
change anything related to guards or what not. But without knowing why
is failing or a repro it's hard to know what's going on.

Printing the address at list would give us some indication of what might
be going wrong (perhaps we're using a static lock when we don't expect
one or such?).

Phabricator Automation

Updated

•

4 years ago

Assignee: nobody → emilio

Status: NEW → ASSIGNED

Pulsebot

Comment 3

•

4 years ago

Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d0661e7dd0c9 Print lock address on assert. r=firefox-style-system-reviewers,layout-reviewers,boris

Cosmin Sabou [:CosminS]

Comment 4

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/d0661e7dd0c9

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

status-firefox94: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → 94 Branch

Emilio Cobos Álvarez [:emilio]

Assignee

Comment 5

•

4 years ago

(It is only a diagnostic patch)

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Pascal Chevrel:pascalc

Comment 6

•

4 years ago

Low volume crasher on beta and we are nearing the end of the beta cycle, fix-optional 93.

status-firefox93: affected → fix-optional

Ryan VanderMeulen [:RyanVM]

Updated

•

4 years ago

Status: REOPENED → NEW

status-firefox93: fix-optional → wontfix

status-firefox94: fixed → affected

status-firefox95: --- → affected

Target Milestone: 94 Branch → ---

Ryan VanderMeulen [:RyanVM]

Comment 7

•

4 years ago

Anything useful in the Fx94 reports w/ diagnostic patch?

Flags: needinfo?(emilio)

Emilio Cobos Álvarez [:emilio]

Assignee

Comment 8

•

4 years ago

Looking at some of the crashes I see messages like:

Locked::read_with called with a guard from an unrelated SharedRwLock: 0x4010 vs. 0x168e0680
Locked::read_with called with a guard from an unrelated SharedRwLock: 0x20 vs. 0x13f1ca8

That looks really weird, the locks are static and shouldn't change address and should be alive forever... I see other crashes that don't have crash reason and just are a very-near-null crash...

So nothing particularly actionable, I suspect some sort of corruption... I'll try to chkimg some of the reports...

Flags: needinfo?(emilio)

Dianna Smith [:diannaS]

Updated

•

4 years ago

status-firefox94: affected → wontfix

Lougenia Bailey

Updated

•

4 years ago

status-firefox95: affected → wontfix

BMO Automation

Updated

•

4 years ago

Has Regression Range: --- → yes

Daniel Holbert [:dholbert]

Comment 9

•

4 years ago

(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)

I think this probably just changed signatures with those changes, so shouldn't be a regression per se...

--> Dropping 'regression' keyword and moving 'regressed by' bug to 'depends on' instead.

Has Regression Range: yes → ---

Depends on: 1728754

Keywords: regression

No longer regressed by: 1728754

Gabriele Svelto [:gsvelto]

Comment 10

•

4 years ago

Found another signature and indeed it looks like memory corruption. The lock address is either totally bogus (e.g. 329c7dfa-3136-4135-993f-a4c3e0220414 SharedRwLock: 0x100000011 vs. 0x10c960ad0) or obvious bit flips (e.g. b127a677-6123-4f60-bb29-3b4200220411 SharedRwLock: 0x79fc8738e0 vs. 0x79fc873ae0).

Crash Signature: [@ style::stylist::CascadeData::add_rule_list<T>] → [@ style::stylist::CascadeData::add_rule_list<T>] [@ style::stylist::CascadeData::add_rule_list]

Daniel Holbert [:dholbert]

Comment 11

•

3 years ago

(In reply to Gabriele Svelto [:gsvelto] from comment #10)

Found another signature and indeed it looks like memory corruption.

S2 doesn't seem appropriate then. Let's call this S4 I guess, given that there's presumably not much we can do aside from hypothetically shipping new memory modules to our users. :)

(gsvelto, do we have any other sort of resolution or end-state for bugs for crash signatures with ongoing crash volume that seem to be memory-corruption?)

Severity: S2 → S4

Flags: needinfo?(gsvelto)

Gabriele Svelto [:gsvelto]

Comment 12

•

3 years ago

Agreed, we can re-evaluate these bugs once we have bit-flip detection and we can finally tell most of the just bad ones apart.

Flags: needinfo?(gsvelto)

Daniel Holbert [:dholbert]

Comment 13

•

2 years ago

It looks like we've got folks reliably reproducing a crash with a backtrace that matches comment 0 (in TorBrowser on FreeBSD, I think?) over in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272445

See in particular the first attachment there, which is a backtrace that matches comment 0 exactly. I'll quote the relevant bits of the backtrace from that attachment (after the core::panicking::panic_fmt invocation):

#15 0x000036208ee0b752 in style::stylist::CascadeData::add_rule_list::h10605d5720993ba3 () at /usr/local/lib/tor-browser/libxul.so
#16 0x000036208ee0bac1 in style::stylist::CascadeData::add_stylesheet::hd116a1b1152580f6 () at /usr/local/lib/tor-browser/libxul.so
#17 0x000036208ee0c13c in style::stylist::CascadeData::rebuild::hec212d139969252d () at /usr/local/lib/tor-browser/libxul.so
#18 0x000036208ee5c643 in Servo_StyleSet_FlushStyleSheets () at /usr/local/lib/tor-browser/libxul.so
#19 0x000036208ca731b2 in mozilla::ServoStyleSet::UpdateStylist() () at /usr/local/lib/tor-browser/libxul.so
#20 0x000036208cad318e in mozilla::PresShell::DoFlushPendingNotifications(mozilla::ChangesToFlush) () at /usr/local/lib/tor-browser/libxul.so
#21 0x000036208caaa4cb in nsRefreshDriver::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsRefreshDriver::IsExtraTick) () at /usr/local/lib/tor-browser/libxul.so
#22 0x000036208cab04b9 in mozilla::RefreshDriverTimer::TickRefreshDrivers(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) ()

Daniel Holbert [:dholbert]

Updated

•

2 years ago

See Also: → https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272445

Daniel Holbert [:dholbert]

Comment 14

•

2 years ago

ni=emilio to be sure he sees comment 13. Seems like we might have some kinda-reliable STR (in TorBrowser on FreeBSD) over on that above-linked bug; though it probably requires a bit of setup to get into a state where we can poke. Still: that might be our best clue yet about how to proceed here.

(For the record, it does look like we continue to see crash volume in recent versions here. If I add "major version" on the aggregations page for the add_rule_list signature -- no <T> -- I see we've had 37 crashes for v117 (current release), and 1 crash for v118 (currently on beta, though the one crash was from when 118 was nightly). There are zero crashes for 119 Nightly, but that's not too surprising given the lower nightly vs. release population. Not sure if there was anything in the current 119 Nightly cycle that would've helped here; I assume probably-not.)

Flags: needinfo?(emilio)

Graham Perrin

Comment 15

•

2 years ago

Thank you,

(In reply to Daniel Holbert [:dholbert] from comment #13)

… (in TorBrowser on FreeBSD, I think?) …

Correct. Orientation: https://www.freshports.org/www/tor-browser/ below the description, alongside the home icon there are four icons for mirrors of the part of the FreeBSD 'ports' tree that relates to this port.

At https://github.com/freebsd/freebsd-ports/tree/main/www/tor-browser (in the GitHub mirror) the files subdirectory contains patches, and so on.

I'll happily do what I can to debug, although I'm not a developer.

Emilio Cobos Álvarez [:emilio]

Assignee

Comment 16

•

2 years ago

I can reproduce various crashes if I run their tor-browser package on a FreeBSD VM.

Questions are:

Are there debug symbols for these packages?
It seems there are fixes to build with rust 1.70, but tor browser is based on ESR. There were a bunch of build fixes when we updated to rust 1.70 that fixed various memory issues with that compiler, see dependencies of bug 1821228. But those don't seem to be on the Tor repo. If it's built with rust 1.70 I suspect it's one of the reasons for this. Would there be a chance to use the same rust compiler version as our official builds?
Does this reproduce on Firefox ESR package? I couldn't repro at a glance. Is there any significant difference on how the firefox-esr port and the tor-browser port are built that could explain this?

Flags: needinfo?(emilio)

Graham Perrin

Comment 17

•

2 years ago

Are there debug symbols for these packages?

Re: https://www.freshports.org/www/tor-browser/#config, FreeBSD-provided packages for this port are built with DEBUG off by default.

I built with DEBUG on, the resulting package is unexpectedly DEBUG off, I'll seek an explanation in a FreeBSD area.

Daniel Holbert [:dholbert]

Updated

•

2 years ago