1451741 - Crash in _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4 on Snapdraon 820/821

Reporter

Description

•

7 years ago

This bug was filed from the Socorro interface and is report bp-4f442901-4009-4e71-8fdd-453e50180405. ============================================================= New crash in Fennec B9: https://bit.ly/2q6OLJl. Changes between B7-B9: https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FENNEC_60_0b7_RELEASE&tochange=d2564a0c25e5084dbd064cf546d98a9f72eed1f3&full=1 https://mzl.la/2uFHvcP Top 10 frames of crashing thread: 0 @0xaaa8a050 1 libxul.so _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4 servo/components/style/style_resolver.rs:304 2 libxul.so Gecko_AttrEquals dom/base/Element.h:1498 3 libxul.so wcsrtombs 4 libxul.so wcsrtombs 5 libxul.so _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::resolve_style::hd220abc1e26e8639 servo/components/style/style_resolver.rs:383 6 libxul.so style::traversal::compute_style::h63ed2411e2498e09 servo/components/style/style_resolver.rs:268 7 libxul.so wcsrtombs 8 libxul.so wcsrtombs 9 libxul.so wcsrtombs =============================================================

Emilio Cobos Álvarez [:emilio]

Comment 1

•

7 years ago

At least the stack in comment 0 is not sane, unfortunately :(

Marcia Knous [:marcia]

Reporter

Comment 2

•

7 years ago

This is now the top crash in Fennec B9. APIs from 23-27 are affected. Here are some URLs: *http://m.ign.com/wikis/the-legend-of-zelda-breath-of-the-wild/Trial_of_the_Sword *https://arstechnica.com/gadgets/2018/04/intel-drops-plans-to-develop-spectre-microcode-for-ancient-chips/?comments=1 *http://himasoku.com/archives/52033529.html *https://www.reddit.com/r/news/comments/89phlf/theyve_given_us_a_raise_and_nothing_for_students/?utm_content=comments&utm_medium=hot&utm_source=reddit&utm_name=frontpage Highest percentage of crashes on device ONEPLUS A3003.

Keywords: topcrash

Julien Cristau [:jcristau]

Comment 3

•

7 years ago

The reports I looked at don't have a sensible stack either :(

status-firefox59: --- → unaffected

status-firefox61: affected → ?

status-firefox-esr52: --- → unaffected

tracking-firefox60: ? → +

Xidorn Quan [:xidorn] UTC+10

Comment 4

•

7 years ago

The summary page doesn't make sense either... How can 158 crash reports come from 181 installations?

Marcia Knous [:marcia]

Reporter

Comment 5

•

7 years ago

This is currently the top crash in Beta 9 - we shall see if it shows up in Beta 11 as well.

Makoto Kato [:m_kato]

Comment 6

•

7 years ago

(In reply to Julien Cristau [:jcristau] from comment #3) > The reports I looked at don't have a sensible stack either :( crash reporter won't generate better stack well due to https://github.com/rust-lang/rust/issues/49867.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 7

•

7 years ago

:bholley - just keeping this on your radar

Flags: needinfo?(bobbyholley)

Bobby Holley (:bholley)

Comment 8

•

7 years ago

(In reply to Milan Sreckovic [:milan] from comment #7) > :bholley - just keeping this on your radar I don't think anyone is actively working on this. I had a look at the regression range, and none of the patches touch the style system. The crashing thread stacks are all, garbage, as are most of the main thread ones. I did find one main-thread stack that was not garbage, and it ran through AndroidDynamicToolbarAnimator::UpdateFrameMetrics, which would presumably not be doing a style flush. So I think this is in the category of things where android crash diagnostic expertise is more relevant than style system expertise.

Flags: needinfo?(bobbyholley)

Julien Cristau [:jcristau]

Comment 9

•

7 years ago

This is the #2 top crash on beta this week. I'm concerned that we're shipping stylo-android in 60 and it sounds from comment 6 that we're going blind wrt crashes.

Bobby Holley (:bholley)

Updated

•

7 years ago

Flags: needinfo?(m_kato)

Makoto Kato [:m_kato]

Comment 10

•

7 years ago

Does anyone modify rust, or servo code between 60.0b7 to 60.0b9? I cannot make sense this crash.... LTO?

Julien Cristau [:jcristau]

Comment 11

•

7 years ago

Another crash signature that looks similar and also started showing up in 60.0b9

Crash Signature: [@ _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4] → [@ _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4] [@ style::parallel::traverse_nodes::h59de8f1a2ce4849c]

Makoto Kato [:m_kato]

Updated

•

7 years ago

Crash Signature: [@ _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4] [@ style::parallel::traverse_nodes::h59de8f1a2ce4849c] → [@ _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4] [@ style::parallel::traverse_nodes::h59de8f1a2ce4849c] [@ style::stylist::Stylist::cascade_st…

Flags: needinfo?(m_kato)

Makoto Kato [:m_kato]

Updated

•

7 years ago

Flags: needinfo?(m_kato)

Makoto Kato [:m_kato]

Comment 12

•

7 years ago

interesting point is that most (all?) devices are Snapdragon 820/821.

Makoto Kato [:m_kato]

Comment 13

•

7 years ago

snorp, do you know any errata information for snapdragon 820/821 that is memory synchronization issue? This crash is only on snapdragon 820/821, but no crash on 835 and others Also, I found a chrome bug that is cache line bug of Kryo. (https://bugs.chromium.org/p/chromium/issues/detail?id=656693)

Flags: needinfo?(m_kato) → needinfo?(snorp)

Makoto Kato [:m_kato]

Updated

•

7 years ago

Crash Signature: style::stylist::Stylist::cascade_style_and_visited::hc986e55be637888c] → style::stylist::Stylist::cascade_style_and_visited::hc986e55be637888c] [@ @0x0 | wcsrtombs]

Makoto Kato [:m_kato]

Comment 14

•

7 years ago

"@0x0 | wcsrtombs" is related, but not all.

Crash Signature: style::stylist::Stylist::cascade_style_and_visited::hc986e55be637888c] [@ @0x0 | wcsrtombs] → style::stylist::Stylist::cascade_style_and_visited::hc986e55be637888c]

Makoto Kato [:m_kato]

Updated

•

7 years ago

Summary: Crash in _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4 → Crash in _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4 on Snapdraon 820/821

Xidorn Quan [:xidorn] UTC+10

Comment 15

•

7 years ago

(In reply to Julien Cristau [:jcristau] from comment #9) > This is the #2 top crash on beta this week. I'm concerned that we're > shipping stylo-android in 60 and it sounds from comment 6 that we're going > blind wrt crashes. I think this is a reasonable concern. We probably should ask the Rust team to prioritize rust-lang/rust#49867. Even if it would miss 60, we still want this to be available as soon as possible. It would be even better if Rust can issue a patch version for the compiler we are using for 60. Alex, could you help on this?

Flags: needinfo?(acrichton)

Makoto Kato [:m_kato]

Comment 16

•

7 years ago

(hmm, this crash may not occur on b13...)

Alex Crichton [:acrichto]

Comment 17

•

7 years ago

Sure yeah I can try to help with this! I could definitely see how https://github.com/rust-lang/rust/issues/49867 is the underlying issue here, but I'd want to understand a few things first. * What version of Rust are y'all using for 60? * If you compile with `-C panic=unwind` instead of `-C panic=abort`, do better stack traces happen? * How can you tell, inspecting the binary, whether it's built correctly? * What target are you using for rustc for this platform?

Flags: needinfo?(acrichton)

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Comment 18

•

7 years ago

(In reply to Makoto Kato [:m_kato] from comment #13) > snorp, do you know any errata information for snapdragon 820/821 that is > memory synchronization issue? This crash is only on snapdragon 820/821, but > no crash on 835 and others Sorry, no idea. This is a fascinating problem, though!

Flags: needinfo?(snorp)

Xidorn Quan [:xidorn] UTC+10

Comment 19

•

7 years ago

(In reply to Alex Crichton [:acrichto] from comment #17) > Sure yeah I can try to help with this! > > I could definitely see how https://github.com/rust-lang/rust/issues/49867 is > the underlying issue here, but I'd want to understand a few things first. > > * What version of Rust are y'all using for 60? The minimum requirement is 1.24 but it's not clear whether we can use a newer version for the official build, at least for Android. This would be a question for build team. > * If you compile with `-C panic=unwind` instead of `-C panic=abort`, do > better stack traces happen? > * How can you tell, inspecting the binary, whether it's built correctly? > * What target are you using for rustc for this platform? These are questions for Kato-san.

Xidorn Quan [:xidorn] UTC+10

Updated

•

7 years ago

Flags: needinfo?(m_kato)

Makoto Kato [:m_kato]

Comment 20

•

7 years ago

(In reply to Alex Crichton [:acrichto] from comment #17) > Sure yeah I can try to help with this! > > I could definitely see how https://github.com/rust-lang/rust/issues/49867 is > the underlying issue here, but I'd want to understand a few things first. > > * What version of Rust are y'all using for 60? > * If you compile with `-C panic=unwind` instead of `-C panic=abort`, do > better stack traces happen? Yes, panic=unwind is expected result for unwind information generation. > * How can you tell, inspecting the binary, whether it's built correctly? Bug 1453220 is original bug. When using -C panic=abort, object file doesn't have unwind information. When using readelf command, you can show unwind information. When using -C panic=abort, all functions are "cantunwind". Unwind section '.ARM.exidx.text.main' at offset 0x1e0 contains 1 entry: 0x0 <main>: 0x1 [cantunwind] But using -C panic=unwind, Unwind section '.ARM.exidx.text.main' at offset 0x1e0 contains 1 entry: 0x0 <main>: 0x80018408 Compact model index: 0 0x01 vsp = vsp + 8 0x84 0x08 pop {r7, r14} > * What target are you using for rustc for this platform? armv7-linux-androideabi.

Flags: needinfo?(m_kato)

Marcia Knous [:marcia]

Reporter

Comment 21

•

7 years ago

[@ @0x0 | _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h334d4cb73837e28a ] is another signature showing up on Nightly, but it looks more like a different chipset (looks as if the devices are mostly the 820)

Marcia Knous [:marcia]

Reporter

Comment 22

•

7 years ago

(In reply to Marcia Knous [:marcia - needinfo? me] from comment #21) > [@ @0x0 | > _$LT$style..style_resolver.. > StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$ > $GT$::cascade_style_and_visited::h334d4cb73837e28a ] is another signature > showing up on Nightly, but it looks more like a different chipset (looks as > if the devices are mostly the 820) I added the signature since I missed the 820 in the subject line.

Crash Signature: [@ _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4] [@ style::parallel::traverse_nodes::h59de8f1a2ce4849c] [@ style::stylist::Stylist::cascade_st… → [@ _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20$$u27$le$C$$u20$E$GT$$GT$::cascade_style_and_visited::h01fb6becc6636cd4] [@ @0x0 | _$LT$style..style_resolver..StyleResolverForElement$LT$$u27$a$C$$u20$$u27$ctx$C$$u20…

Alex Crichton [:acrichto]

Comment 23

•

7 years ago

Ok thanks for the info I've opened a PR to fix this in rust-lang/rust at https://github.com/rust-lang/rust/pull/50093. If that lands and you'd like we can also backport it to be released with 1.26. Unfortunately releasing it in 1.24 or 1.25 is probably not going to happen.

Xidorn Quan [:xidorn] UTC+10

Comment 24

•

7 years ago

Do you think we can use a patched Rust 1.26 (maybe 1.26.1 or something?) for Firefox 60 release? Or at least for Fennec?

Flags: needinfo?(mh+mozilla)

Mike Hommey [:glandium]

Comment 25

•

7 years ago

Most certainly not. 1.26 is not even going to release before Firefox 60, and we're still trying to cope with the regressions from 1.25.

Flags: needinfo?(mh+mozilla)

Xidorn Quan [:xidorn] UTC+10

Comment 26

•

7 years ago

Oh, 1.26 is still in beta... hmmm... Sounds like there is no way we would be able to get reasonable stack of style system for Fennec 60 :(

Mike Hommey [:glandium]

Comment 27

•

7 years ago

(In reply to Makoto Kato [:m_kato] from comment #6) > (In reply to Julien Cristau [:jcristau] from comment #3) > > The reports I looked at don't have a sensible stack either :( > > crash reporter won't generate better stack well due to > https://github.com/rust-lang/rust/issues/49867. Come to think of it, I don't think that's true. Crash reports get their stack trace using breakpad info, which is derived from the DWARF info in the binaries, which is independent from the unwind EHABI data. However, considering how many times we've had crashes in llvm-dsymutil on mac builds because of malformed DWARF data generated by the rust compiler, I wouldn't be entirely surprised if the DWARF data is broken in the first place.

Mike Hommey [:glandium]

Updated

•

7 years ago

Flags: needinfo?(m_kato)

Makoto Kato [:m_kato]

Comment 28

•

7 years ago

(In reply to Mike Hommey [:glandium] from comment #27) > (In reply to Makoto Kato [:m_kato] from comment #6) > > (In reply to Julien Cristau [:jcristau] from comment #3) > > > The reports I looked at don't have a sensible stack either :( > > > > crash reporter won't generate better stack well due to > > https://github.com/rust-lang/rust/issues/49867. > > Come to think of it, I don't think that's true. Crash reports get their > stack trace using breakpad info, which is derived from the DWARF info in the > binaries, which is independent from the unwind EHABI data. However, > considering how many times we've had crashes in llvm-dsymutil on mac builds > because of malformed DWARF data generated by the rust compiler, I wouldn't > be entirely surprised if the DWARF data is broken in the first place. Hmm, does it mean that is another bug of LLVM?

Flags: needinfo?(m_kato)

Mike Hommey [:glandium]

Comment 29

•

7 years ago

Actually, it might or might not, but it might also not be relevant. I took a look at two of the crashes. The first thing I looked for is how the frames were found, and they all come from stack scanning. Which means they may just well be garbage. And the reason they come from stack scanning is that in the top frame, the instruction pointer, which is also the address of the crash, doesn't actually point in executable code. The return pointer does, however, point to the style code, but that may or may not be a red herring. In the two crashes I looked at, the crash address is suspiciously close to the stack pointer, which led me to look at a third, where it wasn't (but still not in executable code). This means the lr pointer should be right, but when comparing its value to what the crash reports say the frame is at, that doesn't match (it doesn't even align to an instruction). If I manually look where lr points back to, that points back to code that did a `bl` (function call), but it does so with an immediate value, not a pointer from a vtable or otherwise, meaning there's no way /that/ `bl` call led to the crash itself. Something else along the way did a jump that didn't set lr.

Marcia Knous [:marcia]

Reporter

Comment 30

•

7 years ago

It looks as if for 60 this was last seen in b12, but not seen in subsequent betas (for all three signatures).

Maire Reavy [:mreavy]

Updated

•

7 years ago

Priority: -- → P2

Mike Taylor [:miketaylr]

Comment 31

•

7 years ago

It seems like this went away.

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → WORKSFORME

Julien Cristau [:jcristau]

Updated

•

7 years ago

status-firefox60: affected → wontfix

[:philipp]

Updated

•

7 years ago