1404050 - stylo: Crash in cssparser::tokenizer::consume_name on family 6 model 61 stepping 4 (Broadwell-U) with old microcode (<=0x19)

Reporter

Description

•

8 years ago

This bug was filed from the Socorro interface and is report bp-1f4e5f2f-7945-4b86-90f7-db36b0170928. ============================================================= There are 39 crashes in nightly 57-58 starting with buildid 20170903100443. There are 4 crashes in beta 57.

Bobby Holley (:bholley)

Comment 1

•

8 years ago

Hrm, no clear correlations in the crash reports. It seems to be crashing around next_byte_unchecked, which might suggest we're reading past the end somehow, but it's really hard to say exactly what's going on without analyzing a minidump.

Priority: -- → P3

Bobby Holley (:bholley)

Comment 2

•

8 years ago

These crash reports started showing up on september 3rd, so it seems plausible it's a regression from bug 1354989. Henri/jdm, thoughts?

Flags: needinfo?(josh)

Flags: needinfo?(hsivonen)

Josh Matthews [:jdm]

Comment 3

•

8 years ago

Crashing in next_byte_unchecked due to reading past the end would be surprising, since we check whether the tokenizer has reached eof right before matching on the next byte: https://github.com/servo/rust-cssparser/blob/bd213ed25eab0fd61094c556a6326fcf1240ad9c/src/tokenizer.rs#L918-L921

Flags: needinfo?(josh)

Simon Sapin (:SimonSapin)

Comment 4

•

8 years ago

As Josh said, next_byte_unchecked is "unchecked" in the sense that it returns u8 rather than Option<u8>, but it panics if called at EOF. In that sense it is different from "unchecked" in the standard library where it is bound checking that’s not done, so incorrect use of [T]::get_unchecked could cause a segfault. (Maybe we should rename next_byte_unchecked to something else.) But it is still possible for next_byte_unchecked to segfault if the &str that was originally given to csssparser is not valid. At the FFI boundary, Servo_StyleSheet_FromUTF8Bytes receives (*const u8, usize), trusts them, and uses std::slice::from_raw_parts and std::str::from_utf8_unchecked. So a bug introduced in bug 1354989 is plausible.

Henri Sivonen (:hsivonen)

Comment 5

•

8 years ago

The Windows crash location info looks bogus. The Linux crash location info looks to me like the &self of the &str might have gotten corrupted. The call to mozilla::css::Loader::ParseSheet() (which in turn calls Servo_StyleSheet_FromUTF8Bytes()) relies on XPCOM strings converting to Span correctly. Per bug 1403083, it seems that bug 1393230 partially broke the conversions but that change landed a few days later than these crash reports started. Also, it those conversions being broken were the cause, I'd expect things to be much more broken much more clearly. (In reply to Simon Sapin (:SimonSapin) from comment #4) > But it is still possible for next_byte_unchecked to segfault if the &str > that was originally given to csssparser is not valid. At the FFI boundary, > Servo_StyleSheet_FromUTF8Bytes receives (*const u8, usize), trusts them, and > uses std::slice::from_raw_parts and std::str::from_utf8_unchecked. So a bug > introduced in bug 1354989 is plausible. Do we have any knowledge of input that triggers this bug? cargo-fuzz hasn't found cases of encoding_rs producing invalid output, so if the bug is in encoding_rs, it isn't an easy-to-find one.

Flags: needinfo?(hsivonen)

Henri Sivonen (:hsivonen)

Comment 6

•

8 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #5) > Do we have any knowledge of input that triggers this bug? cargo-fuzz hasn't > found cases of encoding_rs producing invalid output, so if the bug is in > encoding_rs, it isn't an easy-to-find one. In addition to <link rel=stylesheet> cases that involve encoding_rs, there are stacks that involve <style>, which goes through UTF-16 to UTF-8 conversion provided by XPCOM strings (i.e. encoding_rs not participating).

Henri Sivonen (:hsivonen)

Comment 7

•

8 years ago

I have fuzzed encoding_rs more, and no new bugs have been found on the decode code path. (On bug was found on the encode code path, but that's irrelevant to this bug.) Also, a bug in encoding_rs wouldn't explain the <style> case. I guess one thing we could do is temporarily changing std::str::from_utf8_unchecked to UTF-8 validation to see if the timing in coincidental and there's a parser bug that was introduced around the time bug 1354989 landed.

Henri Sivonen (:hsivonen)

Comment 8

•

8 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #6) > there > are stacks that involve <style>, which goes through UTF-16 to UTF-8 > conversion provided by XPCOM strings (i.e. encoding_rs not participating). It looks like XPCOM string conversions deal gracefully with bogus UTF-16, so that's not the cause, but that not being the cause makes the <style> crashes all the more interesting.

Henri Sivonen (:hsivonen)

Comment 9

•

8 years ago

Also: Why aren't we seeing any Mac crashes?

Henri Sivonen (:hsivonen)

Comment 10

•

8 years ago

And why are there crashes on illegal instruction on both Windows and Linux?

Henri Sivonen (:hsivonen)

Comment 11

•

8 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #10) > And why are there crashes on illegal instruction on both Windows and Linux? All the Linux crashes have CPU info: family 6 model 61 stepping 4 | 1. All the Windows crashes have CPU info: family 6 model 61 stepping 4 | 4.

Henri Sivonen (:hsivonen)

Comment 12

•

8 years ago

(In reply to Henri Sivonen (:hsivonen) from comment #11) > All the Linux crashes have CPU info: family 6 model 61 stepping 4 | 1. > All the Windows crashes have CPU info: family 6 model 61 stepping 4 | 4. It appears that those numbers mean various i3-xxxxU, i5-xxxxU and i7-xxxxU-branded Broadwell-U CPUs. This looks weird and concerning. We have a small number of crashes, but can they all be Broadwell-U by chance?

Henri Sivonen (:hsivonen)

Comment 13

•

8 years ago

I sent a PI request about fuzzing encoding_rs on Broadwell-U.

Henri Sivonen (:hsivonen)

Comment 15

•

8 years ago

Observations: * This continues to happen only with family 6 model 61 stepping 4 | N where N is 1 on Linux and N is 4 or 2 on Windows. (I don't now what N, i.e. the number after the vertical bar, means.) This family/model/stepping seems to mean Broadwell-U. * No occurrences on Mac, even though "Early 2015" MacBook Pros and MacBook Airs use Broadwell-U. * It looks like we don't collect microcode revision info on Mac. * The highest microcode revision this crash occurs with is 0x19. * Other crash reports for the same family/model/stepping/verticalbar show that there have been many microcode revisions after 0x19. * Bug 1401060 is a panic where the panic message shows an out-of-bounds index whose value doesn't look like it could plausibly arise from a bug in either encoding_rs or in mozilla::Span. I'm confident enough to guess that the explanation is that this is a Broadwell-U CPU bug that was fixed/mitigated in microcode revision 0x1A and later and Macs consistently have new-enough microcode that I'm resolving this as WONTFIX. Users who encounter this bug should unblock system updates so that they get newer Intel microcode via Windows Update or their Linux distro.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → WONTFIX

Summary: stylo: Crash in cssparser::tokenizer::consume_name → stylo: Crash in cssparser::tokenizer::consume_name on family 6 model 61 stepping 4 (Broadwell-U) with old microcode (<=0x19)

Henri Sivonen (:hsivonen)

Comment 16

•

8 years ago

Filed bug 1410841 about adding a SUMO article about microcode updates.

Henri Sivonen (:hsivonen)

Comment 17

•

8 years ago

Release Note Request (optional, but appreciated) > [Why is this notable]: Firefox 57+ crash that can be remedied by user action (but not by Mozilla), and the fix will likely make the system less crashy in general. > [Affects Firefox for Android]: No > [Suggested wording]: On Windows and Linux, Firefox crashes occasionally on Intel Broadwell-U processors with old microcode. Windows users should ensure Windows Update is set to install updates. Linux users should ensure that the distribution package for Intel microcode is installed. > [Links (documentation, blog post, etc)]: https://support.mozilla.org/en-US/kb/broadwell-u-microcode/

relnote-firefox: --- → ?

Sylvestre Ledru [:Sylvestre]

Comment 18

•

8 years ago

Added to the release notes using Henri's wording. Thanks

relnote-firefox: ? → 57+

Bugzilla

stylo: Crash in cssparser::tokenizer::consume_name on family 6 model 61 stepping 4 (Broadwell-U) with old microcode (<=0x19)

Categories

(Core :: CSS Parsing and Computation, defect, P3)

Tracking

()

People

(Reporter: calixte, Unassigned)

References

Details

(Keywords: crash)

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 15

Comment 16

Comment 17

Comment 18