Closed Bug 1403845 Opened 7 years ago Closed 5 years ago

stylo: Crash in style::properties::cascade

Categories

(Core :: CSS Parsing and Computation, defect, P3)

57 Branch
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox57 + wontfix
firefox58 + wontfix
firefox59 --- wontfix
firefox60 --- ?

People

(Reporter: philipp, Assigned: emilio)

References

(Depends on 1 open bug)

Details

(4 keywords, Whiteboard: [sec-triage-backlog])

Crash Data

This bug was filed from the Socorro interface and is 
report bp-22bb0310-2c79-4039-afc9-5b3e40170928.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	std::panicking::rust_panic_with_hook 	src/libstd/panicking.rs:555
1 	xul.dll 	std::panicking::begin_panic<collections::string::String> 	src/libstd/panicking.rs:511
2 	xul.dll 	std::panicking::begin_panic_fmt 	src/libstd/panicking.rs:495
3 	xul.dll 	core::panicking::panic_fmt 	src/libcore/panicking.rs:61
4 	xul.dll 	core::panicking::panic 	src/libcore/panicking.rs:49
5 	xul.dll 	style::properties::cascade 	toolkit/library/x86_64-pc-windows-msvc/release/build/style-7d743b4d7af5be49/out/properties.rs:142759
6 	xul.dll 	style::style_resolver::StyleResolverForElement<style::gecko::wrapper::GeckoElement>::cascade_style<style::gecko::wrapper::GeckoElement> 	servo/components/style/style_resolver.rs:586
7 	xul.dll 	style::style_resolver::StyleResolverForElement<style::gecko::wrapper::GeckoElement>::cascade_style_and_visited<style::gecko::wrapper::GeckoElement> 	servo/components/style/style_resolver.rs:309
8 	xul.dll 	style::style_resolver::StyleResolverForElement<style::gecko::wrapper::GeckoElement>::cascade_primary_style<style::gecko::wrapper::GeckoElement> 	servo/components/style/style_resolver.rs:214
9 	xul.dll 	style::style_resolver::StyleResolverForElement<style::gecko::wrapper::GeckoElement>::resolve_style<style::gecko::wrapper::GeckoElement> 	servo/components/style/style_resolver.rs:225
10 	xul.dll 	style::traversal::compute_style<style::gecko::wrapper::GeckoElement> 	servo/components/style/traversal.rs:672
11 	xul.dll 	geckoservo::glue::traverse_subtree 	servo/ports/geckolib/glue.rs:261
12 	xul.dll 	geckoservo::glue::Servo_TraverseSubtree 	servo/ports/geckolib/glue.rs:300
13 	xul.dll 	mozilla::ServoStyleSet::StyleDocument(mozilla::ServoTraversalFlags) 	layout/style/ServoStyleSet.cpp:977
14 	xul.dll 	mozilla::ServoRestyleManager::DoProcessPendingRestyles(mozilla::ServoTraversalFlags) 	layout/base/ServoRestyleManager.cpp:1106
15 	xul.dll 	mozilla::PresShell::DoFlushPendingNotifications(mozilla::ChangesToFlush) 	layout/base/PresShell.cpp:4170
16 	xul.dll 	nsIPresShell::FlushPendingNotifications(mozilla::ChangesToFlush) 	layout/base/nsIPresShell.h:566
17 	xul.dll 	nsRefreshDriver::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:1921
18 	xul.dll 	mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, __int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:337
19 	xul.dll 	mozilla::RefreshDriverTimer::TickRefreshDrivers(__int64, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) 	layout/base/nsRefreshDriver.cpp:307
20 	xul.dll 	mozilla::RefreshDriverTimer::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:329
21 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:770
22 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:683
23 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:584
24 	xul.dll 	mozilla::layout::VsyncChild::RecvNotify(mozilla::TimeStamp const&) 	layout/ipc/VsyncChild.cpp:67
25 	xul.dll 	mozilla::layout::PVsyncChild::OnMessageReceived(IPC::Message const&) 	ipc/ipdl/PVsyncChild.cpp:155
26 	xul.dll 	mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const&) 	ipc/ipdl/PBackgroundChild.cpp:1695
27 	xul.dll 	mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) 	ipc/glue/MessageChannel.cpp:2115
28 	xul.dll 	mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message&&) 	ipc/glue/MessageChannel.cpp:2045
29 	xul.dll 	mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) 	ipc/glue/MessageChannel.cpp:1891
30 	xul.dll 	mozilla::ipc::MessageChannel::MessageTask::Run() 	ipc/glue/MessageChannel.cpp:1924
31 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1039
32 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/threads/nsThreadUtils.cpp:521
33 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:125
34 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:301
35 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:319
36 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:299
37 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:158
38 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:230
39 	xul.dll 	XRE_RunAppShell() 	toolkit/xre/nsEmbedFunctions.cpp:880
40 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:269
41 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:319
42 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:299
43 	xul.dll 	XRE_InitChildProcess(int, char** const, XREChildData const*) 	toolkit/xre/nsEmbedFunctions.cpp:705
44 	firefox.exe 	content_process_main(mozilla::Bootstrap*, int, char** const) 	ipc/contentproc/plugin-container.cpp:63
45 	firefox.exe 	NS_internal_main(int, char**, char**) 	browser/app/nsBrowserApp.cpp:285
46 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:115
47 	firefox.exe 	__scrt_common_main_seh 	f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253
48 	kernel32.dll 	BaseThreadInitThunk 	
49 	ntdll.dll 	RtlUserThreadStart

this signature seems to spring up during the firefox 57 cycle - so far i haven't seen it on 58.0a1 yet though.
around half the reports come with moz crash reason "called `Option::unwrap()` on a `None` value", one report at bp-afbb5af8-5894-4ce8-9963-be7070170808 with "Found unexpected value in style struct for border_right_style property: 112".
Summary: Crash in style::properties::cascade → stylo: Crash in style::properties::cascade
Crash Signature: [@ style::properties::cascade] → [@ style::properties::cascade] [@ style::custom_properties::cascade] [@ style::custom_properties::finish_cascade]
Most of these seem to be:

called `Option::unwrap()` on a `None` value

Somewhere in cascade(), but the line number is just at the end of the generated properties::cascade() function, so it's hard to tell exactly where the problem is.

If nothing jumps out at us, a next reasonable step would be to replace the unwrap() calls with expect(), so that we get more useful panic messages.
Priority: -- → P3
We can also recompile the same revision and look at the generated file somewhere like ./obj-x86_64-pc-linux-gnu/toolkit/library/x86_64-unknown-linux-gnu/release/build/style-186da10a8220d0f4/out/properties.rs
(In reply to Simon Sapin (:SimonSapin) from comment #2)
> We can also recompile the same revision and look at the generated file
> somewhere like
> ./obj-x86_64-pc-linux-gnu/toolkit/library/x86_64-unknown-linux-gnu/release/
> build/style-186da10a8220d0f4/out/properties.rs

We don't need to recompile. The crash reports website can display generated files now.
Yeah, the issue is just that the source line doesn't tell us very much, which I'm assuming is related to inlining / optimizations.
I wonder, couldn't unwrap get the exact line it is called? IIRC it can in Rust? There is even a RFC [1] which wants to bring this to other functions. Could we include that information in crash reports?

[1] https://github.com/rust-lang/rfcs/pull/2091
Caller information is not available without resorting to things like libbacktrace. You might be thinking of the line!() macro, which is like __LINE__ in C++.
> There is even a RFC [1] which wants to bring this to other functions

This RFC doesn't just bring it to "other functions", it brings it to unwrap() too. unwrap() doesn't currently do this.

Macros like panic do.
There are 2 crashes in nightly 58 starting with buildid 20171003100226.
All sorts of bad addresses, read and write, and illegal instructions.  Restricting.  sec-high; might be worse given the spew of crash types/addresses

Bobby - can you re-evaluate next steps and priority?  Thanks
Group: core-security
Flags: needinfo?(bobbyholley)
There are some pretty random crashes that don't look like panic, like:

 > Found unexpected value in style struct for border_left_style property: 127

It may be related to the hashmap stuff we're trying to track down in bug 1404791 (and it kinda makes sense if these are due to custom properties).

Are there any patterns in the URLs and such? I know for example youtube uses custom properties quite heavily (I can't access the URLs).
Flags: needinfo?(rjesup)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #10)
> There are some pretty random crashes that don't look like panic, like:

s/don't look like panic/I cannot make any sense of
Flags: needinfo?(rjesup)
Looks like something goes wrong in Stylo, Rust catches the error as it should, and then the panic code screws up. Fixing CSS code to prevent the panic might leave whatever bug is allowing the panic code to go off the rails. That's the one I'm worried about--if it's happened here why couldn't it happen with other Rust panics?
Are you sure? Some of them don't look panicking. There's nothing panic-related on https://crash-stats.mozilla.com/report/index/34eb30d6-7d7e-44e5-9e2a-49b9f0171004, for example.
(And, I know what piece of code is triggering the particular panic about border_foo_style, but that's... not suspicious).

Anyway this morning I opened https://github.com/servo/servo/pull/18745 to reduce sources of panic in the `custom_properties` module, which is the one discussed at the top of this bug.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #10)
> There are some pretty random crashes that don't look like panic, like:
> 
>  > Found unexpected value in style struct for border_left_style property: 127
> 
> It may be related to the hashmap stuff we're trying to track down in bug
> 1404791 (and it kinda makes sense if these are due to custom properties).
> 
> Are there any patterns in the URLs and such? I know for example youtube uses
> custom properties quite heavily (I can't access the URLs).

Most of the urls are related to youtube.
Group: core-security → layout-core-security
Depends on: 1405879
The fact that we're hitting garbage pointers on youtube sure makes me suspect that this is the same hashtable corruption as we're looking at in bug 1404791.

Various in-flight things of interest:
* bug 1405879 - poison the hashtable buffers with a safe value.
* https://github.com/servo/servo/pull/18745 - which cleans up the consuming code, and could change patterns.
* bug 1385971#c40 - Changing the hash function, which _might_ alter the frequency.

Gankro's fuzzer didn't turn up anything, and the crashes are almost non-existent on nightly, which hampers out diagnostic ability. However, I'm hoping that we can uplift the poisoning (bug 1405879) as a security/mitigation measure, which would simultaneously allow us to confirm that theory.

Also note that I may head out on paternity leave soon, which would impact my ability to pursue this.
Flags: needinfo?(bobbyholley)
[Tracking Requested - why for this release]:
sec-high revolving around rust and stylo. sounds like a blocker.
Assignee: nobody → bobbyholley
Priority: P3 → P1
I'm hopeful that, at the very least, bug 1405879 will turn these into safe crashes.
(In reply to Jim Mathies [:jimm] from comment #18)
> [Tracking Requested - why for this release]:
> sec-high revolving around rust and stylo. sounds like a blocker.

Tracking 57/58+ for this issue, especially given the popularity of the main site affected.
Tracking this investigation in bug 1406996.
Priority: P1 → P3
Assignee: bobbyholley → emilio
status-firefox57=wontfix unless someone thinks this bug should block 57
Whiteboard: [sec-triage-backlog]
As part of layout security bug scrub, marking bugs being tracked in the Rust HashMap investigation (bug 1406996) as stalled for now.
Keywords: stalled

In the last 6 months (the full crash-stats window) I don't see any crashes newer than ESR-60 on desktop or Fx62 on Android

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Group: layout-core-security
You need to log in before you can comment on or make changes to this bug.