ThreadSanitizer: data race [@ mozilla::layers::WebRenderBridgeParent::ClearResources] vs. [@ mozilla::layers::CompositorBridgeParent::GetCompositorBridgeParentFromWindowId]
Categories
(Core :: Graphics: WebRender, defect)
People
(Reporter: tsmith, Assigned: aosmond)
References
(Blocks 2 open bugs)
Details
(Keywords: csectype-race, sec-moderate)
Attachments
(2 files)
17.67 KB, text/plain
48 bytes, text/x-phabricator-request (tjr: approval-mozilla-beta+, RyanVM: approval-mozilla-release+, tjr: sec-approval+)
The attached crash information was detected by ThreadSanitizer while fuzzing build mozilla-central 20210409-7bc2dd06085f. Unfortunately a test case is not available.
General information about TSan reports
Why fix races?
Data races are undefined behavior and can cause crashes as well as correctness issues. Compiler optimizations can cause racy code to have unpredictable and hard-to-reproduce behavior.
Rating
If you think this race can cause crashes or correctness issues, it would be great to rate the bug appropriately as P1/P2 and/or indicate this in the bug. This makes it a lot easier for us to assess the actual impact of these reports and whether they are helpful to you.
False Positives / Benign Races
Typically, races reported by TSan are not false positives [1], but it is possible that the race is benign. Even in this case it would be nice to come up with a fix if it is easily doable and does not regress performance. Every race that we cannot fix will have to remain on the suppression list and slow down the overall TSan performance. Also note that seemingly benign races can be harmful, depending on the compiler, optimizations and the architecture [2][3].
[1] One major exception is the involvement of uninstrumented code from third-party libraries.
[2] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[3] How to miscompile programs with "benign" data races: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
Suppressing unfixable races
If the bug cannot be fixed, then a runtime suppression needs to be added in mozglue/build/TsanOptions.cpp. The suppressions match on the full stack, so it should be picked such that it is unique to this particular race. The bug number of this bug should also be included so we have some documentation on why this suppression was added.
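As a rough illustration of the shape such an entry might take, the sketch below uses TSan's `__tsan_default_suppressions` hook and the standard `race:` suppression syntax. The function name in the pattern and the bug number placeholder are hypothetical, and the actual layout of mozglue/build/TsanOptions.cpp may differ from this sketch.

```cpp
#include <cassert>
#include <cstring>

// TSan runtime hook: the returned string is parsed as a suppression list,
// one "type:pattern" entry per line. Built without TSan, this is just an
// ordinary function that nothing calls.
extern "C" const char* __tsan_default_suppressions() {
  return
      // Bug NNNNNNN (placeholder): cite the bug that justifies the entry.
      // The pattern below is a hypothetical name, not a real Gecko symbol.
      "race:HypotheticalNamespace::HypotheticalFunction\n";
}
```

Keeping the bug reference next to each pattern makes it possible to remove the entry once the underlying race is fixed.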
Comment 1•4 years ago
From the stacks, it looks like WebRenderBridgeParent::RecvShutdownSync() is causing WebRenderBridgeParent::ClearResources() to run on the compositor thread, and that's racing with some actual use of a WebRender data structure on a WRSceneBuilder#6 thread.
Comment 2•4 years ago (Assignee)
The scene builder thread really shouldn't be calling that API to get the CompositorBridgeParent. I suspect we can defer this to the compositor thread and avoid the race.
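The idea of deferring the lookup can be sketched in standalone C++. This is not the actual patch and does not use Gecko's threading APIs: `OwnerThread`, `BridgeRegistry` and `RunDemo` are hypothetical names standing in for the compositor thread, the window-id-to-bridge map, and a scene builder notification. The worker thread only ships the window id, and the map lookup runs on the thread that owns the map, serialized with teardown.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Hypothetical stand-in for a thread with its own task queue.
class OwnerThread {
 public:
  OwnerThread() : mWorker([this] { Run(); }) {}
  ~OwnerThread() {
    Dispatch(nullptr);  // An empty task tells Run() to stop.
    mWorker.join();
  }
  void Dispatch(std::function<void()> aTask) {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mTasks.push(std::move(aTask));
    }
    mCondVar.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mMutex);
        mCondVar.wait(lock, [this] { return !mTasks.empty(); });
        task = std::move(mTasks.front());
        mTasks.pop();
      }
      if (!task) return;
      task();
    }
  }
  std::mutex mMutex;
  std::condition_variable mCondVar;
  std::queue<std::function<void()>> mTasks;
  std::thread mWorker;  // Declared last so the queue exists before Run() starts.
};

// After setup, this state is touched only by tasks on the owner thread.
struct BridgeRegistry {
  std::map<uint64_t, std::string> bridges;  // window id -> bridge (hypothetical)
  int notified = 0;
};

int RunDemo() {
  BridgeRegistry registry;
  registry.bridges[1] = "bridge-for-window-1";
  {
    OwnerThread compositor;
    std::thread sceneBuilder([&] {
      const uint64_t windowId = 1;
      // Do not read registry.bridges here; send the id to the owner instead.
      compositor.Dispatch([&registry, windowId] {
        auto it = registry.bridges.find(windowId);
        if (it != registry.bridges.end()) ++registry.notified;
      });
    });
    sceneBuilder.join();
    // Teardown is also a task, so it is ordered after the lookup above.
    compositor.Dispatch([&registry] { registry.bridges.clear(); });
  }  // ~OwnerThread runs the queued tasks, then joins the worker.
  return registry.notified;
}
```

Because every access to the map happens in a task on one thread, the lookup and the teardown cannot overlap, and no lock around the map itself is needed in this sketch.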
Comment 3•4 years ago (Assignee)
We have a similar issue at the other call site at https://searchfox.org/mozilla-central/rev/f018480dfed4fc583703a5770a6db9ab9dc0fb99/gfx/webrender_bindings/RenderThread.cpp#1214
Comment 4•4 years ago (Assignee)
Comment 5•4 years ago
This is a shutdown race, so I'm going to mark it as sec-moderate.
Comment 6•4 years ago (Assignee)
Comment on attachment 9217909 [details]
Bug 1704227.
Security Approval Request
- How easily could an exploit be constructed based on the patch?: Given how simple the function we should be calling on the compositor thread is, I imagine they can figure out there is a race with CompositorBridgeParent/WebRenderAPI somewhat easily.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
- Which older supported branches are affected by this flaw?: All, disabled in ESR by default
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: No
- If not, how different, hard to create, and risky will they be?: Same patch should apply
- How likely is this patch to cause regressions; how much testing does it need?: Minimal chance of regressions
Comment 7•4 years ago (Assignee)
FWIW, I believe a similar race could occur when a window is closed, not just shutdown.
Comment 8•4 years ago
Thanks for the clarification. I think I'll just leave it as sec-moderate as it still seems tricky to exploit, but maybe it should be sec-high.
Comment 9•4 years ago
Comment on attachment 9217909 [details]
Bug 1704227.
Approved to land and uplift
Comment 10•4 years ago
https://hg.mozilla.org/integration/autoland/rev/a9690e74a6bdfa95138c5a489ece78e72d718af7
https://hg.mozilla.org/mozilla-central/rev/a9690e74a6bd
Comment 11•4 years ago
uplift
Comment 12•4 years ago (Assignee)
Comment on attachment 9217909 [details]
Bug 1704227.
Beta/Release Uplift Approval Request
- User impact if declined: Potential vulnerability and related crash at low rates observed in the wild
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The scene builder thread is posting a message to the compositor thread to do some work. We just move more of the work to the compositor thread, which is normally where that sort of thing is done, and which by its very nature avoids all sorts of potential races.
- String changes made/needed:
Comment 13•4 years ago
Comment on attachment 9217909 [details]
Bug 1704227.
Approved for 88.0.1.
Comment 14•4 years ago
uplift