Closed
Bug 1442804
Opened 7 years ago
Closed 7 years ago
Heap write analysis is detecting but not reporting errors
Categories
(Core :: CSS Parsing and Computation, defect, P2)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
mozilla61
People
(Reporter: sfink, Assigned: sfink)
References
Details
(Keywords: sec-audit, Whiteboard: [adv-main61-][post-critsmash-triage])
Attachments
(6 files)
51.20 KB,
text/plain
|
Details | |
1.51 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
1.17 KB,
patch
|
bholley
:
review+
|
Details | Diff | Splinter Review |
1.09 KB,
patch
|
bholley
:
review+
|
Details | Diff | Splinter Review |
2.25 KB,
patch
|
bholley
:
review+
|
Details | Diff | Splinter Review |
2.43 KB,
patch
|
bholley
:
review+
|
Details | Diff | Splinter Review |
The heap write hazard analysis job is running and currently detecting 43 errors, but is showing up green on treeherder. The status code appears to not be getting propagated properly.
That means it is probably reporting quite a few legitimate races in the current code.
Assignee | ||
Comment 1•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → sphink
Comment 2•7 years ago
|
||
Most of them look sane fwiw.
Updated•7 years ago
|
Assignee | ||
Comment 3•7 years ago
|
||
The problem appears to already be in mozilla-release, where it reports 30 hazards.
Assignee | ||
Updated•7 years ago
|
status-firefox-esr52:
--- → unaffected
tracking-firefox-esr52:
--- → ?
Updated•7 years ago
|
status-firefox58:
--- → wontfix
tracking-firefox59:
--- → +
Comment 4•7 years ago
|
||
So, just took a quick look:
* Gecko_ContentList_AppendAll is mt-only.
* I don't know what Gecko_GetNextStyleChild is complaining about, but nsCanvasFrame::AppendAnonymousContentTo seems thread-safe to me, assuming the argument is owned, which it is.
* Gecko_GetLookAndFeelSystemColor: We explicitly call LookAndFeel::NativeInit on the main thread before calling into Servo to prevent this race:
https://searchfox.org/mozilla-central/rev/908e3662388a96673a0fc00b84c47f832a5bea7c/layout/style/ServoStyleSet.cpp#461
The rest of them look like from bug 1418161 which explicitly handles the race, or from bug 1410276, which given it runs during destruction of the string buffer I _think_ it's ok (since the thread freeing the last reference must be necessarily the only thread referencing the string buffer still for the code to be sound).
Assignee | ||
Comment 5•7 years ago
|
||
The reporting error was introduced in bug 1339989. I did 'exit_status = 1' in a shell script. Which means to run a command named 'exit_status' with two arguments, '=' and '1', rather than setting the variable exit_status as I intended. :(
Assignee | ||
Comment 6•7 years ago
|
||
Not sure who to pick for review, but it's trivial, and it can't land until we do something about the hazards anyway.
Attachment #8955721 -
Flags: review?(ted)
Comment 7•7 years ago
|
||
I'm going to call _this_ bug "sec-audit" because it sounds like you're still working through the list. If some or all of those errors need to be fixed we might want to spawn new bugs for each instance just to keep it all straight and can rate the security severity of each as appropriate.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #4)
> The rest of them look like [...] which explicitly handles the race,
Is there a way to annotate the code so the tool won't warn in these cases? Let's say we fix some but still end up with a report that has 28 "OK" errors every time it runs. We likely won't notice when 29 and 30 creep in there that might be real problems.
Keywords: sec-audit
Comment 8•7 years ago
|
||
If it ends up being easier to patch them all in this bug (because "no more errors" ensures we didn't forget one) then please clear sec-audit and replace it with the severity of the worst issue we're fixing. Probably sec-high for a csectype-race.
Assignee | ||
Comment 9•7 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #7)
> I'm going to call _this_ bug "sec-audit" because it sounds like you're still
> working through the list.
I think that's right. I attached a patch to fix the reporting problem, but that patch fixes a sec-audit problem, nothing more.
> If some or all of those errors need to be fixed we
> might want to spawn new bugs for each instance just to keep it all straight
> and can rate the security severity of each as appropriate.
Agreed.
> (In reply to Emilio Cobos Álvarez [:emilio] from comment #4)
> > The rest of them look like [...] which explicitly handles the race,
>
> Is there a way to annotate the code so the tool won't warn in these cases?
> Let's say we fix some but still end up with a report that has 28 "OK" errors
> every time it runs. We likely won't notice when 29 and 30 creep in there
> that might be real problems.
Yes, and that's exactly what we did when this was reporting correctly. There are annotation mechanisms, but we need to look at each one to figure out where to apply the annotation. Most of them are obvious. We want to get this back to zero to make it easy to know what's new.
The way this is set up, we *could* allow 28 errors through, in which case it would only turn the job red if you introduced a 29th, but then it would be a pain to look through the log to identify which is the new one. So we prefer to keep the allowed count at zero. Which worked great until I broke the reporting and it started lying about being green.
Comment 10•7 years ago
|
||
> I don't know what Gecko_GetNextStyleChild is complaining about
Presumably the virtual call to AppendAnonymousContentTo. I'm _guessing_ that the nsCanvasFrame version is just one possible target that was picked for that virtual call? In general, we could be calling any frame's AppendAnonymousContentTo here.
Comment 11•7 years ago
|
||
Comment on attachment 8955721 [details] [diff] [review]
Fix setting exit_status
Review of attachment 8955721 [details] [diff] [review]:
-----------------------------------------------------------------
That sucks, I hate shell scripting. Have you ever run this script through shellcheck? Every time I do that to any script I've written it points out that like 90% of my script is wrong.
Attachment #8955721 -
Flags: review?(ted) → review+
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #11)
> Comment on attachment 8955721 [details] [diff] [review]
> Fix setting exit_status
>
> Review of attachment 8955721 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> That sucks, I hate shell scripting. Have you ever run this script through
> shellcheck? Every time I do that to any script I've written it points out
> that like 90% of my script is wrong.
I just started using shellcheck via flycheck, and it was correctly marking this line red. But I'm so used to seeing the green warnings that I didn't pay attention and figured it out the old way instead. :-(
Assignee | ||
Comment 14•7 years ago
|
||
I haven't really thought through this one. Is it really ok to whitelist atomics? I mean, they tend to be used when you know there are multiple threads around, and the order of operations doesn't matter, but it's not like it would be hard to create a race condition with atomics. Is this a good assumption or not?
I did this because my local run got a huge number of hazards from __atomic_compare_exchange_8 or something like that.
Attachment #8956498 -
Flags: review?(bobbyholley)
Assignee | ||
Comment 15•7 years ago
|
||
I was originally planning to file separate bugs for all the problems found, but most of them are relatively minor annotation tweaks.
I'm sort of assuming that the mCanary field is intended to be racy. It gets set just before you free things, after all.
Attachment #8956499 -
Flags: review?(bobbyholley)
Assignee | ||
Comment 16•7 years ago
|
||
Also, I'm r?bholley on everything, on the assumption that you'll distribute them out if needed.
If I understand the comment correctly, GetAutoArrayBuffer gives back a pointer that is another way of accessing 'this', and so should have the same safety. Uh... I don't actually remember which hazard this was for, unless it was for the mHdr one, but I have an additional fix for that coming up.
Attachment #8956500 -
Flags: review?(bobbyholley)
Assignee | ||
Comment 17•7 years ago
|
||
Example hazard:
Error: Field write nsTArrayHeader.mCapacity
Location: _ZN13nsTArray_baseI27nsTArrayInfallibleAllocator25nsTArray_CopyWithMemutilsE14ShrinkCapacityEmm$void nsTArray_base<Alloc, Copy>::ShrinkCapacity(nsTArray_base<Alloc, Copy>::size_type, size_t) [with Alloc = nsTArrayInfallibleAllocator; Copy = nsTArray_CopyWithMemutils; nsTArray_base<Alloc, Copy>::size_type = long unsigned int; size_t = long unsigned int] @ obj-analyzed/dist/include/nsTArray-inl.h#241 ### SafeArguments: this
Stack Trace:
_ZN13nsTArray_ImplI6RefPtrI6nsAtomE27nsTArrayInfallibleAllocatorE7CompactEv$void nsTArray_Impl<E, Alloc>::Compact() [with E = RefPtr<nsAtom>; Alloc = nsTArrayInfallibleAllocator] @ xpcom/ds/nsTArray.h#2005 ### SafeArguments: this
_ZN13nsTArray_ImplI6RefPtrI6nsAtomE27nsTArrayInfallibleAllocatorE5ClearEv$void nsTArray_Impl<E, Alloc>::Clear() [with E = RefPtr<nsAtom>; Alloc = nsTArrayInfallibleAllocator] @ xpcom/ds/nsTArray.h#1785 ### SafeArguments: this
Gecko_nsStyleSVG_SetContextPropertiesLength @ layout/style/ServoBindings.cpp#2075 ### SafeArguments: aSvg
coming from the lines of code
size_type size = sizeof(Header) + length * aElemSize;
void* ptr = nsTArrayFallibleAllocator::Realloc(mHdr, size);
if (!ptr) {
return;
}
mHdr = static_cast<Header*>(ptr);
mHdr->mCapacity = length;
The analysis thinks 'ptr' is safe because it comes directly from a realloc, but once it's stored into a member (mHdr) that becomes irrelevant. However, it seems like an nsTArray's header is owned by the nsTArray, so if the array itself is safe, then its header should be too. This patch allows mHdr to inherit its safety from 'this'.
Attachment #8956502 -
Flags: review?(bobbyholley)
Comment 18•7 years ago
|
||
Comment on attachment 8956498 [details] [diff] [review]
heap write analysis: whitelist all atomics
Review of attachment 8956498 [details] [diff] [review]:
-----------------------------------------------------------------
Yeah, I think the reduced friction of people introducing new atomics without having to tweak the analysis is worth it.
Attachment #8956498 -
Flags: review?(bobbyholley) → review+
Updated•7 years ago
|
Attachment #8956500 -
Flags: review?(bobbyholley) → review+
Comment 19•7 years ago
|
||
Comment on attachment 8956499 [details] [diff] [review]
heap write analysis: explicitly whitelist stringbuffer canary field
Review of attachment 8956499 [details] [diff] [review]:
-----------------------------------------------------------------
It's not racey because it only gets written on the last release, when the caller has exclusive access.
Attachment #8956499 -
Flags: review?(bobbyholley) → review+
Comment 20•7 years ago
|
||
Comment on attachment 8956502 [details] [diff] [review]
heap write analysis: nsTArray owns its header
Review of attachment 8956502 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks for fixing all of this!
Attachment #8956502 -
Flags: review?(bobbyholley) → review+
Updated•7 years ago
|
Flags: needinfo?(bobbyholley)
Updated•7 years ago
|
Updated•7 years ago
|
Group: core-security → layout-core-security
Assignee | ||
Updated•7 years ago
|
Keywords: leave-open
Comment 22•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ae1535c7890b8d3e12e48a21ec3841ba8e09fc3
https://hg.mozilla.org/integration/mozilla-inbound/rev/277393616f91d3a05a264b318c3ffb3780c6d5e5
https://hg.mozilla.org/integration/mozilla-inbound/rev/228f3feb90616a96a70b3cd90217c37325e996b7
https://hg.mozilla.org/integration/mozilla-inbound/rev/c407e9656999eec919da180ba4ddd0b4e456971c
https://hg.mozilla.org/mozilla-central/rev/3ae1535c7890
https://hg.mozilla.org/mozilla-central/rev/277393616f91
https://hg.mozilla.org/mozilla-central/rev/228f3feb9061
https://hg.mozilla.org/mozilla-central/rev/c407e9656999
Comment 23•7 years ago
|
||
Steve, this bug is marked as tracking 60 & 61. As we're starting to get late in the cycle, is there more work planned before the trains start moving?
Flags: needinfo?(sphink)
Updated•7 years ago
|
Priority: -- → P2
Assignee | ||
Comment 24•7 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #23)
> Steve, this bug is marked as tracking 60 & 61. As we're starting to get late
> in the cycle, is there more work planned before the trains start moving?
There is certainly more work planned, but I don't expect to get to it soon, so we should probably fork bugs. One for making the job properly report errors, one (probably this one) for the fixes already made and landed, and at least one for further fixes (that are required before we can allow the job to report errors again.) But I'm not sure if that's the most convenient way for you tracking people, so I won't create them until you weigh in.
Flags: needinfo?(sphink) → needinfo?(ryanvm)
Comment 25•7 years ago
|
||
That all sounds good to me. Given where we are in the cycle, I like the idea of spinning off remaining work to new bugs which can be tracked independently of what's already been landed here.
Flags: needinfo?(ryanvm)
Comment 26•7 years ago
|
||
(In reply to Steve Fink [:sfink] [:s:] (PTO Apr9-12) from comment #24)
> (In reply to Ryan VanderMeulen [:RyanVM] from comment #23)
> > Steve, this bug is marked as tracking 60 & 61. As we're starting to get late
> > in the cycle, is there more work planned before the trains start moving?
>
> There is certainly more work planned, but I don't expect to get to it soon,
> so we should probably fork bugs. One for making the job properly report
> errors, one (probably this one) for the fixes already made and landed, and
> at least one for further fixes (that are required before we can allow the
> job to report errors again.) But I'm not sure if that's the most convenient
> way for you tracking people, so I won't create them until you weigh in.
Wait, I'm confused. This is all fixed and landed on FF60, which is currently on beta, right? Is this about uplifting the patches to 59 or something? I don't think we need to do that...
Flags: needinfo?(sphink)
Comment 27•7 years ago
|
||
Per IRC discussion, this bug is fixed and further follow-ups will go into new bugs. It's too late for any Fx60 backports at this point, but is there still something we'd want to land on ESR60 for the 60.1 release shipping in June?
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox-esr60:
--- → affected
tracking-firefox-esr60:
--- → ?
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Assignee | ||
Comment 28•7 years ago
|
||
I'd say wontfix it. Everything so far has been false positives.
Flags: needinfo?(sphink)
Updated•7 years ago
|
Assignee | ||
Comment 29•7 years ago
|
||
I filed bug 1459323 for the continuation.
Updated•7 years ago
|
Group: layout-core-security → core-security-release
Updated•6 years ago
|
Whiteboard: [adv-main61-]
Updated•6 years ago
|
Flags: qe-verify-
Whiteboard: [adv-main61-] → [adv-main61-][post-critsmash-triage]
Updated•5 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•