Closed
Bug 1321357
Opened 8 years ago
Closed 8 years ago
Fix crash in bug 1321066 more thoroughly (so it doesn't crash at all)
Categories
(Core :: SVG, defect)
Core
SVG
Tracking
()
RESOLVED
FIXED
mozilla53
People
(Reporter: dholbert, Assigned: birtles)
References
()
Details
(Keywords: crash, sec-other, testcase, Whiteboard: [adv-main51-])
Crash Data
Attachments
(2 files, 6 obsolete files)
3.17 KB,
patch
|
dholbert
:
review+
jcristau
:
approval-mozilla-aurora+
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
4.81 KB,
patch
|
birtles
:
review+
|
Details | Diff | Splinter Review |
In bug 1321066, we crashed because:
- We're iterating across a nsTArray, using a raw pointer to its Elements(). (Actually it's a nsTPriorityQueue, but that's backed by a nsTArray.)
- During that iteration, we add entries to the array.
- This ends up reallocating the array's buffer.
- So from that point on, our iteration is walking through freed memory.
For now, we're taking a band-aid fix that detects the array-modification and safely crashes at that point. This should prevent any badness. HOWEVER: we should dig in & fix this more thoroughly, so that the testcases don't crash at all. I'm filing this bug on that part.
The testcases in bug 1321066 should serve as testcases for this bug as well.
POSSIBLE FIX STRATEGIES:
* Strategy #1: If we're sure that these modifications aren't supposed to be possible during iteration: we need to figure out why they're happening & how to avoid them. (Note that we have a (debug-only) assertion that indicates that we expect the array to be un-changed during iteration right now. So I think this might be feasible.)
* Strategy #2: Change nsTPriorityQueue (or at least the one used here) to be backed by a nsTObserverArray, which *is* safe to modify during iteration. (If we follow this strategy, we could perhaps remove the release-build assertions added in bug 1321066, since the modifications caught by those assertions would become safe.)
Birtles, I'm hoping you're up for looking into this -- IIRC the SMIL milestone stuff is super-complicated, and you're the most likely to remember how it's supposed to fit together. :) Assigning to you for now.
Reporter | ||
Comment 1•8 years ago
|
||
(I've marked this bug as security sensitive, but it should be safe to open this up once we've opened up bug 1321066. This is just a followup for that bug, and it has the same level of sensitivity.)
Reporter | ||
Comment 2•8 years ago
|
||
OS: Unspecified → All
Hardware: Unspecified → All
Reporter | ||
Comment 3•8 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #0)
> POSSIBLE FIX STRATEGIES:
> * Strategy #1: If we're sure that these modifications aren't supposed to be
> possible during iteration: we need to figure out why they're happening & how
> to avoid them. (Note that we have a (debug-only) assertion that indicates
> that we expect the array to be un-changed during iteration right now.
This ^^ was referring to the following MOZ_ASSERT( (w/ comparison to queueLength) here:
=====
const MilestoneEntry* p = mMilestoneEntries.Elements();
#if DEBUG
uint32_t queueLength = mMilestoneEntries.Length();
#endif
while (p < mMilestoneEntries.Elements() + mMilestoneEntries.Length()) {
mozilla::dom::SVGAnimationElement* elem = p->mTimebase.get();
elem->TimedElement().HandleContainerTimeChange();
MOZ_ASSERT(queueLength == mMilestoneEntries.Length(),
"Call to HandleContainerTimeChange resulted in a change to the "
"queue of milestones");
++p;
=====
https://dxr.mozilla.org/mozilla-central/rev/a69583d2dbc6fdc18f63761a89cf539c356668be/dom/smil/nsSMILTimeContainer.cpp#329
Assignee | ||
Comment 4•8 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #0)
> Birtles, I'm hoping you're up for looking into this -- IIRC the SMIL
> milestone stuff is super-complicated, and you're the most likely to remember
> how it's supposed to fit together. :) Assigning to you for now.
Absolutely! Thank you so much for all your work on the emergency fix. I saw the bug report after you'd already reviewed the patch for it and didn't have anything to add at that point. I was thinking about how to fix this properly last night but I agree that both of those approaches appear promising (although fixing nsTPriorityQueue seems safest at this stage).
Updated•8 years ago
|
Group: layout-core-security
Assignee | ||
Comment 5•8 years ago
|
||
Having looked into that assertion, I don't think we can reasonably expect the set of milestones won't change from calling nsSMILTimedElement::HandleContainerTimeChange.
I think there are at least two possible fixes:
1. Iterate over the milestones once and pull out the elements from the milestones into a separate array (containing owning references to the elements). Then iterate over the array of elements and call HandleContainerTimeChange on each one.
2. Change nsTPriorityQueue to use nsTObserverArray and expose the necessary iterators (at least EndLimitedIterator for this case).
(2) seems more robust so I'll make up a patch for that. I might make up a patch for (1) as well so we can compare (and in case it's more suitable for uplifting).
Updated•8 years ago
|
Flags: in-testsuite?
Comment 6•8 years ago
|
||
FYI, I just posted a more real world like test case with an actual animation in bug 1321066.
Comment 7•8 years ago
|
||
(In reply to Yuhong Bao from comment #6)
> FYI, I just posted a more real world like test case with an actual animation
> in bug 1321066.
Thanks. I added a crash signature here in case it starts happening a lot in the wild.
Crash Signature: [@ nsSMILTimeContainer::AddMilestone ]
status-firefox50:
--- → affected
status-firefox51:
--- → affected
status-firefox52:
--- → affected
status-firefox-esr45:
--- → affected
Assignee | ||
Comment 8•8 years ago
|
||
This took a little longer than I anticipated due to bug 1321468.
The patch ended up being larger than I anticipated too so I'll have a look at approach 1 from comment 5 to see how it looks.
Assignee | ||
Comment 10•8 years ago
|
||
Assignee | ||
Comment 11•8 years ago
|
||
Made a few tweaks to the comment.
Attachment #8816027 -
Attachment is obsolete: true
Assignee | ||
Comment 12•8 years ago
|
||
Fix a namespace issue.
Attachment #8816010 -
Attachment is obsolete: true
Assignee | ||
Comment 13•8 years ago
|
||
(In reply to Brian Birtles (:birtles) from comment #10)
> Approach 2:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=c7dabe4d8e3b82e407dfe1f990a4c08609b2ccf3
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b52f53f137c030981cee7a082e4314165590d798
Assignee | ||
Comment 14•8 years ago
|
||
Daniel, I've prepared two different approaches to this and currently I'm thinking we could:
* Land approach 1 on branches since it's quite simple. It keeps the
mHoldingEntries check too just in case, but we might decide to drop that.
* Land approach 2 on trunk and split it up a little:
- Add SwapElementsAt to nsTObserverArray
- Port nsTPriorityQueue to nsTObserverArray and update the two call sites in
nsSMILTimeContainer that rely on Elements() since that would no longer be
available.
(We could possibly add a couple more gtests to TestPriorityQueue.cpp but
I suspect they'd just be testing nsTObserverArray functionality.)
- Drop mHoldingEntries from nsSMILTimeContainer and add the crashtest
If you agree, I'll go ahead and split up part 2 and put it all up for review.
Flags: needinfo?(dholbert)
Assignee | ||
Comment 15•8 years ago
|
||
Oh, that's assuming try looks good!
Assignee | ||
Comment 16•8 years ago
|
||
Actually, I think approach 2 is not ultimately right. If we do hit this situation where we end up adding milestones while iterating over them, then, because we're using a priority queue, we may end up shuffling the array around and any active iterators will visit the wrong set of elements. So, while it won't crash, it won't produce the correct result either.
So, I think we should just land approach 1. Then, in a separate patch, perhaps even a separate bug and on trunk only:
- Drop mHoldingEntries
- Harden nsTPriorityQueue by basing it on nsTObserverArray and only exposing iterators from nsTObserverArray instead of Elements().
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(dholbert)
Attachment #8816028 -
Flags: review?(dholbert)
Reporter | ||
Comment 17•8 years ago
|
||
Comment on attachment 8816028 [details] [diff] [review]
Approach 1 - Copy the elements to a separate array before iterating over them (v2)
Review of attachment 8816028 [details] [diff] [review]:
-----------------------------------------------------------------
Commit message nit:
(1) "This patch leaves mHoldingEntries is place for now"
s/is/in/
"we should probably remove it when/after landing."
(2) IMO, we shouldn't remove it -- we should just convert it to debug-only:
- We should wrap all its usages with #ifdef DEBUG (not making it DebugOnly<> -- since IIRC DebugOnly takes up space as a member-variable)
- We should convert the MOZ_RELEASE_ASSERTs to just MOZ_ASSERT.
::: dom/smil/nsSMILTimeContainer.cpp
@@ +330,4 @@
> ++p;
> }
> +
> + p = nullptr;
The "p = nullptr" here is a bit awkward, since we never use "p" after this point. I think it's an attempt at enforcing scope -- so let's just replace the "while" loop with a "for" loop, so that p is *actually* scoped without the need for this explicit nulling.
I think this should do it:
for (const MilestoneEntry* p = mMilestoneEntries.Elements();
p < mMilestoneEntries.Elements() + mMilestoneEntries.Length();
++p) {
elems.AppendElement(p->mTimebase.get());
}
What do you think?
Reporter | ||
Comment 18•8 years ago
|
||
(In reply to Brian Birtles (:birtles) from comment #16)
> Actually, I think approach 2 is not ultimately right. If we do hit this
> situation where we end up adding milestones while iterating over them, then,
> because we're using a priority queue, we may end up shuffling the array
> around and any active iterators will visit the wrong set of elements. So,
> while it won't crash, it won't produce the correct result either.
Agreed.
> So, I think we should just land approach 1.
Agreed (modulo review comments).
> Then, in a separate patch, perhaps even a separate bug and on trunk only:
>
> - Drop mHoldingEntries
I think we should preserve mHoldingEntries as a debug-only thing, per comment 17.
> - Harden nsTPriorityQueue by basing it on nsTObserverArray and only exposing
> iterators from nsTObserverArray instead of Elements().
I'm not sure this is a good idea at this point. IIUC we won't need to have nsTObserverArray under the hood anymore, for this usage at least, after we've taken Approach 1. And I expect that the conversion to nsTObserverArray isn't free, so it doesn't make sense that we'd shake up this implementation & take a runtime penalty for no clear benefit. (If you disagree, though, feel free to file the followup bug on this part, and we can continue this discussion over there.)
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(bbirtles)
Assignee | ||
Comment 19•8 years ago
|
||
Yes, thinking about this overnight I came to the same conclusion that it's probably not worth fiddling with nsTPriorityQueue. I'll make the changes to Approach 1.
Flags: needinfo?(bbirtles)
Assignee | ||
Comment 20•8 years ago
|
||
MozReview-Commit-ID: GjkXYlhoeoy
Attachment #8816317 -
Flags: review?(dholbert)
Assignee | ||
Updated•8 years ago
|
Attachment #8816028 -
Attachment is obsolete: true
Attachment #8816028 -
Flags: review?(dholbert)
Assignee | ||
Comment 21•8 years ago
|
||
MozReview-Commit-ID: 2plUpGMj1FB
Attachment #8816318 -
Flags: review?(dholbert)
Assignee | ||
Updated•8 years ago
|
Attachment #8816042 -
Attachment is obsolete: true
Reporter | ||
Comment 22•8 years ago
|
||
Comment on attachment 8816317 [details] [diff] [review]
part 1 - Copy elements from mMilestoneEntries before iterating over them
Review of attachment 8816317 [details] [diff] [review]:
-----------------------------------------------------------------
This is basically r+ with nits addressed, but I'll mark it r- and take a look at the final patch to be on the cautious side.
::: dom/smil/nsSMILTimeContainer.cpp
@@ +318,5 @@
> // with significant transitions yet to fire should have their next milestone
> // registered. Other timed elements don't matter.
> +
> + // Copy the timed elements to a separate array before calling
> + // HandleContainerTimeChange on each them in case doing so mutates
Typo -- s/each them/each of them/
@@ +329,5 @@
> + p < mMilestoneEntries.Elements() + mMilestoneEntries.Length();
> + ++p) {
> + elems.AppendElement(p->mTimebase.get());
> + }
> + mHoldingEntries = false;
Technically, "mHoldingEntries = false" here is bogus. You're assuming that it was false before we set it to true, but you don't actually know that. We should really be letting |saveHolding| restore the saved value here.
I think the only way to do that is to force the AutoRestore to go out of scope at this point -- i.e. to add an extra layer of curly-braces, starting from
nsTArray<RefPtr<mozilla::dom::SVGAnimationElement>> elems;
// Scope for saveHolding:
{
AutoRestore<bool> saveHolding(mHoldingEntries);
mHoldingEntries = true;
for (...) {
...
}
}
Attachment #8816317 -
Flags: review?(dholbert) → review-
Reporter | ||
Comment 23•8 years ago
|
||
Comment on attachment 8816318 [details] [diff] [review]
part 2 - Make nsSMILTimeContainer::mHoldingEntries debug-only
Review of attachment 8816318 [details] [diff] [review]:
-----------------------------------------------------------------
r=me on part 2, though it'll need to be rebased on top of the updates to "part 1" of course.
::: dom/smil/nsSMILTimeContainer.cpp
@@ +340,2 @@
> mHoldingEntries = false;
> +#endif
(This "mHoldingEntries = false;" assignment won't exist anymore, per my previous comment, so this chunk of the patch won't be needed.)
Attachment #8816318 -
Flags: review?(dholbert) → review+
Assignee | ||
Comment 24•8 years ago
|
||
MozReview-Commit-ID: GjkXYlhoeoy
Attachment #8816325 -
Flags: review?(dholbert)
Assignee | ||
Updated•8 years ago
|
Attachment #8816317 -
Attachment is obsolete: true
Assignee | ||
Comment 25•8 years ago
|
||
MozReview-Commit-ID: 2plUpGMj1FB
Attachment #8816326 -
Flags: review?(dholbert)
Assignee | ||
Updated•8 years ago
|
Attachment #8816318 -
Attachment is obsolete: true
Assignee | ||
Comment 26•8 years ago
|
||
Comment on attachment 8816326 [details] [diff] [review]
part 2 - Make nsSMILTimeContainer::mHoldingEntries debug-only
Carry forward r+ from comment 23.
Attachment #8816326 -
Flags: review?(dholbert) → review+
Reporter | ||
Comment 27•8 years ago
|
||
Comment on attachment 8816325 [details] [diff] [review]
part 1 - Copy elements from mMilestoneEntries before iterating over them
Review of attachment 8816325 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks - this looks good! r=me
Attachment #8816325 -
Flags: review?(dholbert) → review+
Assignee | ||
Comment 28•8 years ago
|
||
Comment 29•8 years ago
|
||
Pushed by bbirtles@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c1e9210ea85
part 1 - Copy elements from mMilestoneEntries before iterating over them; r=dholbert
https://hg.mozilla.org/integration/mozilla-inbound/rev/7004ffd040dc
part 2 - Make nsSMILTimeContainer::mHoldingEntries debug-only; r=dholbert
Comment 30•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/8c1e9210ea85
https://hg.mozilla.org/mozilla-central/rev/7004ffd040dc
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Assignee | ||
Comment 31•8 years ago
|
||
Comment on attachment 8816325 [details] [diff] [review]
part 1 - Copy elements from mMilestoneEntries before iterating over them
Approval Request Comment
[Feature/Bug causing the regression]: bug 474743
[User impact if declined]: (Safe) crash with some content
[Is this code covered by automated tests?]: Yes (included in this patch)
[Has the fix been verified in Nightly?]: It has landed in central, but not shipped in Nightly yet
[Needs manual test from QE? If yes, steps to reproduce]: Who is OE? In any case, I don't think so since the patch includes an automated test. If someone wants to verify, the test steps from bug 1321066 should apply although in this case we don't expect a crash.
[List of other uplifts needed for the feature/fix]: Just this patch is enough (part 2 is optional)
[Is the change risky?]: Not particularly.
[Why is the change risky/not risky?]: It keeps the safe-guard checks introduced in bug 1321066.
[String changes made/needed]: None
Attachment #8816325 -
Flags: approval-mozilla-beta?
Attachment #8816325 -
Flags: approval-mozilla-aurora?
Comment 32•8 years ago
|
||
What about 45ESR, since the (safe) crash is still there?
Reporter | ||
Comment 33•8 years ago
|
||
(In reply to Cameron Kaiser [:spectre] from comment #32)
> What about 45ESR, since the (safe) crash is still there?
This is just a stability fix at this point -- i.e. a fix fora safe crash (and one that's unlikely to be hit by normal web content, since SMIL is so rarely used, and syncbase-timing-with-scripting is even more rarely used).
And the ESR FAQ page says:
> Maintenance of each ESR, through point releases, is limited to
> high-risk/high-impact security vulnerabilities [...].
> Backports of any functional enhancements and/or stability fixes
> are not in scope.
https://www.mozilla.org/en-US/firefox/organizations/faq/
So I think that makes this out-of-scope for ESR.
Updated•8 years ago
|
Comment 34•8 years ago
|
||
I'll backport for TenFourFox separately.
Comment 35•8 years ago
|
||
Comment on attachment 8816325 [details] [diff] [review]
part 1 - Copy elements from mMilestoneEntries before iterating over them
avoid crashing with recent sec fix, take in aurora52
Attachment #8816325 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Reporter | ||
Comment 37•8 years ago
|
||
Yup, just needs rebasing for the crashtest.list insertion.
I've got an aurora clone handy -- I'll rebase & land.
Reporter | ||
Comment 38•8 years ago
|
||
Flags: needinfo?(bbirtles)
Comment 39•8 years ago
|
||
Comment on attachment 8816325 [details] [diff] [review]
part 1 - Copy elements from mMilestoneEntries before iterating over them
let's fix this crash in beta51
Attachment #8816325 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Reporter | ||
Comment 40•8 years ago
|
||
Rebased to beta for crashtests.list contextual differences, and landed on beta:
https://hg.mozilla.org/releases/mozilla-beta/rev/181d381af5dfff2cafcd816be6167a1897d963a0
Updated•8 years ago
|
Updated•8 years ago
|
Whiteboard: [adv-main50.1-]
Updated•8 years ago
|
Whiteboard: [adv-main50.1-] → [adv-main51-]
You need to log in
before you can comment on or make changes to this bug.
Description
•