Closed
Bug 641388
Opened 14 years ago
Closed 14 years ago
Crash in nsRefreshDriver::Notify with assertion in CSSFrameConstructor
Categories
(Core :: General, defect)
Core
General
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | Macaw+ |
status2.0 | --- | .1-fixed |
status1.9.2 | --- | unaffected |
status1.9.1 | --- | unaffected |
People
(Reporter: nils, Assigned: dholbert)
References
Details
(4 keywords, Whiteboard: [sg:critical?])
Attachments
(3 files, 1 obsolete file)
4.54 KB,
text/html
|
Details | |
3.52 KB,
text/html
|
Details | |
4.58 KB,
patch
|
bzbarsky
:
review+
dveditz
:
approval2.0+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0 Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0 Description: The attached testcase has been reduced from a very large testcase (~3 Megs). The resulting testcase is still quite large (~80 lines of JavaScript), however further reduction made were not possible or made it very hard to reproduce. The reason for the crash is not obvious from the testcase. Following assertion can be observed on a Linux debug build: ASSERTION: Dying in the middle of our own update?: mUpdateCount == 0, file ../../../layout/base/nsCSSFrameConstructor.h, line 89 Affected Versions: Firefox 4.0 RC1 Testcase: The testcase is attached as an HTML file. It will crash the browser on opening after several reloads. Testcase Notes: The testcase implements two function to improve the reproducability of the testcase. gc() triggers garbage collection. It requires Jesse's quitter extension (https://www.squarefree.com/extensions/quitter.xpi). fill() works similar to a heap spray and fills the memory with a byte pattern. Stack Backtrace: Following crash has been observed on Windows. This indicates that code exection is possible. Similar behaviour has been seen on linux as well. Windows: (e5c.1360): Access violation - code c0000005 (first chance) First chance exceptions are reported before any exception handling. This exception may be expected and handled. xul!nsRefreshDriver::Notify+0x5b9: 625fa199 8b08 mov ecx,dword ptr [eax] ds:002b:fafafafb=???????? 625fa19b 57 push edi 625fa19c ffd1 call ecx xul!nsRefreshDriver::Notify+0x5b9 xul!nsTimerImpl::Fire+0x103 xul!nsTimerEvent::Run+0x20 xul!nsThread::ProcessNextEvent+0x2d7 xul!mozilla::ipc::MessagePump::Run+0x6e xul!MessageLoop::RunInternal+0x11 xul!MessageLoop::RunHandler+0x1d nspr4!PR_GetThreadPrivate+0x20 xul!nsBaseAppShell::Run+0x34 xul!nsAppShell::Run+0x42 xul!nsAppStartup::Run+0x1e xul!XRE_main+0xdec firefox!wmain+0x34c firefox!__tmainCRTStartup+0x152 kernel32!BaseThreadInitThunk+0xe ntdll32!__RtlUserThreadStart+0x70 ntdll32!_RtlUserThreadStart+0x1b VulnDev reference : vd11002 reported by nils of vulndev ltd. Reproducible: Always
This testcase looks quite different from the first testcase and it does not trigger the same assertion in a debug build. However it triggers the same crash (stack backtrace) on windows. It still might be a separate issue, hard to tell until we know more about the reasons for the crashes.
Comment 3•14 years ago
|
||
With the second testcase, I get an abort when a refresh driver is destroyed before its observers have been unregistered. The observer that's still around is an nsSMILTimeContainer. The testcase does use some <svg:animate> stuff.
Comment 4•14 years ago
|
||
To be clear, that abort is a debug abort due to an NS_ABORT_IF_FALSE. But it indicates that something is not unregistering correctly here. For the first testcase, I don't see why mUpdateCount is 1 there. The stack shows us removing an element which has an iframe as a descendant from the DOM. The frame constructor with mUpdateCount == 1 is the frame constructor for that iframe's document. And then eventually I do crash when reloading the page, because one of the refresh observers in the refresh driver is a dead pointer....
Comment 5•14 years ago
|
||
Note that the mUpdateCount being nonzero is pretty weird, since it's maintaned by stack objects.
Assignee | ||
Comment 6•14 years ago
|
||
(In reply to comment #3) > With the second testcase, I get an abort when a refresh driver is destroyed > before its observers have been unregistered. [...] > that abort is a debug abort due to an NS_ABORT_IF_FALSE. With both attached testcases, my debug build crashes about 10 seconds after loading. It's a real crash -- not an abort -- in both cases. (I did also see one instance of the abort that bz mentioned, though, when I loaded the second testcase and then reloaded a few times.) The crashes are from trying to create a new nsRefPtr, which calls the virtual function "AddRef", on a nsARefreshObserver* whose vtable pointer is garbage: > #5 0x00007f097cdcdb9d in nsRefPtr<nsARefreshObserver>::nsRefPtr (this=0x7fff6692d1b0, aRawPtr=0x7f09613867e0) at nsAutoPtr.h:993 > (gdb) p *mRawPtr > $2 = { > _vptr.nsARefreshObserver = 0x5a5a5a5a5a5a5a5a > } This is scary - flagging as [sg:critical?]
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hardware: x86 → All
Whiteboard: [sg:critical?]
Version: unspecified → Trunk
Assignee | ||
Comment 7•14 years ago
|
||
I don't have a 1.9.2 build handy, but I think it's reasonable to assume this is a regression, since the crash seems to depend on SMIL which we didn't have in 1.9.2
Summary: Crash in nsRefreshDriver::Notify with assertition in CSSFrameConstructor → Crash in nsRefreshDriver::Notify with assertion in CSSFrameConstructor
Comment 8•14 years ago
|
||
Daniel, your crash sounds about the same as what I saw at the end of comment 4. I tried using gdb to dump out the stacks to all the nsCSSFrameConstructor::BeginUpdate calls, hoping that this would catch the errant call for me, but the bug doesn't show up if I do that. Presumably because it makes things run too slow and the SMIL finishes "too early" or something?
Assignee | ||
Comment 11•14 years ago
|
||
So at least in the second case, what's happening is this:
1) Our nsSMILAnimationController gets a call to 'NotifyRefreshDriverCreated', and it registers as a refresh-observer (via StartSampling())
2) Our nsSMILAnimationController gets a new time-container child, which gets us to nsSMILAnimationController::AddChild. This takes the branch that sets
> mDeferredStartSampling = PR_TRUE;
(The idea there is that we don't have any registered animations, so we're trying to hold off on actively sampling until we get some. However, unbeknownst to us, we've actually already called StartSampling, in (1) above...)
3) When the document gets torn down, our nsSMILAnimationController gets a call to "NotifyRefreshDriverDestroying", but that call does nothing, because 'mDeferredStartSampling' is set. So we never unregister as a refresh observer.
This means we the nsRefreshDriver stays alive longer than our animation controller does, and so it later ends up Notify()ing us after our animation controller is already destroyed.
Assignee | ||
Comment 12•14 years ago
|
||
My first instinct is that we want to copy the > if (mAnimationElementTable.Count()) check in nsSMILAnimationController::AddChild() over to NotifyRefreshDriverCreated(). That should keep us from registering as a Refresh Observer in part (1) of comment 11. (We don't want to be a refresh observer at that point, if this proposed check fails, because we don't have any registered animations yet.)
Assignee | ||
Comment 13•14 years ago
|
||
This patch makes the "StartSampling" call in nsSMILAnimationController::NotifyRefreshDriverCreated consistent with all our other calls to "StartSampling". The idea is to only *actually* start sampling if we have registered animation elements -- and if we don't, we defer starting sampling until such an element is added. Without the patch, my debug build (w/ jesse's dom-fuzz-lite) crashes within 10 seconds after loading either attached testcase. With the patch, I can't get either testcase to crash. (I do still see the assertion from comment 0 in the first testcase, though -- I think that might be unrelated.) I'm running this through mochitests + reftests, and then will request review.
Assignee | ||
Comment 14•14 years ago
|
||
Comment on attachment 522464 [details] [diff] [review] fix v1 Passed SVG + SMIL reftests & mochitests. Requesting review.
Attachment #522464 -
Flags: review?(bzbarsky)
Comment 15•14 years ago
|
||
Comment on attachment 522464 [details] [diff] [review] fix v1 Is there a reason not to push this logic down into StartSampling, since now every callsite has it?
Assignee | ||
Comment 16•14 years ago
|
||
Good call. This patch shifts that logic into a wrapper function, actually, which performs the check & then the actual StartSampling call. We need to keep one 'raw' StartSampling call, though -- that's why I didn't push the logic directly into StartSampling as you'd suggested. We need to directly call StartSampling (and clear mDeferredStartSampling) in the one place where we say "ok, actually start sampling now" -- in nsSMILAnimationController::RegisterAnimationElement. (That's the place where we stop deferring, if we were deferring.) So now, with this patch, RegisterAnimationElement and MaybeStartSampling are the only methods that call StartSampling. Everywhere else uses MaybeStartSampling.
Attachment #522464 -
Attachment is obsolete: true
Attachment #522464 -
Flags: review?(bzbarsky)
Attachment #522768 -
Flags: review?(bzbarsky)
Comment 17•14 years ago
|
||
> nsSMILAnimationController::RegisterAnimationElement
So say RegisterAnimationElement looked like this:
mDeferredStartSampling = PR_FALSE;
if (mChildContainerTable.Count()) {
// mAnimationElementTable was empty, but now we've added its 1st element
MaybeStartSampling(GetRefreshDriverForDoc(mDocument));
}
We'd land in MaybeStartSampling. mDeferredStartSampling is false, so we don't take the early return. All of the above code is after a mAnimationElementTable.PutEntry call, so I'd think mAnimationElementTable.Count() is nonzero, and MaybeStartSampling calls StartSampling. I agree that it may be conceptually cleaner to have the separate MaybeStartSampling, and that we'd have an extra pair of branches in the above code, but the behavior would be the same. Up to you whether you want to keep the MaybeStartSampling function.
Comment 18•14 years ago
|
||
Comment on attachment 522768 [details] [diff] [review] fix v2: create wrapper MaybeStartSampling r=me with the above caveat.
Attachment #522768 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 19•14 years ago
|
||
(In reply to comment #17) > I agree that it may be conceptually cleaner to have the > separate MaybeStartSampling, and that we'd have an extra pair of branches in > the above code, but the behavior would be the same. Mm, true -- so we could effectively merge MaybeStartSampling into StartSampling and just call that function everywhere, and still get the same behavior as in the attached patch. Still, I think I like the conceptual separation (& I'd prefer to avoid adding unnecessary checks to RegisterAnimationElement), so I think I'd opt for keeping MaybeStartSampling & StartSampling separate, the way they are in the attached patch. Thanks for the review!
Comment 20•14 years ago
|
||
if this fixes 647044 as expected we should try and get it out before US tax season if possible.
blocking2.0: --- → ?
Assignee | ||
Comment 21•14 years ago
|
||
Running it through TryServer right now.
Assignee | ||
Comment 22•14 years ago
|
||
TryServer liked it. Pushed to m-c: http://hg.mozilla.org/mozilla-central/rev/b71e50bf9afc
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Assignee | ||
Updated•14 years ago
|
Attachment #522768 -
Flags: approval2.0?
Comment 23•14 years ago
|
||
(In reply to comment #20) > if this fixes 647044 as expected we should try and get it out before US tax > season if possible. We think TurboTax is using SMIL? Really?
status1.9.1:
--- → unaffected
status1.9.2:
--- → unaffected
Assignee | ||
Comment 24•14 years ago
|
||
It's feasible that they might be using SMIL-animated SVG for a throbber between steps or something...
Assignee | ||
Comment 25•14 years ago
|
||
Probably not, though. I just clicked through a the first part of a "Try It Out" TurboTax session, and it seems to use a .gif throbber and a lot of Flash. I didn't see anything that was obviously SVG (nor did I see any mention of SVG in 'view source'). So I think bug 641388 isn't this, but it may end up being a *similar* bug to this one, in our plugin code rather than our SVG code.
Comment 26•14 years ago
|
||
Plugins don't use refresh driver for anything, to my knowledge.
Assignee | ||
Comment 27•14 years ago
|
||
Darn - you're right, they don't seem to, based on http://mxr.mozilla.org/mozilla-central/search?string=AddRefreshObserver
Updated•14 years ago
|
Comment 28•14 years ago
|
||
Should this fix both https://crash-stats.mozilla.com/report/list?signature=nsRefreshDriver%3A%3ANotify%28nsITimer%2A%29 (bug 647044) and https://crash-stats.mozilla.com/report/list?signature=nsTimerImpl%3A%3AFire%28%29 that seem to be rising in the last days?
Comment 29•14 years ago
|
||
Comment on attachment 522768 [details] [diff] [review] fix v2: create wrapper MaybeStartSampling Approved for the mozilla2.0 repository, a=dveditz for release-drivers
Attachment #522768 -
Flags: approval2.0? → approval2.0+
Assignee | ||
Comment 30•14 years ago
|
||
(In reply to comment #28) > Should this fix both [signatures omitted for brevity] > that seem to be rising in the last days? That's the question that was discussed in Comment 20 & Comment 23-Comment 27. To sum up: this patch only affects sites with SMIL animation (and subdocuments being created/destroyed on the fly). So if e.g. TurboTax (or other high-profile sites that hit these crash signatures) uses SVG with SMIL animation, then this would help -- but as dveditz implied in Comment 23 (and I semi-confirmed in comment 25), it's pretty unlikely that that's the case.
Assignee | ||
Comment 31•14 years ago
|
||
That is to say: I don't have a lot of confidence that it'll fix those signatures. Still a scary bug, though,, & still worth fixing on 2.0 branch. Pushed to mozilla-2.0: http://hg.mozilla.org/releases/mozilla-2.0/rev/c94d64a0d540
Updated•14 years ago
|
Group: core-security
Comment 32•12 years ago
|
||
Crash tests: https://hg.mozilla.org/integration/mozilla-inbound/rev/88e4547bdbcf
Flags: in-testsuite? → in-testsuite+
You need to log in
before you can comment on or make changes to this bug.
Description
•