Closed
Bug 669903
Opened 13 years ago
Closed 13 years ago
double unlink crashes [@ nsGenericHTMLElement::ClearDataset()] nsDOMStringMap's Unlink
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
VERIFIED
FIXED
mozilla10
People
(Reporter: martijn.martijn, Assigned: wchen)
References
Details
(Keywords: crash, verified-beta, Whiteboard: [qa!][testday-20120203])
Crash Data
Attachments
(1 file, 3 obsolete files)
1.58 KB,
patch
|
sicking
:
review+
johnath
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-66910cea-e075-42a7-919b-98a242110705 .
=============================================================
0 xul.dll nsGenericHTMLElement::ClearDataset content/html/content/src/nsGenericHTMLElement.cpp:394
1 xul.dll nsDOMStringMap::cycleCollection::Unlink content/html/content/src/nsDOMStringMap.cpp:54
2 xul.dll nsCycleCollector::CollectWhite xpcom/base/nsCycleCollector.cpp:1962
3 xul.dll nsCycleCollector::FinishCollection xpcom/base/nsCycleCollector.cpp:2784
4 xul.dll nsCycleCollectorRunner::Collect xpcom/base/nsCycleCollector.cpp:3437
5 xul.dll nsCycleCollector_collect xpcom/base/nsCycleCollector.cpp:3503
6 xul.dll nsJSContext::CycleCollectNow dom/base/nsJSEnvironment.cpp:3271
7 xul.dll CCTimerFired dom/base/nsJSEnvironment.cpp:3311
8 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:424
9 xul.dll nsTimerEvent::Run xpcom/threads/nsTimerImpl.cpp:520
10 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:617
11 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:134
12 xul.dll xul.dll@0xb79167
13 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:202
14 xul.dll xul.dll@0x37125f
15 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:176
16 xul.dll nsEvent::nsEvent obj-firefox/dist/include/nsGUIEvent.h:569
17 xul.dll nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:189
18 xul.dll nsAppShell::Run widget/src/windows/nsAppShell.cpp:256
19 @0x7530ffff
While testing, I'm seeing this crash occuring. Sorry, atm I don't have a testcase for this crash.
Updated•13 years ago
|
Product: Firefox → Core
QA Contact: general → general
Comment 1•13 years ago
|
||
Is this a null pointer crash? null slots (maybe even null element) + offset to
mDataset.
Is it possible that something is keeping nsDOMStringMap alive after
first unlink. Then it gets unlinked again, and mElement is already null.
Yeah, looking at the code I couldn't make sense of this.
As long as the dataset is alive, the element should be alive too. *Especially* in the places where ClearDataset is called since we only ever clear the strong mElement reference after calling ClearDataset.
And as long as the dataset is alive, the element should be pointing to it, as that pointer is set up as soon as the dataset is created, and only cleared once the dataset goes away.
And the slots are only deleted when the element is deleted, which again should only happen when the dataset is going away.
Or, at least that was my initial analysis. I just realized that there is one situation when this isn't true. Specifically, if we end up cycle collecting a dataset more than once we'll end up dereferencing a null mElement pointer in the unlink code. I suppose that can happen.
So what we should do is to add a if (mElement) check to the unlink code and see if that helps.
William, can you try that?
Assignee: nobody → wchen
Reporter | ||
Comment 3•13 years ago
|
||
Sorry, I haven't been able to get a minimized or unminimized testcase yet.
I can tell you that in order for this crash to trigger is when I do something like this setTimeout(function(){document.write(document.documentElement.innerHTML)}, 1000);. And the testcase probably keeps a whole lot of dataset references somewhere.
Assignee | ||
Comment 4•13 years ago
|
||
I added a check for the element prior to clearing the dataset. No test case because I was not able to reproduce the crash. I tried a few things but I was not able to get the unlink code to run more than once.
Attachment #548231 -
Flags: review?(jonas)
Comment on attachment 548231 [details] [diff] [review]
Possible fix for crash
Add an assertion as well, as there's definitely a bug somewhere if we unlink twice.
Attachment #548231 -
Flags: review?(jonas) → review+
Assignee | ||
Comment 6•13 years ago
|
||
Added assertion and comment.
Attachment #548231 -
Attachment is obsolete: true
Attachment #548246 -
Flags: review?(jonas)
Assignee | ||
Comment 7•13 years ago
|
||
Forgot to add r=sicking
Attachment #548246 -
Attachment is obsolete: true
Attachment #548250 -
Flags: review?(jonas)
Attachment #548246 -
Flags: review?(jonas)
Attachment #548250 -
Flags: review?(jonas) → review+
Comment 8•13 years ago
|
||
Is this ready to land?
Comment 9•13 years ago
|
||
Jesse came up with a test case that looks very similar to this in bug 706281, that only triggers a crash after I landed the weak map cycle collector integration in bug 668855.
Comment 11•13 years ago
|
||
In the first CC in Jesse's test case, the CC identifies the nsDOMStringMap as garbage, so it is unlinked. There's only one reference to the DOMStringMap, an XPCWrappedNative, which is also identified as garbage and has a single reference. But both of them show up again in the next CC. I'm not sure why that is. So you end up with a double unlink. The attached patch seems like the way to go. Double unlinks are bad, but they do happen. I know of a few ways they can be caused.
Comment 12•13 years ago
|
||
How bad are double unlinks? Should the assertion be removed from the patch, or should a bug be filed on fixing the assertion?
Comment 13•13 years ago
|
||
They are kind of bad, but probably not anything worse than a controlled crash. Unlinks usually just null out references, so I think the worst thing that will happen is a null deref. But I don't think anybody has really audited unlinks.
Updated•13 years ago
|
Summary: crash [@ nsGenericHTMLElement::ClearDataset()] → double unlink crashes nsDOMStringMap's Unlink
Comment 14•13 years ago
|
||
I've always tried to write unlink code so that it can handle more than one unlinking -
I think most of our unlink handling is like that.
Reporter | ||
Updated•13 years ago
|
Summary: double unlink crashes nsDOMStringMap's Unlink → double unlink crashes [@ nsGenericHTMLElement::ClearDataset()] nsDOMStringMap's Unlink
Comment 15•13 years ago
|
||
Maybe I'm reading the code wrong, but it looks like the wrappee of an XPCWrappedNative isn't unlinked by the XPCWN's unlink, so it will stay alive until the next GC. Which means that double unlinks are probably fairly common. I don't know. One XPCWN I looked at disappeared from one log to another, but a second one didn't.
Comment 16•13 years ago
|
||
I just removed the assertion from William's last patch. Double unlinks will happen whenever a wrapped native is marked as garbage, and the cycle collector runs twice in a row without a GC in between, so I think it shouldn't be an assertion.
Attachment #548250 -
Attachment is obsolete: true
Attachment #579901 -
Flags: review?(jonas)
Attachment #579901 -
Flags: review?(jonas) → review+
Comment 17•13 years ago
|
||
Target Milestone: --- → mozilla11
Comment 18•13 years ago
|
||
Comment on attachment 579901 [details] [diff] [review]
check for double unlink
Low risk fix (add a null check), but also an extremely rare crash, so I am flagging for Aurora but not Beta. I could only find 9 crashes with this signature in the last week across all versions.
Attachment #579901 -
Flags: approval-mozilla-aurora?
Updated•13 years ago
|
Crash Signature: [@ nsGenericHTMLElement::ClearDataset()] → [@ nsGenericHTMLElement::ClearDataset]
Comment 19•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•13 years ago
|
Attachment #579901 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 20•13 years ago
|
||
status-firefox10:
--- → fixed
status-firefox11:
--- → fixed
status-firefox9:
--- → affected
Component: General → DOM
OS: Windows NT → All
QA Contact: general → general
Hardware: x86 → All
Target Milestone: mozilla11 → mozilla10
Comment 22•13 years ago
|
||
The steps in bug 706281 comment 0 triggered this crash.
Comment 23•13 years ago
|
||
Verified as fixed with the steps referenced in comment 22 on:
Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0
Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:10.0) Gecko/20100101 Firefox/10.0
I have also verified the crash stats from the Socorro interface and this crash didn't reproduce on Firefox 10 anymore.
Comment 24•13 years ago
|
||
Verified as fixed with the steps referenced in comment 22 on:
Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0
Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:11.0) Gecko/20100101 Firefox/11.0
Status: RESOLVED → VERIFIED
Whiteboard: [qa+][qa!:10] → [qa!][testday-20120203]
Comment 25•13 years ago
|
||
The crash status from the Socorro interface were also verified and this crash didn't happen on Firefox 10 or newer anymore.
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•