Closed Bug 598907 Opened 9 years ago Closed 9 years ago

crash [@ _purecall | nsGenericHTMLFormElement::IsDisabled() ]

Categories

(Core :: DOM: Core & HTML, defect, critical)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla2.0b8
Tracking Status
blocking2.0 --- final+

People

(Reporter: scoobidiver, Assigned: mounir)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 6 obsolete files)

Build : Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100922
Firefox/4.0b7pre

This is a new crash signature that was introduced by b7pre/20100920 build.
The crash daily rate is 2-6 crashes/day.
It is #34 top crasher in b7pre/20100922 build.
One comment says : "shutting down"

Signature	_purecall | nsGenericHTMLFormElement::IsDisabled()
UUID	29d0fad7-fe59-4161-83cf-508262100922
Time 	2010-09-22 10:59:00.11939
Uptime	17464
Last Crash	172007 seconds (2.0 days) before submission
Install Age	17464 seconds (4.9 hours) since version was first installed.
Product	Firefox
Version	4.0b7pre
Build ID	20100922041040
Branch	2.0
OS	Windows NT
OS Version	6.1.7600
CPU	x86
CPU Info	GenuineIntel family 6 model 23 stepping 10
Crash Reason	EXCEPTION_NONCONTINUABLE_EXCEPTION
Crash Address	0x0
App Notes 	AdapterVendorID: 8086, AdapterDeviceID: 2a42

Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 	mozcrt19.dll 	_purecall 	obj-firefox/memory/jemalloc/crtsrc/purevirt.c:54
1 	xul.dll 	nsGenericHTMLFormElement::IsDisabled 	content/html/content/src/nsGenericHTMLElement.h:855
2 	xul.dll 	nsGenericHTMLFormElement::IsDisabled 	content/html/content/src/nsGenericHTMLElement.h:855
3 	xul.dll 	nsHTMLTextAreaElement::UpdateBarredFromConstraintValidation 	content/html/content/src/nsHTMLTextAreaElement.cpp:1223
4 	xul.dll 	nsHTMLTextAreaElement::UnbindFromTree 	content/html/content/src/nsHTMLTextAreaElement.cpp:1034
5 	xul.dll 	nsGenericElement::UnbindFromTree 	content/base/src/nsGenericElement.cpp:3046
6 	xul.dll 	nsStyledElement::UnbindFromTree 	content/base/src/nsStyledElement.cpp:244
7 	xul.dll 	nsGenericHTMLElement::UnbindFromTree 	content/html/content/src/nsGenericHTMLElement.cpp:969
8 	xul.dll 	nsGenericElement::UnbindFromTree 	content/base/src/nsGenericElement.cpp:3046
9 	xul.dll 	nsStyledElement::UnbindFromTree 	content/base/src/nsStyledElement.cpp:244
10 	xul.dll 	nsGenericHTMLElement::UnbindFromTree 	content/html/content/src/nsGenericHTMLElement.cpp:969
11 	xul.dll 	nsGenericHTMLFormElement::UnbindFromTree 	content/html/content/src/nsGenericHTMLElement.cpp:2534
12 	xul.dll 	nsGenericElement::cycleCollection::Unlink 	content/base/src/nsGenericElement.cpp:4338
13 	xul.dll 	nsEventListenerManager::cycleCollection::Unlink 	content/events/src/nsEventListenerManager.cpp:348
14 	xul.dll 	xul.dll@0xd9c2b7 	
15 	xul.dll 	nsCycleCollector::FinishCollection 	xpcom/base/nsCycleCollector.cpp:2695
16 	xul.dll 	nsCycleCollector::Collect 	xpcom/base/nsCycleCollector.cpp:2490
17 		@0x6850cf 	
18 	xul.dll 	PL_DHashTableOperate 	obj-firefox/xpcom/build/pldhash.c:625
19 		@0x30e0e193 	

The regression range is :
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7640eb022be6&tochange=b1d56e69f4d9

More reports at :
http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A4.0b7pre&query_search=signature&query_type=exact&query=_purecall%20|%20nsGenericHTMLFormElement%3A%3AIsDisabled%28%29&date=09%2F23%2F2010%2004%3A59%3A51&range_value=4&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=_purecall%20|%20nsGenericHTMLFormElement%3A%3AIsDisabled%28%29
Looks like mFieldSet is a dead pointer?
blocking2.0: --- → ?
When the element is unbind from tree, mFieldSet is updated depending of the new parent chain. If an entire sub-tree is removed, mFieldSet will still have a value and UnbindFromTree might not be called again.
In that case, the CC should remove the subtree at some point but can it happen to have the fieldset removed and not the elements inside?
Component: XPCOM → DOM: Core & HTML
Keywords: testcase-wanted
OS: Windows 7 → All
QA Contact: xpcom → general
Hardware: x86 → All
> If an entire sub-tree is removed, mFieldSet will still have a
> value and UnbindFromTree might not be called again.

How so?  You mean if the subtree includes the fieldset?  And it won't be called again because in the destructor we only do shallow unbinds?

> can it happen to have the fieldset removed and not the elements inside?

I think that can happen, since we don't hold strong refs to parent nodes...  smaug had a possible patch for that, though.
Yeah, but no idea whether anything for that will land for ff4.

And note, during teardown child nodes will release parent pointers...
Thats fine.  I bet what happens here is that during CC #1 the fieldset goes away but NOT its kids (which get a shallow unbind but don't unbind their descendants).  Then in a later CC the descendants are CCed and we get the stack above, since the fieldset is long-dead.
Assignee: nobody → mounir.lamouri
blocking2.0: ? → final+
In CC-unlink we intentionally do a shallow UnbindFromTree to save cycles, similar to how we do the same thing in the element dtor. An exception is documents whose unlink function which does a deep UnbindFromTree in order to clear the is-in-doc bit of all nodes.
We don't do shallow unbind in CC. We do shallow unbind in dtor.
Ugh, you're right, we should change that
Yeah, the point is that the shallow-unbind optimization is broken in cases when the kid is caching weak pointers to ancestors...  Perhaps mFieldSet should just be a strong pointer?  The other option is for ~nsHTMLFieldSetElement to clear the mFieldSet, just like ~nsHTMLFormElement drops mForm.
> The other option is for ~nsHTMLFieldSetElement to clear
> the mFieldSet, just like ~nsHTMLFormElement drops mForm.

That is an excellent idea!

Note that during CC, none of the elements should be deleted until they are all unlinked, so no pointers should be dangling. Of course, if elements accesses pointers in the dtor you are out of luck.
If we _do_ want to do that, then we need to do it not only in ~nsHTMLFieldSetElement but also in the fieldset's Unlink, because it drops references to its form controls in Unlink (and if its Unlink is being called, it's about to go away anyway).
Though as long as unlink uses deep UnbindFromTree, form controls should be clearing their mFieldSet themselves, no?
Oh, indeed.  It's sort of unfortunate that we do a deep unbind there, but does mean that the dtor is the only place we need to worry about.
See bug 561277 for crash about UnbindFromTree
We can't easily inform all fieldset descendants when the fieldset is destroyed because the fieldsets might not now its descendants (IOW, mElements might be null).
So, to have this working we should eager initialize this but that might create some troubles because no events are sent on parsing and mElements would be wrong.

One solution might be to create mElements in DomeAddingChildren or DoneCreateElement maybe? Do someone see another solution?
(In reply to comment #4)
> Yeah, but no idea whether anything for that will land for ff4.

What's is the bug number?
Blocks: 600773
Attached patch Patch #2 v1 (obsolete) — Splinter Review
Attachment #479685 - Flags: review?(Olli.Pettay)
Comment on attachment 479685 [details] [diff] [review]
Patch #2 v1

Henri, can you review the parser/content sink changes? ... 2 lines :)
Olli, could you review the rest?
Attachment #479685 - Flags: review?(hsivonen)
Status: NEW → ASSIGNED
By the way, this patch is purely theoretical. I wasn't able to reproduce the same situation so I don't know if that's really fixing it...
Comment on attachment 479685 [details] [diff] [review]
Patch #2 v1

http://mxr.mozilla.org/mozilla-central/source/content/xml/document/src/nsXMLContentSink.cpp#573 also needs to be updated. r=me on the parser stuff if you update the XML side, too.
Attachment #479685 - Flags: review?(hsivonen) → review+
Comment on attachment 479685 [details] [diff] [review]
Patch #2 v1

> nsHTMLFieldSetElement::~nsHTMLFieldSetElement()
> {
>+  NS_ASSERTION(mElements, "mElements shouldn't be null!");
>+
>+  // TODO: should be removed with bug 600773.
>+  if (mElements) {
>+    PRUint32 length = mElements->Length(PR_TRUE);
Hmm, is this really safe? Is it ok to flush parser during CC or while?
I'd pass PR_FALSE, especially because I don't understand why you need to flush.
(Although, the element shouldn't be in the document, so no flushing should actually happen)
Attachment #479685 - Flags: review?(Olli.Pettay) → review+
> because the fieldsets might not now its descendants (IOW, mElements might be
> null).

Why not just have the descendants register and unregister with their mFieldset when it changes?  Again, that's how we do forms.
(In reply to comment #22)
> > because the fieldsets might not now its descendants (IOW, mElements might be
> > null).
> 
> Why not just have the descendants register and unregister with their mFieldset
> when it changes?  Again, that's how we do forms.

I was going to say that this would cost more because it would require to look at the entire parent chain but we already do that :-/.
The only con would be that this will require us to create a nsCOMArray to keep track of form control elements that are actually depending of the fieldset. BTW, if we do that, I guess we could remove nsIMutationObserver from nsHTMLFieldSetElement and have something similar to nsHTMLFormElement for .elements.
Well, we'd likely have to walk the parent chain again, so I'd still stay it costs us.

I would prefer not to do that given that fieldsets are very rare. So IMHO it's better to spend a few more cycles when they are used, and instead optimize the case when they aren't.
(In reply to comment #24)
> Well, we'd likely have to walk the parent chain again, so I'd still stay it
> costs us.

Why again? We are currently walking the parent chain looking for a fieldset (which make us walk the entire parent chain in most case). We should be able to use this walk to set everything we need?
I have this crash regular.

If it helps, most of the time it is after I pressed CTRL+R, sometimes also when it is closing firefox to apply the update. But not sure it is really causing it.
Attached patch Patch v3 (obsolete) — Splinter Review
This patch should be better.
Attachment #479685 - Attachment is obsolete: true
Attachment #484703 - Flags: review?(Olli.Pettay)
Attached patch Patch v3 (obsolete) — Splinter Review
Attachment #484703 - Attachment is obsolete: true
Attachment #484709 - Flags: review?(Olli.Pettay)
Attachment #484703 - Flags: review?(Olli.Pettay)
Blocks: 606439
Why a COMArray? It seems like using an nsTArray<nsRefPtr<nsGenericFormElement>> would prevent a lot of casting.
Attached patch Patch v3.1 (obsolete) — Splinter Review
Exact, nsTArray<nsGenericHTMLFormElement*> should even be enough.
Attachment #484709 - Attachment is obsolete: true
Attachment #485276 - Flags: review?(Olli.Pettay)
Attachment #484709 - Flags: review?(Olli.Pettay)
Or maybe a nsTPtrArray<nsGenericHTMLFormElement> ?
Attached patch Patch v3.2 (obsolete) — Splinter Review
I didn't know nsTPtrArray. Thanks,
Attachment #485276 - Attachment is obsolete: true
Attachment #485297 - Flags: review?(Olli.Pettay)
Attachment #485276 - Flags: review?(Olli.Pettay)
Comment on attachment 485297 [details] [diff] [review]
Patch v3.2

>+  // List of elements which have this fieldset as first fieldset ancestor.
>+  nsTPtrArray<nsGenericHTMLFormElement> mDependantElements;
>+

Nit:  I think you mean "dependent".  I wouldn't submit a new patch for that just yet, though.
Comment on attachment 485297 [details] [diff] [review]
Patch v3.2

I don't understand how the unbind case works.
Sure, mFieldSet won't point to deleted object, but
how does mFieldSet get cleared?

Re-ask reviewed if I missed something.
Attachment #485297 - Flags: review?(Olli.Pettay) → review-
Attached patch Patch v3.3 (obsolete) — Splinter Review
This should be better.
Inter-diff can be found at: http://mounir.pastebin.mozilla.org/829050 (+ s/mDependantElements/mDependentElemest/g)
Attachment #485297 - Attachment is obsolete: true
Attachment #486579 - Flags: review?(Olli.Pettay)
Attached patch Patch v3.4Splinter Review
Calling RemoveElement() on dtor.
Attachment #486579 - Attachment is obsolete: true
Attachment #486586 - Flags: review?(Olli.Pettay)
Attachment #486579 - Flags: review?(Olli.Pettay)
Attachment #486586 - Flags: review?(Olli.Pettay) → review+
Whiteboard: [needs landing]
Pushed:
http://hg.mozilla.org/mozilla-central/rev/132b620273ac
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: [needs landing]
Target Milestone: --- → mozilla2.0b8
Depends on: 612094
Duplicate of this bug: 612094
No longer depends on: 612094
I see only 4 crashes that _might_ have happen with this fix:
http://crash-stats.mozilla.com/report/index/f2eed5f4-e96a-40b9-a2aa-85fdf2101107
http://crash-stats.mozilla.com/report/index/90ba1251-3cbd-46d4-8327-245542101109
http://crash-stats.mozilla.com/report/index/036c1c8c-97a3-4077-aeba-b240f2101109
http://crash-stats.mozilla.com/report/index/07754a64-09c9-4930-92be-29b7e2101109
(the three last ones are obviously the same machine)

All other crashes after the fix happened with an outdated version.

It looks like, the crashes happened with these changesets: 56664 and 56683.
And the fix came with 56935

Based on that, I'm going to mark this RESOLVED FIXED again. Feel free to reopen if that feels wrong.
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Crash Signature: [@ _purecall | nsGenericHTMLFormElement::IsDisabled() ]
You need to log in before you can comment on or make changes to this bug.