Closed Bug 409473 Opened 14 years ago Closed 14 years ago

New crash caused by bug 405248

Categories

(Core :: Disability Access APIs, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla1.9beta3

People

(Reporter: aaronlev, Assigned: aaronlev)

References

Details

(Keywords: access, crash, regression)

Attachments

(4 files, 1 obsolete file)

There are several different stacks in the crash reports, all lead to

0  BuildTextRunsScanner::ScanFrame(nsIFrame*)
1  BuildTextRuns
2  nsTextFrame::EnsureTextRun(gfxContext*, nsIFrame*, nsLineList_iterator const*, unsigned int*)

It seems like a layout bug other than a regression of bug 405248, though the crash happens more frequently when a11y enabled.
Here is the code where crash happen.

In nsTextFrameThebes.cpp:BuildTextRuns()

 967     nsIFrame* child = line->mFirstChild;
 968     PRInt32 i;
 969     for (i = line->GetChildCount() - 1; i >= 0; --i) {
 970       scanner.ScanFrame(child);
 971       child = child->GetNextSibling();
 972     }

|child| is null/illegal pointer.
It seems that the children are invalid, but I don't know how it could be. 

Robert, could you give some suggestion here?
What's on the stack of GetRenderedText? Can you dump the frame tree at the time, is it valid?
Robert, sorry I can only get info from the crash report, I can't reproduce the crash.
Evan, I see that most of the crashes happen when a node is being deleted in a plain text editor.

I think our notifcation into a11y from nsGenericElement::doRemoveChildAt() is calling when the frame tree may be in a halfway state of some kind.
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/content/base/src/nsGenericElement.cpp&rev=3.617&mark=2753#2753

19  	nsGenericElement::doRemoveChildAt(unsigned int, int, nsIContent*, nsIContent*, nsIDocument*, nsAttrAndChildArray&)  	 mozilla/content/base/src/nsGenericElement.cpp:2753
20 	nsGenericElement::RemoveChildAt(unsigned int, int) 	mozilla/content/base/src/nsGenericElement.cpp:2726
21 	nsGenericElement::doRemoveChild(nsIDOMNode*, nsIContent*, nsIDocument*, nsIDOMNode**) 	mozilla/content/base/src/nsGenericElement.cpp:3372
22 	nsGenericElement::RemoveChild(nsIDOMNode*, nsIDOMNode**) 	mozilla/content/base/src/nsGenericElement.cpp:2942
23 	DeleteElementTxn::DoTransaction() 	mozilla/editor/libeditor/base/DeleteElementTxn.cpp:124
24 	nsTransactionItem::DoTransaction()
I'm also seeing this pattern even before that:
25  	nsTextControlFrame::InitEditor()  	 mozilla/layout/forms/nsTextControlFrame.cpp:1670
26 	nsCSSFrameConstructor::CreateAnonymousFrames(nsFrameConstructorState&, nsIContent*, nsIDocument*, nsIFrame*, int, nsFrameItems&) 	mozilla/layout/base/nsCSSFrameConstructor.cpp:5765
Here are some frames I picked up from the stack:

3  	nsTextFrame::GetRenderedText(nsAString_internal*, gfxSkipChars*, gfxSkipCharsIterator*, unsigned int, unsigned int)
12  	nsAccessibilityService::GetAccessible(nsIDOMNode*, nsIPresShell*, nsIWeakReference*, nsIFrame**, int*, nsIAccessible**)
15  	nsHyperTextAccessible::DOMPointToHypertextOffset(nsIDOMNode*, int, int*, nsIAccessible**, int)
16 	nsDocAccessible::CreateTextChangeEventForNode(nsIAccessible*, nsIDOMNode*, nsIAccessible*, int, int)
17  	nsDocAccessible::InvalidateCacheSubtree(nsIContent*, unsigned int)
20  	nsGenericElement::RemoveChildAt(unsigned int, int)
28  	nsEditor::DeleteNode(nsIDOMNode*)
34  	nsTextControlFrame::InitEditor()
58  	nsCSSFrameConstructor::ConstructFrame(nsFrameConstructorState&, nsIContent*, nsIFrame*, nsFrameItems&)
66  	PresShell::WillPaint()

It seems that we are accessing some frame when it is under construction.

I don't know whether layout has a way to detect whether a frame is still under construction, but I guess for accessible part we shouldn't access such a frame.
Attached patch a guessSplinter Review
remove if (aChangeEventType == nsIAccessibleEvent::EVENT_DOM_CREATE), because it could be EVENT_DOM_DESTROY when document is still loading.
+'ing with a P3 priority.  Please adjust the priority as required.
Flags: blocking1.9? → blocking1.9+
Priority: -- → P3
Evan, what about input controls that are inserted dynamically, after page load. Won't they have the same problem?
(In reply to comment #10)
> Evan, what about input controls that are inserted dynamically, after page load.
> Won't they have the same problem?
> 
Yes, they probably will. Though I'm still not able to reproduce the crash with such a testcase.

Would a check of NS_SUBTREE_DIRTY(aFrame) || (aFrame->GetStateBits() & NS_FRAME_IN_REFLOW) help prevent the problem?

I feel it makes less sense to discuss the solution without a way to reproduce or even a url. Should we leave this until someone can reproduce it?
I'm seeing crashes almost constantly after submitting comments to any Bugzilla bug. When the "bug xxx processed" page comes up, Minefield crashes. I'm currently running a debug build while writing this comment, so hope to get a crash after I submit this, to get a stack in VS 2005.
Note that I cannot send in Crash Reports because I'm on Vista, and on Vista, submitting crash reports is currently broken (known issue). If anyone can use the dumps I have through another means, let me know, and I'll be happy to zip up the "pending" folder of crash reports.
I do not seem to get the crash when running a debug build, which suggests to me it's a timing issue. I did get one crash, and within the signature was nsAccessible::GetFinalState. the parameter aExtraState was NULL, but there's no check for that in this method. Aaron, is this a separate issue?
OK, here are some crash reports that I finally got submitted after Ted told me how to trick Vista:
03448d82-beb4-11dc-ba86-001a4bd46e84
3a84a3eb-beb4-11dc-9c46-001a4bd46e84
5636c191-beb4-11dc-b25d-001a4bd46e84
5e1325b8-beb4-11dc-b66b-001a4bd46e84
6a4029ba-beb4-11dc-b7e6-001a4bd46e84
8cdf494f-beb4-11dc-b164-001a4bd43ed6
f0f81ccc-beb3-11dc-b9ae-001a4bd43ed6

These all point to the same problem area.
(In reply to comment #14)
> I did get one crash, and within the signature was
> nsAccessible::GetFinalState. the parameter aExtraState was NULL, but there's no
> check for that in this method. Aaron, is this a separate issue?

Forget that, it's guarded properly. Sorry for the spam.
This bug seems occurs when <input value="default value"> in inserted. During reflow the InitEditor() is called which removes the bogus <br> node for the input and replaces it with the default value. The text editor creates a <br> child node when there is no other content. Layout is in the middle of reflow when it inits the editor, causing the bogus br to be removed, so we end up doing some processing while layout is in an intermediate state.

The first thing I can think of is that we shouldn't even be creating accessible objects for the bogus <br> node.


This is a one-off solution for this problem, but I don't know of any other scenarios where we would have this problem.
Attachment #296376 - Flags: superreview?(roc)
Attachment #296376 - Flags: review?(roc)
The reason that bug 405248 showed this bug is that we're now asking layout about the rendered text before we decide whether to create an accessible text object. Layout isn't ready to be asked that when the editor is being initialized in frame construction.
+  if (parent && parent->IsNativeAnonymous()) {
+    // This <br> is the only node in a text control, therefore it is the hacky
+    // "bogus node" used when there is no text in the control

How do we know it's the only node in the control?
Good point.
Assignee: Evan.Yan → aaronleventhal
Attachment #296376 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #296376 - Flags: superreview?(roc)
Attachment #296376 - Flags: review?(roc)
Attachment #296469 - Flags: superreview?(roc)
Attachment #296469 - Flags: review?(roc)
Attachment #296469 - Attachment is patch: true
Attachment #296469 - Attachment mime type: application/octet-stream → text/plain
Bumping priority and setting milestone on this one. It's a regression/discovery post-beta 2.
Priority: P3 → P1
Target Milestone: --- → mozilla1.9 M11
Attachment #296469 - Flags: superreview?(roc)
Attachment #296469 - Flags: superreview+
Attachment #296469 - Flags: review?(roc)
Attachment #296469 - Flags: review+
Attachment #296469 - Flags: approval1.9?
Comment on attachment 296469 [details] [diff] [review]
Address Roc's comment -- make sure it is an only child

No approval needed, as this is a blocker.
Attachment #296469 - Flags: approval1.9?
Version: unspecified → Trunk
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
I'm still seeing crashes in the same area with the latest build 2008011504:
bp-1618f3ba-c384-11dc-a0c5-001a4bd46e84
bp-3ca9022e-c3a0-11dc-8d13-001a4bd43ed6
bp-4bed3dcf-c3a0-11dc-8cde-001a4bd46e84
bp-58309ccf-c41e-11dc-ae35-001a4bd43ef6
The one thing these have in common is that I had at least two tabs open. One instance where I could repro it twice in succession was this:
1. While some page I don#t recall was opened in the first tab, in the second tab I had bug 407359 open.
2. Pressed CTRL+L to go to Location bar, and typed 412288, selected bug 412288.
3. Pressed ENTER.
Result: Crash upon load of bug 412288 page.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Marco, it's a slightly different crash stack. So it appears that we did fix one very common version of this crash but this one is another.
Evan/Ginn/Alexander, do you have time to look at the new crash stacks?
The <br> is still being removed in the new crashes, but not via InitEditor()
SInce the developers cannot duplicate this bug on our machines I do not know what the perfectly ideal solution is. However, it makes sense to have an early return for the <br> bogus node case anyway.
Attachment #297658 - Flags: review?(Evan.Yan)
Attachment #297658 - Flags: review?(surkov.alexander)
Attachment #297658 - Flags: review?(Evan.Yan) → review+
Attachment #297658 - Flags: review?(surkov.alexander) → approval1.9?
Comment on attachment 297658 [details] [diff] [review]
If bogus <br> this is an empty text accessible, which means the hypertext must be 0. Return early and avoid bogus node teardown crash

This bug is on the blocking list - you don't need patch approvals to land
Attachment #297658 - Flags: approval1.9?
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
There is still yet another version of this crash:
http://crash-stats.mozilla.com/report/index/3e2ab69e-c913-11dc-92b4-001a4bd46e84

I see the problem. We no longer need to create an accessible so early for dom create events. For SHOWs we don't use the childAccessible, we use the node.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to comment #31)
> Created an attachment (id=298717) [details]
> Don't create new accessible for show/create -- frame tree is not ready yet and
> we don't need it at this point
> 

we use it to fire text changed/reorder events, should we fire them after show event then?
We use it to fire text change events only if text is going away (hiding). If a node is being shown then we do it in FlushPendingEvents(). So I don't see that we need to force creation of accessible this early in the show/create case.
Comment on attachment 298717 [details] [diff] [review]
Don't create new accessible for show/create -- frame tree is not ready yet and we don't need it at this point

you're right, r=me
Attachment #298717 - Flags: review?(surkov.alexander) → review+
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Comment on attachment 295206 [details] [diff] [review]
a guess

This will help as well.
Attachment #295206 - Flags: review+
Verified using Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9b3) Gecko/2008020514 Firefox/3.0b3
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.