Closed
Bug 468727
Opened 16 years ago
Closed 16 years ago
Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()]
Categories
(Core :: Disability Access APIs, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla1.9.2a1
People
(Reporter: MarcoZ, Assigned: ginnchen+exoracle)
References
Details
(4 keywords)
Crash Data
Attachments
(1 file, 4 obsolete files)
2.43 KB,
patch
|
surkov
:
review+
davidb
:
review+
|
Details | Diff | Splinter Review |
This is the top crasher on 3.1 right now, with 45 within the last 10 days.
http://crash-stats.mozilla.com/report/list?product=Firefox&branch=1.9.1&version=Firefox%3A3.1b2&version=Firefox%3A3.1b2pre&version=Firefox%3A3.1b3pre&platform=windows&query_search=signature&query_type=contains&query=Access&date=&range_value=9&range_unit=days&do_query=1&signature=nsHTMLTextFieldAccessible%3A%3AGetStateInternal%28unsigned%20int*%2C%20unsigned%20int*%29
The stack points to a simple usage of "content" where it is not being checked whether "content" is actually not NULL.
Reporter | ||
Comment 1•16 years ago
|
||
Assignee: nobody → marco.zehe
Status: NEW → ASSIGNED
Attachment #352213 -
Flags: review?(surkov.alexander)
Comment 2•16 years ago
|
||
mDOMNode of nsHTMLTextFieldAccessible should always implement nsIContent interface. It means there is one case when you can't query nsIContent, this case is when accessible is defunct (i.e. mDOMNode is null) but this is checked in nsAccessible::GetStateInternal which returns NS_OK_DEFUNCT_OBJECT in this case and this nsresult is checked inside nsHTMLTextFieldAccessible::GetStateInternal (by NS_ENSURE_A11Y_SUCCESS macros). So I can't see the case when you can't query nsIContent. Any thoughts? Are you able to reproduce this bug?
Reporter | ||
Comment 3•16 years ago
|
||
I ran into this bug yesterday once, I was simply going back to a previous page, and suddenly it crashed onme with this same signature. But no, I cannot reproduce the bug at will.
Comment 4•16 years ago
|
||
I have half a theory. I think it relates to this code:
http://mxr.mozilla.org/mozilla-central/source/accessible/src/xul/nsXULFormControlAccessible.cpp#819
There we create a temporary nsHTMLTextFieldAccessible, and guess what, we call GetStateInternal() on that.
Not sure which part is crashing, though.
Comment 5•16 years ago
|
||
That's the only place in the code where we create a new accessible on the stack, and don't init it. The accessible is not in the cache.
We did this to save code because the XUL textfield's state is dependent on the state of the HTML text field inside of it. Since we already had code to get the state of an HTML textfield we wanted an easy way to grab that.
Updated•16 years ago
|
Flags: blocking1.9.1?
Comment 6•16 years ago
|
||
See my new comments in 4 & 5. I believe it's a major hint. Anyone have any idea what's going wrong? Ginn, this is the kind of bug you are good at.
If we don't figure it out, I suggest we make the code more standard. Just pull all the logic to get the state from an <input> node into a static method that takes a DOM node, like:
static nsHTMLTextFieldAccessible::GetStateForTextInput(nsIDOMNode *aTextInput)
{
}
Then have both impls of GetStateInternal() call that, and remove the temporary accessible. I bet that will fix the bug.
Comment 7•16 years ago
|
||
In addition to that, we will need to borrow some code from nsAccessible::GetStateInternal() for nsXULTextFieldAccessible::GetStateInternal()
else if (content->IsNodeOfType(nsINode::eELEMENT)) {
nsIFrame *frame = GetFrame();
if (frame && frame->IsFocusable()) {
*aState |= nsIAccessibleStates::STATE_FOCUSABLE;
}
if (gLastFocusedNode == mDOMNode) {
*aState |= nsIAccessibleStates::STATE_FOCUSED;
}
}
Reporter | ||
Comment 8•16 years ago
|
||
I'll take a stab at it tomorrow.
I think the null check makes sense.
It seems nsHyperTextAccessibleWrap::GetStateInternal() doesn't return an error if content is null.
Comment 10•16 years ago
|
||
Ginn, it doesn't return an error but it retunrs early with NS_OK_DEFUNCT_OBJECT. And, I think the solution of no longer creating the accessible on the stack is a good idea anyway. It can't hurt.
Here the details of how it returns early:
nsHTMLTextFieldAccessible::GetStateInternal() has this:
nsresult rv = nsHyperTextAccessibleWrap::GetStateInternal(aState,
aExtraState);
NS_ENSURE_A11Y_SUCCESS(rv, rv);
There is no nsHyperTextAccessibleWrap::GetStateInternal(),
so this calls up to nsHyperTextAccessible::GetStateInternal(), which has:
nsresult rv = nsAccessibleWrap::GetStateInternal(aState, aExtraState);
NS_ENSURE_A11Y_SUCCESS(rv, rv);
Which actually calls up to nsAccessible::GetStateInternal(aState, aExtraState), which has:
if (IsDefunct()) {
if (aExtraState)
*aExtraState = nsIAccessibleStates::EXT_STATE_DEFUNCT;
return NS_OK_DEFUNCT_OBJECT;
}
The definition of NS_ENSURE_A11Y_SUCCESS ensures that NS_OK_DEFUNCT_OBJECT is returned early.
http://mxr.mozilla.org/mozilla-central/source/accessible/src/base/nsAccessNode.h#80
So, NS_OK_DEFUNCT object is returned up the chain, an nsHTMLTextFieldAccessible::GetStateInternal() would return early.
Assignee | ||
Comment 11•16 years ago
|
||
OK, I see.
But how did msaa get the temp accessible object?
Comment 12•16 years ago
|
||
Ginn, we create it in nsXULTextFieldAccessible::GetStateInternal()
Maybe I don't understand your question.
Assignee | ||
Comment 13•16 years ago
|
||
My question is
the crash stack is called from msaa nsAccessibleWrap::get_accState,
so the call is from AT client.
how did the AT client get the pointer to temp accessible object?
Comment 14•16 years ago
|
||
I didn't see a good stack in awhile. Now that crash-stats is coming up I see that we may need to change the theory:
http://crash-stats.mozilla.com/report/index/468ea81c-3b2c-450f-bb40-80cd12090116
0 xul.dll nsHTMLTextFieldAccessible::GetStateInternal accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1 xul.dll nsAccessNode::Release accessible/src/base/nsAccessNode.cpp:126
2 xul.dll nsAccessible::GetState accessible/src/base/nsAccessible.cpp:2276
3 xul.dll nsAccessibleWrap::get_accState accessible/src/msaa/nsAccessibleWrap.cpp:554
But, I don't think it's a temp accessible issue now that I see that.
Comment 15•16 years ago
|
||
MArco, I'd love to know if the crashes started after we implemented cycle collecting.
That stack is just weird, why would Release() call GetStateInternal? I don't get it.
Maybe some others have a different stack.
Bottom line: the problem needs more research.
Assignee | ||
Comment 16•16 years ago
|
||
Most of the stacks in reports are like this.
0 xul.dll nsHTMLTextFieldAccessible::GetStateInternal accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1 xul.dll nsAccessible::GetState accessible/src/base/nsAccessible.cpp:2276
2 xul.dll nsAccessibleWrap::get_accState accessible/src/msaa/nsAccessibleWrap.cpp:554
But some reports are strange.
e.g.
0 xul.dll nsHTMLTextFieldAccessible::GetStateInternal accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1 ntdll.dll RtlFreeHeap
0 xul.dll nsHTMLTextFieldAccessible::GetStateInternal accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1 oleaut32.dll LateInitRpcDll
0 xul.dll nsHTMLTextFieldAccessible::GetStateInternal accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1 nssutil3.dll nssutil3.dll@0x10048
Comment 17•16 years ago
|
||
Wait, it's getting interesting. I wish crash-stats was working! But, the query above works -- perhaps it is cached.
All of these crashes are on Windows.
I just looked at the modules in each crash:
* every crash contains 3rd party apps that inject themselves into the process. This is on Windows only. That means we have no protection against bad pointers they might pass in. Normally the marshalling process will give us its own pointers before copying the data to another process, which means we only get good pointers.
* every crash except 1 contains this dll
APSHook.dll -- From Cognizance software. They make security software with things like single sign-on and web access management
* I also saw these here and there:
- GrooveSystemServices, GrooveShellExtensions, GrooveUtil -- part of Microsoft Groove, seems to be another file sharing/synching system
- DropboxExt -- from getdropbox.com, an online file sharing/synching program
- RocketDock -- a fancy mac like doc for Windows
- Yodm3d -- desktop 3D enhancer
- BtMmHook or BTKeyInd -- part of Blue Tooth, injects itself. Not clear what they do.
I'm amazed at how many applications inject themselves into Firefox's process!
If crash-stats was working we could look for more cases where it wasn't APSHook, or get a rough regression range. It could be that our new cycle collection is affecting this. It might always be related to file inputs or XUL inputs. I'm not even sure if the bad pointer is the accState pointer or the IAccessible.
Comment 18•16 years ago
|
||
The fact that it is just with HTML text field is suspicious.
Here is a hypothesis:
APSHook holds onto the IAccessible* of a given HTML textfield. It then uses it again and again, but the text field goes away at some point. They are still using the old pointer.
Either we have a refcounting error or they do. Perhaps they don't ref count it at all, assuming it will always stay the same.
Comment 19•16 years ago
|
||
Cognizance Corporation who made this APSHook.dll, was bought out by Bioscrypt in 2005.
I *think* this is related to single sign on solutions -- automatically filling in user name and password fields, via fingerprint identification, for example. I've heard of such apps wanting to use Firefox's MSAA support before.
Comment 20•16 years ago
|
||
A single sign on package is one that reads your fingerprint and fills in password fields. That means it needs to scan for textfields and see if they are STATE_PROTECTED, and thus passwords. So, it needs to call get_accState on textfields.
Comment 21•16 years ago
|
||
A web search for Firefox APSHook reveals people having all sorts of problems with sluggishness, lock ups, stack overflows, extra memory usage etc. Web searches for Verisoft Firefox reveal that many users are frustrated that Firefox 3 doesn't work with it. I called support and they told me that Firefox isn't supported, but that they would forward my email to the developers, so I've done that. I'm not holding my breath though.
Maybe the HP Pavilion folks who bundle Verisoft and a fingerprint reader will need to find out about this.
I guess one solution is to disable our accessibility support when this DLL is detected.
Updated•16 years ago
|
Summary: Crash [@ nsHTMLTextFieldAccessible::GetStateInternal(unsigned int*, unsigned int*) ] → Crash [@ nsHTMLTextFieldAccessible::GetStateInternal(unsigned int*, unsigned int*) -- probably Verisoft fingerprint reader that comes with HP laptops
Updated•16 years ago
|
Summary: Crash [@ nsHTMLTextFieldAccessible::GetStateInternal(unsigned int*, unsigned int*) -- probably Verisoft fingerprint reader that comes with HP laptops → Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()], probably Verisoft fingerprint reader that comes with HP laptops
Comment 22•16 years ago
|
||
Update: a developer from Bioscrypt, Alex Krasnov, has replied to me and will try and help resolve the issue.
He writes "The fact that the problem did not happen with Firefox 2 could suggest that the issue is caused by a combination of behavior of our product and new behaviors in Firefox 3."
We need crash-stats queries to work, in order to see what changed in Firefox that makes APSHook cause us to crash now.
Updated•16 years ago
|
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
Comment 23•16 years ago
|
||
[1] puts the first crash of 3.1b3pre at build 20081127. nothing found search ing 3.1b2pre [2]
[1] 20071122-20081129 http://crash-stats.mozilla.com/report/list?product=Firefox&branch=1.9.1&version=Firefox%3A3.1b3pre&platform=windows&query_search=signature&query_type=contains&query=nsHTMLTextFieldAccessible%3A%3AGetStateInternal&date=2008-11-29&range_value=7&range_unit=days&do_query=1&signature=nsHTMLTextFieldAccessible%3A%3AGetStateInternal%28unsigned%20int*%2C%20unsigned%20int*%29
[2] http://crash-stats.mozilla.com/?do_query=1&product=Firefox&branch=1.9.1&version=Firefox%3A3.1b2pre&platform=windows&query_search=signature&query_type=contains&query=nsHTMLTextFieldAccessible%3A%3AGetStateInternal&date=2008-11-29&range_value=6&range_unit=days
Comment 24•16 years ago
|
||
We absolutely need to find the right fix here but what about, in the meantime, pushing Marco's null check? Would we just bury the problem deeper?
Also regarding comment 16, I'm confused too. What is up with those strange frame stacks?
Assignee | ||
Comment 25•16 years ago
|
||
The DOM node is shutdown during thensHTMLTextFieldAccessible::GetStateInternal() call.
[3] nsCOMPtr<nsIDOMNode>::operator=(this = 0xef2fa47c, rhs = (nil)), line 640 in "nsCOMPtr.h"
[4] nsAccessNode::Shutdown(this = 0xef2fa470), line 225 in "nsAccessNode.cpp"
[5] nsAccessible::Shutdown(this = 0xef2fa470), line 486 in "nsAccessible.cpp"
[6] nsAccessibleWrap::Shutdown(this = 0xef2fa470), line 331 in "nsAccessibleWrap.cpp"
[7] nsAccessNode::GetPresShell(this = 0xef2fa470), line 357 in "nsAccessNode.cpp"
[8] nsAccessible::IsVisible(this = 0xef2fa470, aIsOffscreen = 0x8045624), line 847 in "nsAccessible.cpp"
[9] nsAccessible::GetStateInternal(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 999 in "nsAccessible.cpp"
[10] nsHyperTextAccessible::GetStateInternal(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 178 in "nsHyperTextAccessible.cpp"
[11] nsHTMLTextFieldAccessible::GetStateInternal(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 450 in "nsHTMLFormControlAccessible.cpp"
[12] nsAccessible::GetState(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 2276 in "nsAccessible.cpp"
[13] refStateSetCB(aAtkObj = 0xef25e970), line 1036 in "nsAccessibleWrap.cpp"
That's why nsHyperTextAccessibleWrap::GetStateInternal() passed, but mDOMNode and content is null after that.
This bug is reliably reproducible with orca testcase doc_tabs.py.
Assignee | ||
Comment 26•16 years ago
|
||
Aaron, Surkov, David,
Would it be a performance issue if we do GetPresShell().get() in IsDefunct() ?
Other concerns?
Comment 27•16 years ago
|
||
(In reply to comment #26)
> Aaron, Surkov, David,
>
> Would it be a performance issue if we do GetPresShell().get() in IsDefunct() ?
> Other concerns?
Aaron doesn't watch bugzilla anymore, you should email him directly to get his attention.
It might be because every method calls IsDefunct(). But on another hand it might be ok if it's right approach in general.
Comment 28•16 years ago
|
||
technically if we shutdown an accessible during external call then we must stop to execute this method and return error.
Assignee | ||
Comment 29•16 years ago
|
||
Right, so I think we should do it in IsDefunct().
It is better than return error in IsVisible().
If we do it in IsVisible(), we need to reset aState, aExtraState for error, it doesn't sound good.
And do it in IsDefunt() is safer.
Since we are going to fix it in 1.9.1, we should take the safe approach.
Assignee | ||
Comment 30•16 years ago
|
||
Attachment #366781 -
Flags: review?(surkov.alexander)
Comment 31•16 years ago
|
||
Ginn, I still thinking of two things
1) do we need to shutdown accessilbe on GetPresShell (this method and logic was introduced in bug 193802)
2) understand why we didn't invalidate accessible tree when there is no presshell.
But these might be result of following up bug and in the meantime we could take IsDefunct approach.
Comment 32•16 years ago
|
||
Comment on attachment 366781 [details] [diff] [review]
patch
>- virtual PRBool IsDefunct() { return !mDOMNode; }
>+ virtual PRBool IsDefunct() { return !mDOMNode || !GetPresShell().get(); }
! operator isn't defined for already_AddRefed?
I would appreciate a comment here, possibly it makes sense to move the method into .cpp file.
Ginn, could you file a following up bug to investigate what's wrong going on here, why we don't invalidate accessible tree?
r=me
Attachment #366781 -
Flags: review?(surkov.alexander) → review+
Updated•16 years ago
|
Attachment #352213 -
Flags: review?(surkov.alexander)
Comment 33•16 years ago
|
||
Comment on attachment 352213 [details] [diff] [review]
null check.
cancelling reivew, though this should fix the bug :)
Updated•16 years ago
|
Assignee: marco.zehe → ginn.chen
Assignee | ||
Comment 34•16 years ago
|
||
The problem is we didn't shutdown doc accessible when a tab is closed.
I filed Bug 482932 to track it.
Summary: Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()], probably Verisoft fingerprint reader that comes with HP laptops → Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()]
Assignee | ||
Comment 35•16 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated•16 years ago
|
Target Milestone: --- → mozilla1.9.2a1
Assignee | ||
Comment 36•16 years ago
|
||
back out due to leaks
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 37•16 years ago
|
||
fix the leak
Attachment #352213 -
Attachment is obsolete: true
Attachment #366781 -
Attachment is obsolete: true
Attachment #367199 -
Flags: review?(surkov.alexander)
Updated•16 years ago
|
Status: REOPENED → ASSIGNED
Assignee | ||
Comment 38•16 years ago
|
||
duh!
Attachment #367199 -
Attachment is obsolete: true
Attachment #367203 -
Flags: review?(surkov.alexander)
Attachment #367199 -
Flags: review?(surkov.alexander)
Updated•16 years ago
|
Attachment #367203 -
Flags: review?(surkov.alexander) → review+
Comment 39•16 years ago
|
||
Comment on attachment 367203 [details] [diff] [review]
patch v2
Sorry I noticed this leak after my previous r+ and I didn't say you about it. I thought you work on another patch (invalidate accessibles when tab is destroyed) and I thought you're not going to push this patch.
If you want to keep this patch as temporary solution then it's ok with me and r=me. But I still hope to see correct fix for this problem.
Assignee | ||
Comment 40•16 years ago
|
||
I think we should also apply the change to
nsXULTreeitemAccessible::IsDefunct()
I'll continue work on Bug 482932.
But I'm not sure if the fix is suitable for 1.9.1 branch.
OS: Windows Vista → All
Hardware: x86 → All
Comment 41•16 years ago
|
||
(In reply to comment #40)
> I think we should also apply the change to
> nsXULTreeitemAccessible::IsDefunct()
Probably, but xultreeitem accessible is very specific and doesn't use many things from base accessible class.
> But I'm not sure if the fix is suitable for 1.9.1 branch.
Technically it should be suitable. But I'm not sure this approach is right. In general I think GetPresShell shouldn't shutdown the accessible. It's something wrong. I would say, it's probably good to morph your patch and query nsIPressShell from mWeakShell and assert if it can't be queried. That should be enough with your fix.
Assignee | ||
Comment 42•16 years ago
|
||
Attachment #367203 -
Attachment is obsolete: true
Attachment #367564 -
Flags: review?(surkov.alexander)
Attachment #367564 -
Flags: review?(david.bolter)
Comment 43•16 years ago
|
||
Comment on attachment 367564 [details] [diff] [review]
patch v3
>+PRBool nsAccessNode::IsDefunct()
>+{
>+ if (!mDOMNode)
>+ return PR_TRUE;
>+
>+ // Call GetPresShell() since the accessible may be shut down in it.
>+ nsCOMPtr<nsIPresShell> presShell(GetPresShell());
>+ return !presShell;
So in this case we think we have an mDOMNode but we can't get a pres shell. Can we prevent this? If so please open a bug for that work (or fix here).
Either way I think this patch makes us more stable :) so if Alexander is happy, r=me
Attachment #367564 -
Flags: review?(david.bolter) → review+
Comment 44•16 years ago
|
||
Ginn, could you address second part of comment #42? I'm not sure we really need
your patch because of bug 482932.
(In reply to comment #43)
> So in this case we think we have an mDOMNode but we can't get a pres shell. Can
> we prevent this? If so please open a bug for that work (or fix here).
David, it's bug 482932. Technically I believe GetPresShell shouldn't shutdown the accessible. I think we should assert rather.
Assignee | ||
Comment 45•16 years ago
|
||
Surkov,
I agree, ideally mDOMNode should be gone when PresShell is gone.
But currently even with the patch of bug 482932, it's still possible that GetPresShell() failed for CallQueryReferent. If we don't shutdown it, it will hold the memory until we empty our caches on quit.
I agree to add an assertion, and track it in bugzilla.
But we need to keep the shutdown in GetPresShell() until we have addressed the issue completely.
On the other hand, if the pres shell is gone, the accessible obj is defunct, the semantics is correct.
Even if GetPresShell() doesn't shutdown node, it would be a good protection. (the comment should be changed though)
I think we should keep this patch, or add an assertion later.
Also since bug 482932 has a long history, I think we don't need to address it in 1.9.1 branch.
My plan is
1) Land patch of bug 468727 on trunk and 1.9.1
2) Land patch of bug 482932 on trunk and add assertion in GetPresShell().
3) If we have fixed the assertion, we can remove Shutdown() from GetPresShell().
Comment 46•16 years ago
|
||
Comment on attachment 367564 [details] [diff] [review]
patch v3
Ginn, you're plan sounds reasonable. So r=me with nits.
>+
>+PRBool nsAccessNode::IsDefunct()
nit: PRBool on own line
nit: try to place IsDefunct definition between GetCurrentFocus and Init to be similar to IsDefunct declaration.
Attachment #367564 -
Flags: review?(surkov.alexander) → review+
Assignee | ||
Comment 47•16 years ago
|
||
Pushed to trunk:
http://hg.mozilla.org/mozilla-central/rev/987405c2542a
I can't put it between GetCurrentFocus and Init, because GetCurrentFocus comes after Init. :)
I put it before GetPresShell(), it would be easier to later changes.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 48•16 years ago
|
||
Pushed to 1.9.1
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/8a4847a6194c
Keywords: fixed1.9.1
Updated•13 years ago
|
Crash Signature: [@ nsHTMLTextFieldAccessible::GetStateInternal()]
You need to log in
before you can comment on or make changes to this bug.
Description
•