If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()]

RESOLVED FIXED in mozilla1.9.2a1

Status

()

Core
Disability Access APIs
P2
critical
RESOLVED FIXED
9 years ago
6 years ago

People

(Reporter: MarcoZ, Assigned: Ginn Chen)

Tracking

(4 keywords)

1.9.1 Branch
mozilla1.9.2a1
access, crash, fixed1.9.1, topcrash
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9.1 +

Firefox Tracking Flags

(Not tracked)

Details

(crash signature)

Attachments

(1 attachment, 4 obsolete attachments)

(Reporter)

Description

9 years ago
This is the top crasher on 3.1 right now, with 45 within the last 10 days.
http://crash-stats.mozilla.com/report/list?product=Firefox&branch=1.9.1&version=Firefox%3A3.1b2&version=Firefox%3A3.1b2pre&version=Firefox%3A3.1b3pre&platform=windows&query_search=signature&query_type=contains&query=Access&date=&range_value=9&range_unit=days&do_query=1&signature=nsHTMLTextFieldAccessible%3A%3AGetStateInternal%28unsigned%20int*%2C%20unsigned%20int*%29

The stack points to a simple usage of "content" where it is not being checked whether "content" is actually not NULL.
(Reporter)

Comment 1

9 years ago
Created attachment 352213 [details] [diff] [review]
null check.
Assignee: nobody → marco.zehe
Status: NEW → ASSIGNED
Attachment #352213 - Flags: review?(surkov.alexander)

Comment 2

9 years ago
mDOMNode of nsHTMLTextFieldAccessible should always implement nsIContent interface. It means there is one case when you can't query nsIContent, this case is when accessible is defunct (i.e. mDOMNode is null) but this is checked in nsAccessible::GetStateInternal which returns NS_OK_DEFUNCT_OBJECT in this case and this nsresult is checked inside nsHTMLTextFieldAccessible::GetStateInternal (by NS_ENSURE_A11Y_SUCCESS macros). So I can't see the case when you can't query nsIContent. Any thoughts? Are you able to reproduce this bug?
(Reporter)

Comment 3

9 years ago
I ran into this bug yesterday once, I was simply going back to a previous page, and suddenly it crashed onme with this same signature. But no, I cannot reproduce the bug at will.

Comment 4

9 years ago
I have half a theory. I think it relates to this code:
http://mxr.mozilla.org/mozilla-central/source/accessible/src/xul/nsXULFormControlAccessible.cpp#819

There we create a temporary nsHTMLTextFieldAccessible, and guess what, we call GetStateInternal() on that.

Not sure which part is crashing, though.

Comment 5

9 years ago
That's the only place in the code where we create a new accessible on the stack, and don't init it. The accessible is not in the cache.

We did this to save code because the XUL textfield's state is dependent on the state of the HTML text field inside of it. Since we already had code to get the state of an HTML textfield we wanted an easy way to grab that.

Updated

9 years ago
Flags: blocking1.9.1?

Comment 6

9 years ago
See my new comments in 4 & 5. I believe it's a major hint. Anyone have any idea what's going wrong? Ginn, this is the kind of bug you are good at.

If we don't figure it out, I suggest we make the code more standard. Just pull all the logic to get the state from an <input> node into a static method that takes a DOM node, like:
static nsHTMLTextFieldAccessible::GetStateForTextInput(nsIDOMNode *aTextInput)
{

}

Then have both impls of GetStateInternal() call that, and remove the temporary accessible. I bet that will fix the bug.

Comment 7

9 years ago
In addition to that, we will need to borrow some code from nsAccessible::GetStateInternal() for nsXULTextFieldAccessible::GetStateInternal()

   else if (content->IsNodeOfType(nsINode::eELEMENT)) {
     nsIFrame *frame = GetFrame();
     if (frame && frame->IsFocusable()) {
       *aState |= nsIAccessibleStates::STATE_FOCUSABLE;
     }
 
     if (gLastFocusedNode == mDOMNode) {
       *aState |= nsIAccessibleStates::STATE_FOCUSED;
     }
   }
(Reporter)

Comment 8

9 years ago
I'll take a stab at it tomorrow.
(Assignee)

Comment 9

9 years ago
I think the null check makes sense.
It seems nsHyperTextAccessibleWrap::GetStateInternal() doesn't return an error if content is null.

Comment 10

9 years ago
Ginn, it doesn't return an error but it retunrs early with NS_OK_DEFUNCT_OBJECT. And, I think the solution of no longer creating the accessible on the stack is a good idea anyway. It can't hurt.

Here the details of how it returns early:

nsHTMLTextFieldAccessible::GetStateInternal() has this:
   nsresult rv = nsHyperTextAccessibleWrap::GetStateInternal(aState,
                                                             aExtraState);
   NS_ENSURE_A11Y_SUCCESS(rv, rv);

There is no nsHyperTextAccessibleWrap::GetStateInternal(),
so this calls up to nsHyperTextAccessible::GetStateInternal(), which has:
   nsresult rv = nsAccessibleWrap::GetStateInternal(aState, aExtraState);
   NS_ENSURE_A11Y_SUCCESS(rv, rv);

Which actually calls up to nsAccessible::GetStateInternal(aState, aExtraState), which has:
   if (IsDefunct()) {
     if (aExtraState)
       *aExtraState = nsIAccessibleStates::EXT_STATE_DEFUNCT;
 
    return NS_OK_DEFUNCT_OBJECT;
   }

The definition of NS_ENSURE_A11Y_SUCCESS ensures that NS_OK_DEFUNCT_OBJECT is returned early.
http://mxr.mozilla.org/mozilla-central/source/accessible/src/base/nsAccessNode.h#80

So, NS_OK_DEFUNCT object is returned up the chain, an  nsHTMLTextFieldAccessible::GetStateInternal() would return early.
(Assignee)

Comment 11

9 years ago
OK, I see.

But how did msaa get the temp accessible object?

Comment 12

9 years ago
Ginn, we create it in nsXULTextFieldAccessible::GetStateInternal()
Maybe I don't understand your question.
(Assignee)

Comment 13

9 years ago
My question is
the crash stack is called from msaa nsAccessibleWrap::get_accState,
so the call is from AT client.
how did the AT client get the pointer to temp accessible object?

Comment 14

9 years ago
I didn't see a good stack in awhile. Now that crash-stats is coming up I see that we may need to change the theory:
http://crash-stats.mozilla.com/report/index/468ea81c-3b2c-450f-bb40-80cd12090116

0  	xul.dll  	nsHTMLTextFieldAccessible::GetStateInternal  	accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1 	xul.dll 	nsAccessNode::Release 	accessible/src/base/nsAccessNode.cpp:126
2 	xul.dll 	nsAccessible::GetState 	accessible/src/base/nsAccessible.cpp:2276
3 	xul.dll 	nsAccessibleWrap::get_accState 	accessible/src/msaa/nsAccessibleWrap.cpp:554 

But, I don't think it's a temp accessible issue now that I see that.

Comment 15

9 years ago
MArco, I'd love to know if the crashes started after we implemented cycle collecting.  

That stack is just weird, why would Release() call GetStateInternal? I don't get it.

Maybe some others have a different stack.

Bottom line: the problem needs more research.
(Assignee)

Comment 16

9 years ago
Most of the stacks in reports are like this.
0	xul.dll	nsHTMLTextFieldAccessible::GetStateInternal	 accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1	xul.dll	nsAccessible::GetState	 accessible/src/base/nsAccessible.cpp:2276
2	xul.dll	nsAccessibleWrap::get_accState	 accessible/src/msaa/nsAccessibleWrap.cpp:554

But some reports are strange.
e.g.
0	xul.dll	nsHTMLTextFieldAccessible::GetStateInternal	 accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1	ntdll.dll	RtlFreeHeap

0	xul.dll	nsHTMLTextFieldAccessible::GetStateInternal	 accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1	oleaut32.dll	LateInitRpcDll

0	xul.dll	nsHTMLTextFieldAccessible::GetStateInternal	 accessible/src/html/nsHTMLFormControlAccessible.cpp:457
1	nssutil3.dll	nssutil3.dll@0x10048

Comment 17

9 years ago
Wait, it's getting interesting. I wish crash-stats was working! But, the query above works -- perhaps it is cached.

All of these crashes are on Windows.

I just looked at the modules in each crash:
* every crash contains 3rd party apps that inject themselves into the process. This is on Windows only. That means we have no protection against bad pointers they might pass in. Normally the marshalling process will give us its own pointers before copying the data to another process, which means we only get good pointers.
* every crash except 1 contains this dll
APSHook.dll -- From Cognizance software. They make security software with things like single sign-on and web access management
* I also saw these here and there:
- GrooveSystemServices, GrooveShellExtensions, GrooveUtil -- part of Microsoft Groove, seems to be another file sharing/synching system
- DropboxExt -- from getdropbox.com, an online file sharing/synching program
- RocketDock -- a fancy mac like doc for Windows
- Yodm3d -- desktop 3D enhancer
- BtMmHook or BTKeyInd -- part of Blue Tooth, injects itself. Not clear what they do.

I'm amazed at how many applications inject themselves into Firefox's process!

If crash-stats was working we could look for more cases where it wasn't APSHook, or get a rough regression range. It could be that our new cycle collection is affecting this. It might always be related to file inputs or XUL inputs. I'm not even sure if the bad pointer is the accState pointer or the IAccessible.

Comment 18

9 years ago
The fact that it is just with HTML text field is suspicious.

Here is a hypothesis:
APSHook holds onto the IAccessible* of a given HTML textfield. It then uses it again and again, but the text field goes away at some point. They are still using the old pointer.

Either we have a refcounting error or they do. Perhaps they don't ref count it at all, assuming it will always stay the same.

Comment 19

9 years ago
Cognizance Corporation who made this APSHook.dll, was bought out by Bioscrypt in 2005.

I *think* this is related to single sign on solutions -- automatically filling in user name and password fields, via fingerprint identification, for example. I've heard of such apps wanting to use Firefox's MSAA support before.

Comment 20

9 years ago
A single sign on package is one that reads your fingerprint and fills in password fields. That means it needs to scan for textfields and see if they are STATE_PROTECTED, and thus passwords. So, it needs to call get_accState on textfields.

Comment 21

9 years ago
A web search for Firefox APSHook reveals people having all sorts of problems with sluggishness, lock ups, stack overflows, extra memory usage etc. Web searches for Verisoft Firefox reveal that many users are frustrated that Firefox 3 doesn't work with it. I called support and they told me that Firefox isn't supported, but that they would forward my email to the developers, so I've done that. I'm not holding my breath though.

Maybe the HP Pavilion folks who bundle Verisoft and a fingerprint reader will need to find out about this.

I guess one solution is to disable our accessibility support when this DLL is detected.

Updated

9 years ago
Summary: Crash [@ nsHTMLTextFieldAccessible::GetStateInternal(unsigned int*, unsigned int*) ] → Crash [@ nsHTMLTextFieldAccessible::GetStateInternal(unsigned int*, unsigned int*) -- probably Verisoft fingerprint reader that comes with HP laptops

Updated

9 years ago
Summary: Crash [@ nsHTMLTextFieldAccessible::GetStateInternal(unsigned int*, unsigned int*) -- probably Verisoft fingerprint reader that comes with HP laptops → Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()], probably Verisoft fingerprint reader that comes with HP laptops

Updated

9 years ago
Depends on: 444749

Comment 22

9 years ago
Update: a developer from Bioscrypt, Alex Krasnov, has replied to me and will try and help resolve the issue.

He writes "The fact that the problem did not happen with Firefox 2 could suggest that the issue is caused by a combination of behavior of our product and new behaviors in Firefox 3."

We need crash-stats queries to work, in order to see what changed in Firefox that makes APSHook cause us to crash now.

Updated

9 years ago
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
[1] puts the first crash of 3.1b3pre at build 20081127. nothing found search ing 3.1b2pre [2]

[1] 20071122-20081129 http://crash-stats.mozilla.com/report/list?product=Firefox&branch=1.9.1&version=Firefox%3A3.1b3pre&platform=windows&query_search=signature&query_type=contains&query=nsHTMLTextFieldAccessible%3A%3AGetStateInternal&date=2008-11-29&range_value=7&range_unit=days&do_query=1&signature=nsHTMLTextFieldAccessible%3A%3AGetStateInternal%28unsigned%20int*%2C%20unsigned%20int*%29

[2] http://crash-stats.mozilla.com/?do_query=1&product=Firefox&branch=1.9.1&version=Firefox%3A3.1b2pre&platform=windows&query_search=signature&query_type=contains&query=nsHTMLTextFieldAccessible%3A%3AGetStateInternal&date=2008-11-29&range_value=6&range_unit=days
Keywords: topcrash
We absolutely need to find the right fix here but what about, in the meantime, pushing Marco's null check? Would we just bury the problem deeper?

Also regarding comment 16, I'm confused too. What is up with those strange frame stacks?
(Assignee)

Comment 25

9 years ago
The DOM node is shutdown during thensHTMLTextFieldAccessible::GetStateInternal() call.

  [3] nsCOMPtr<nsIDOMNode>::operator=(this = 0xef2fa47c, rhs = (nil)), line 640 in "nsCOMPtr.h"
  [4] nsAccessNode::Shutdown(this = 0xef2fa470), line 225 in "nsAccessNode.cpp"
  [5] nsAccessible::Shutdown(this = 0xef2fa470), line 486 in "nsAccessible.cpp"
  [6] nsAccessibleWrap::Shutdown(this = 0xef2fa470), line 331 in "nsAccessibleWrap.cpp"
  [7] nsAccessNode::GetPresShell(this = 0xef2fa470), line 357 in "nsAccessNode.cpp"
  [8] nsAccessible::IsVisible(this = 0xef2fa470, aIsOffscreen = 0x8045624), line 847 in "nsAccessible.cpp"
  [9] nsAccessible::GetStateInternal(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 999 in "nsAccessible.cpp"
  [10] nsHyperTextAccessible::GetStateInternal(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 178 in "nsHyperTextAccessible.cpp"
  [11] nsHTMLTextFieldAccessible::GetStateInternal(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 450 in "nsHTMLFormControlAccessible.cpp"
  [12] nsAccessible::GetState(this = 0xef2fa470, aState = 0x8045b6c, aExtraState = 0x8045b68), line 2276 in "nsAccessible.cpp"
  [13] refStateSetCB(aAtkObj = 0xef25e970), line 1036 in "nsAccessibleWrap.cpp"

That's why nsHyperTextAccessibleWrap::GetStateInternal() passed, but mDOMNode and content is null after that.

This bug is reliably reproducible with orca testcase doc_tabs.py.
(Assignee)

Comment 26

9 years ago
Aaron, Surkov, David,

Would it be a performance issue if we do GetPresShell().get() in IsDefunct() ?
Other concerns?

Comment 27

9 years ago
(In reply to comment #26)
> Aaron, Surkov, David,
> 
> Would it be a performance issue if we do GetPresShell().get() in IsDefunct() ?
> Other concerns?

Aaron doesn't watch bugzilla anymore, you should email him directly to get his attention.

It might be because every method calls IsDefunct(). But on another hand it might be ok if it's  right approach in general.

Comment 28

9 years ago
technically if we shutdown an accessible during external call then we must stop to execute this method and return error.
(Assignee)

Comment 29

9 years ago
Right, so I think we should do it in IsDefunct().

It is better than return error in IsVisible().
If we do it in IsVisible(), we need to reset aState, aExtraState for error, it doesn't sound good.

And do it in IsDefunt() is safer.
Since we are going to fix it in 1.9.1, we should take the safe approach.
(Assignee)

Comment 30

9 years ago
Created attachment 366781 [details] [diff] [review]
patch
Attachment #366781 - Flags: review?(surkov.alexander)

Comment 31

9 years ago
Ginn, I still thinking of two things
1) do we need to shutdown accessilbe on GetPresShell (this method and logic was introduced in bug 193802)
2) understand why we didn't invalidate accessible tree when there is no presshell.

But these might be result of following up bug and in the meantime we could take IsDefunct approach.

Comment 32

9 years ago
Comment on attachment 366781 [details] [diff] [review]
patch


>-    virtual PRBool IsDefunct() { return !mDOMNode; }
>+    virtual PRBool IsDefunct() { return !mDOMNode || !GetPresShell().get(); }

! operator isn't defined for already_AddRefed?

I would appreciate a comment here, possibly it makes sense to move the method into .cpp file.

Ginn, could you file a following up bug to investigate what's wrong going on here, why we don't invalidate accessible tree?

r=me
Attachment #366781 - Flags: review?(surkov.alexander) → review+

Updated

9 years ago
Attachment #352213 - Flags: review?(surkov.alexander)

Comment 33

9 years ago
Comment on attachment 352213 [details] [diff] [review]
null check.

cancelling reivew, though this should fix the bug :)

Updated

9 years ago
Assignee: marco.zehe → ginn.chen
(Assignee)

Comment 34

9 years ago
The problem is we didn't shutdown doc accessible when a tab is closed.
I filed Bug 482932 to track it.
Summary: Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()], probably Verisoft fingerprint reader that comes with HP laptops → Crash [@ nsHTMLTextFieldAccessible::GetStateInternal()]
(Assignee)

Comment 35

9 years ago
http://hg.mozilla.org/mozilla-central/rev/f570b2539bc0
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.2a1
(Assignee)

Comment 36

9 years ago
back out due to leaks
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 37

9 years ago
Created attachment 367199 [details] [diff] [review]
patch v2

fix the leak
Attachment #352213 - Attachment is obsolete: true
Attachment #366781 - Attachment is obsolete: true
Attachment #367199 - Flags: review?(surkov.alexander)
Status: REOPENED → ASSIGNED
(Assignee)

Comment 38

9 years ago
Created attachment 367203 [details] [diff] [review]
patch v2

duh!
Attachment #367199 - Attachment is obsolete: true
Attachment #367203 - Flags: review?(surkov.alexander)
Attachment #367199 - Flags: review?(surkov.alexander)

Updated

9 years ago
Attachment #367203 - Flags: review?(surkov.alexander) → review+

Comment 39

9 years ago
Comment on attachment 367203 [details] [diff] [review]
patch v2

Sorry I noticed this leak after my previous r+ and I didn't say you about it. I thought you work on another patch (invalidate accessibles when tab is destroyed) and I thought you're not going to push this patch.

If you want to keep this patch as temporary solution then it's ok with me and r=me. But I still hope to see correct fix for this problem.
(Assignee)

Comment 40

9 years ago
I think we should also apply the change to
nsXULTreeitemAccessible::IsDefunct()

I'll continue work on Bug 482932.
But I'm not sure if the fix is suitable for 1.9.1 branch.
OS: Windows Vista → All
Hardware: x86 → All

Comment 41

9 years ago
(In reply to comment #40)
> I think we should also apply the change to
> nsXULTreeitemAccessible::IsDefunct()

Probably, but xultreeitem accessible is very specific and doesn't use many things from base accessible class.

> But I'm not sure if the fix is suitable for 1.9.1 branch.

Technically it should be suitable. But I'm not sure this approach is right. In general I think GetPresShell shouldn't shutdown the accessible. It's something wrong. I would say, it's probably good to morph your patch and query nsIPressShell from mWeakShell and assert if it can't be queried. That should be enough with your fix.
(Assignee)

Comment 42

9 years ago
Created attachment 367564 [details] [diff] [review]
patch v3
Attachment #367203 - Attachment is obsolete: true
Attachment #367564 - Flags: review?(surkov.alexander)
(Assignee)

Updated

9 years ago
Attachment #367564 - Flags: review?(david.bolter)
Comment on attachment 367564 [details] [diff] [review]
patch v3

>+PRBool nsAccessNode::IsDefunct()
>+{
>+  if (!mDOMNode)
>+    return PR_TRUE;
>+
>+  // Call GetPresShell() since the accessible may be shut down in it.
>+  nsCOMPtr<nsIPresShell> presShell(GetPresShell());
>+  return !presShell;

So in this case we think we have an mDOMNode but we can't get a pres shell. Can we prevent this? If so please open a bug for that work (or fix here).

Either way I think this patch makes us more stable :) so if Alexander is happy, r=me
Attachment #367564 - Flags: review?(david.bolter) → review+

Comment 44

9 years ago
Ginn, could you address second part of comment #42? I'm not sure we really need
your patch because of bug 482932.

(In reply to comment #43)

> So in this case we think we have an mDOMNode but we can't get a pres shell. Can
> we prevent this? If so please open a bug for that work (or fix here).

David, it's bug 482932. Technically I believe GetPresShell shouldn't shutdown the accessible. I think we should assert rather.
(Assignee)

Comment 45

9 years ago
Surkov,
I agree, ideally mDOMNode should be gone when PresShell is gone.
But currently even with the patch of bug 482932, it's still possible that GetPresShell() failed for CallQueryReferent. If we don't shutdown it, it will hold the memory until we empty our caches on quit.
I agree to add an assertion, and track it in bugzilla.
But we need to keep the shutdown in GetPresShell() until we have addressed the issue completely.

On the other hand, if the pres shell is gone, the accessible obj is defunct, the semantics is correct.
Even if GetPresShell() doesn't shutdown node, it would be a good protection. (the comment should be changed though)
I think we should keep this patch, or add an assertion later.

Also since bug 482932 has a long history, I think we don't need to address it in 1.9.1 branch.

My plan is
1) Land patch of bug 468727 on trunk and 1.9.1
2) Land patch of bug 482932 on trunk and add assertion in GetPresShell().
3) If we have fixed the assertion, we can remove Shutdown() from GetPresShell().

Comment 46

9 years ago
Comment on attachment 367564 [details] [diff] [review]
patch v3

Ginn, you're plan sounds reasonable. So r=me with nits.

>+
>+PRBool nsAccessNode::IsDefunct()

nit: PRBool on own line

nit: try to place IsDefunct definition between GetCurrentFocus and Init to be similar to IsDefunct declaration.
Attachment #367564 - Flags: review?(surkov.alexander) → review+
(Assignee)

Comment 47

9 years ago
Pushed to trunk:
http://hg.mozilla.org/mozilla-central/rev/987405c2542a

I can't put it between GetCurrentFocus and Init, because GetCurrentFocus comes after Init. :)
I put it before GetPresShell(), it would be easier to later changes.
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago9 years ago
Resolution: --- → FIXED
(Assignee)

Comment 48

9 years ago
Pushed to 1.9.1
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/8a4847a6194c
Keywords: fixed1.9.1
Crash Signature: [@ nsHTMLTextFieldAccessible::GetStateInternal()]
You need to log in before you can comment on or make changes to this bug.