Closed Bug 405951 Opened 13 years ago Closed 13 years ago

Thunderbird: newsgroup/feed messages blank in virtual buffer, caused by bug 404380

Categories

(Core :: Disability Access APIs, defect, P4)

x86
Windows Vista
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: MarcoZ, Assigned: aaronlev)

References

Details

(Keywords: access, crash, regression)

Attachments

(6 files, 5 obsolete files)

To reproduce:
1. Start Thunderbird.
2. If you don't have any newsgroup accounts set up, set up the news.mozilla.org newsgroup acccount.
3. In any newsgroup, open any message.

Result: It takes very long to open the window, and once it is open, the JAWS virtual buffer is blank.
Pressing INSERT+Z to turn off V Cursor, and again to turn it back on, may sometimes manage to get a new instance of the message. But as soon as you close that and open another message, the same problem is back.

Further observation: In my normal inbox, I can open and close as many messages as I want, there is no problem. However, in feed folders, for example when subscribing to planet.mozilla.org, the same happens as in Newsgroups.

This problem started happening with the 2007-11-28 nightly build of Thunderbird Trunk. I suspect that one of the fixes for bug 405552 or bug 404380 is responsible, these have been checked in on November 27. My suspicion is that one of them exposes a different crash that is hidden by MSAA.
Because even if I open and close a few messages in Inbox first, then open a newsgroup message, there is a considerable delay that might indicate a crash that MSAA hides.

Except for ordinary e-mails, this makes Thunderbird virtually unusable right now.
Flags: blocking1.9?
Summary: Thunderbird: newsgroup or feed messages come up blank in virtual buffer, effective Sep 28, 2007 → Thunderbird: newsgroup or feed messages come up blank in virtual buffer, effective Nov 28, 2007
Here's a call stack after I break when opening such a message:
 	ntdll.dll!775e2ea8() 	
 	[Unten angegebene Rahmen sind möglicherweise nicht korrekt und/oder fehlen, keine Symbole geladen für ntdll.dll]	
 	ntdll.dll!77636394() 	
 	ole32.dll!77350068() 	
 	ole32.dll!773500e5() 	
 	ole32.dll!77350585() 	
 	ole32.dll!773baa82() 	
 	msvcrt.dll!75e8669b() 	
 	msvcrt.dll!75e86620() 	
 	ole32.dll!772f5081() 	
 	ntdll.dll!77601039() 	
 	ntdll.dll!7760100b() 	
 	ntdll.dll!775c29d7() 	
 	FsDomNodeFirefox.dll!0834b1a7() 	
>	msvcp80.dll!std::basic_string<unsigned short,std::char_traits<unsigned short>,std::allocator<unsigned short> >::basic_string<unsigned short,std::char_traits<unsigned short>,std::allocator<unsigned short> >(const unsigned short * _Ptr=0x00000001)  Zeile 663	C++
 	FsDomNodeFirefox.dll!0834841a() 	
 	msvcp80.dll!std::basic_string<unsigned short,std::char_traits<unsigned short>,std::allocator<unsigned short> >::_Tidy(bool _Built=false, unsigned int _Newsize=1239704)  Zeile 2087 + 0x12 Bytes	C++
 	ntdll.dll!77600e97() 	
 	thunderbird.exe!NS_MsgHashIfNecessary(nsAutoString & name={...})  Zeile 428 + 0x1e Bytes	C++
 	msvcp80.dll!std::basic_string<unsigned short,std::char_traits<unsigned short>,std::allocator<unsigned short> >::assign(const unsigned short * _Ptr=0x00000000)  Zeile 1080 + 0xc Bytes	C++
 	msvcp80.dll!std::basic_string<unsigned short,std::char_traits<unsigned short>,std::allocator<unsigned short> >::basic_string<unsigned short,std::char_traits<unsigned short>,std::allocator<unsigned short> >(const unsigned short * _Ptr=0x00000000)  Zeile 663	C++
 	FsDomNodeFirefox.dll!08343338() 	
 	FsDomNodeFirefox.dll!0834108f() 	
 	FsDomNodeFirefox.dll!0834d66b() 	
 	FsDomSrv.dll!078790c0() 	
 	FsDomSrv.dll!0787930a() 	
 	FsDomSrv.dll!07878f2f() 	
 	FsDomSrv.dll!0787adbf() 	
 	FsDomSrv.dll!0787e18d() 	
 	thunderbird.exe!nsAccessibleWrap::get_accState(tagVARIANT varChild={...}, tagVARIANT * pvarState=Unknown vt type = ...)  Zeile 509 + 0xf Bytes	C++
 	FsDomSrv.dll!0787e2fa() 	
 	FsDomSrv.dll!078637aa() 	
 	rpcrt4.dll!7647144a() 	
 	rpcrt4.dll!76466d7e() 	
 	rpcrt4.dll!764e03a2() 	
 	nspr4.dll!_MD_CURRENT_THREAD()  Zeile 298 + 0xc Bytes	C
 	nspr4.dll!_MD_CURRENT_THREAD()  Zeile 298 + 0xc Bytes	C
 	033e7700()	
 	nspr4.dll!PR_GetThreadPrivate(unsigned int index=1)  Zeile 232 + 0x5 Bytes	C
 	xpcom_core.dll!NS_LogDtor_P(void * aPtr=0x0012f31c, const char * aType=0x019c0f8c, unsigned int aInstanceSize=16)  Zeile 1065 + 0x15 Bytes	C++
 	thunderbird.exe!nsRect::~nsRect()  Zeile 76 + 0x11 Bytes	C++
 	thunderbird.exe!nsBaseHashtable<nsISupportsHashKey,nsCSSFrameConstructor::RestyleData,nsCSSFrameConstructor::RestyleData>::Count()  Zeile 113 + 0xf Bytes	C++
 	thunderbird.exe!nsCSSFrameConstructor::ProcessPendingRestyles()  Zeile 13292 + 0x15 Bytes	C++
 	smime3.dll!___security_cookie()  - 0x3ca0 Bytes	C
 	thunderbird.exe!nsAppShell::`vftable'()  + 0xf Bytes	C++
 	thunderbird.exe!006c078f() 	
 	00107d83()	
I backed out both patches in turn, and it is bug 404380 (the two-part check-in) that breaks things. Ginn, any idea why?
Depends on: 404380
Not at all.

If you commit the patch of bug 405414, is it fixed?
(In reply to comment #4)
> If you commit the patch of bug 405414, is it fixed?

You mean, resubmit patches for bug 404380 to my source, and in addition put patch for bug 405414 in? Will try that and report back here.
(In reply to comment #4)
> If you commit the patch of bug 405414, is it fixed?


Unfortunately not, it is only softened a bit. Instead of the first newsgroup message coming up blank, it will be the fourth or fifth one. After turning virtual cursor off and on again, again I can open 4 or 5 messages before it blanks out again.
So, that patch does have a positive impact somewhat, but it does not completely fix the issue.
Ginn, please take a look.
Assignee: aaronleventhal → ginn.chen
I tried feeds, but I didn't reproduce this on Solaris with Orca.

Macro, did this happen, if you don't use JAWS?

Marking as blocking since it is a regression...
Flags: blocking1.9? → blocking1.9+
Priority: -- → P4
reassign, since I'm not able to reproduce it and I don't have JAWS.
Assignee: ginn.chen → aaronleventhal
According to a member of the NVDA-Dev mailing list, this is also reproducible with NVDA starting with the November 28 build.
Aaron, I looked at my traces again, and found that one of them points to this spot:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/accessible/src/msaa/nsAccessibleWrap.cpp&mark=510#510

Question: In line 508, pvarChild is used without being checked if it is valid. What if it's NULL? Would that not mean that an invalid object gets addreffe'd, and could that cause the crash?
Attached file Call stack
This is another call stack. I used a debug build made from Trunk.
Marco, sometimes when I run Firefox in the debugger it throws an exception at that part of get_accState(). If I return S_OK there instead the problem goes away. However, it makes no sense to me, and I don't see what it has to do with Ginn's fix in bug 404380.
Keywords: crash
Severity: major → critical
Summary: Thunderbird: newsgroup or feed messages come up blank in virtual buffer, effective Nov 28, 2007 → Thunderbird: newsgroup/feed messages blank in virtual buffer, caused by bug 404380
Attached patch Wild guess #1 (obsolete) — Splinter Review
Attachment #297044 - Flags: review?(marco.zehe)
Attachment #297044 - Attachment is obsolete: true
Attachment #297059 - Flags: review?(marco.zehe)
Attachment #297044 - Flags: review?(marco.zehe)
Comment on attachment 297059 [details] [diff] [review]
For testing (but not for review). Just want to know if this affects it.

If anything, it appears to have made things even worse. Witht his, or the first idea, once it crashes, I'm hardly able to recover at all, except for shutting down and restarting Thunderbird.
Attachment #297059 - Flags: review?(marco.zehe)
Attachment #297059 - Attachment is obsolete: true
Marco, these are 2 different patches of bug 404380 that apply cleanly. You can try to see if applying either of them fixes the bug (you shouldn't have to reverse the patches, they're already reversed, so just apply them).
(In reply to comment #18)
> Wildguess-nsCaretAccessible -- just to help narrow down the problem

This one had an effect. The blanking out happened later, but when it happened, I could not recover from it again unless I shut down and restarted Thunderbird.

Applying the other wild guess did not have any effect.
Marco, does this patch help?

BTW, I think there is a chance that several of the cleanup patches I've been checking in or seeking review for may help. See patches in bug 412878 and bug 413778 as well.
Attachment #297866 - Attachment is obsolete: true
Attachment #297869 - Attachment is obsolete: true
Attachment #298852 - Flags: review?(surkov.alexander)
Comment on attachment 298852 [details] [diff] [review]
Null pointer check for mRootAccessible

> nsRect
> nsCaretAccessible::GetCaretRect(nsIWidget **aOutWidget)
> {
>   nsRect caretRect;
>+  NS_ENSURE_TRUE(mRootAccessible, caretRect);
>   NS_ENSURE_TRUE(aOutWidget, caretRect);

possibly I would prefer to check output argument firstly.

>   *aOutWidget = nsnull;
> 
>   if (!mLastTextAccessible) {
>     return caretRect;    // Return empty rect

r=me
Attachment #298852 - Flags: review?(surkov.alexander) → review+
Patch checked in, but we don't know if it's fixed yet.
Attachment #299190 - Flags: review?(surkov.alexander)
Comment on attachment 299190 [details] [diff] [review]
Another null check and move ClearControlSelectionListener() up


> 
> nsresult nsCaretAccessible::AddDocSelectionListener(nsIDOMDocument *aDoc)
> {
>+  NS_ENSURE_TRUE(mRootAccessible, NS_ERROR_FAILURE);
>+

Here mRootAccessible isn't used. Do you want to check it here because we shouldn't add doc selection listener if it is null, right?

Why do we get this case? Does we pass nsnull into nsCaretAccessible constructor or we call Shutdown already? Can you give me a hint why this happens?
Right. I don't actually know if it happens, because I can't duplicate the original bug.

But, looking at what we currently know -- I want to make absolutely sure it doesn't happen.

At the moment I can't build Thunderbird but I would like to make sure that with Marco's test case (opening and closing newsgroup messages in a new window), that adding/removing the selection listeners is symmetrical and occurs at the right times. If you can build Thunderbird then you could help me test that.
Attachment #299190 - Flags: review?(surkov.alexander) → review+
Should we add assertion additionally to easy track?
I definitely see an improvement, if not a restauration, of functionality with the 2008012804 build of Thunderbird. So the fix for bug 413778 definitely had a positive impact, yesterday's build was still broken.
I got the frequent crashes back when returning to reading bugmail messages. So, there seems to be a difference with newsgroupmessages, but for messages that take some time to load, like from gmail IMAP, which is where my bugmail comes from, it is still an issue. And I remember having opened and closed the last message before the crash, and the message where it crashed, in rather rapid succession, so this may be impacting it, too.
This really might be it. I see that RemoveDocSelectionListener() fails when I close a tab, because the doc doesn't know the presshell anymore. That would cause havoc.
Attachment #299914 - Flags: review?(surkov.alexander)
Comment on attachment 299914 [details] [diff] [review]
Sometimes doc->presshell was null -- but we still have it in nsDocAccessible mWeakShell so pass that in


>   /**
>    * Start listening to selection events for a given document
>    * More than one document's selection events can be listened to
>    * at the same time, by a given nsCaretAccessible
>-   * @param aDocument   Document to listen to selection events for.
>+   * @param aDocument   PresShell for document to listen to selection events from.

use aShell here instead of aDocument. Would be nice if you give some explanation here why we prefer to use presshell.

> 
>-nsresult nsCaretAccessible::AddDocSelectionListener(nsIDOMDocument *aDoc)
>+nsresult nsCaretAccessible::AddDocSelectionListener(nsIPresShell *aShell)

why you are here I would prefer to use the following sytax
>+nsresult
> +nsCaretAccessible::AddDocSelectionListener(nsIPresShell *aShell)
Attachment #299914 - Flags: review?(surkov.alexander) → review+
Thanks, just checked that in. We'll wait to see what Marco says the results are.
Depends on: 414574
Attachment #300135 - Flags: superreview?(bzbarsky)
Attachment #300135 - Flags: review?(bzbarsky)
Comment on attachment 300135 [details] [diff] [review]
Fix crash the last patch caused in bug 414574

I'm sorry, I can't review this.  Someone familiar with this logic needs to do that.

sr=me, though.
Attachment #300135 - Flags: superreview?(bzbarsky)
Attachment #300135 - Flags: superreview+
Attachment #300135 - Flags: review?(bzbarsky)
Attachment #300135 - Flags: review?
Attachment #300135 - Flags: review? → review?(surkov.alexander)
Comment on attachment 300135 [details] [diff] [review]
Fix crash the last patch caused in bug 414574

Whoever gets to this first, I'd like to get it in ASAP. If I'm not around please check it in for me, it's already marked blocking 1.9
Attachment #300135 - Flags: review?(ginn.chen)
Comment on attachment 300135 [details] [diff] [review]
Fix crash the last patch caused in bug 414574

r=me
Attachment #300135 - Flags: review?(surkov.alexander) → review+
Comment on attachment 300135 [details] [diff] [review]
Fix crash the last patch caused in bug 414574

Checked this in so that the stack overflow is fixed.
Attachment #300135 - Flags: review?(ginn.chen)
My crashes still happen in version 3.0a1pre (2008021103), and they are not swallowed by fix for bug 381049.
Update with latest source running as debug build: I no longer get any ::getState calls in my call list. A definite change from previous tries.
Here are some warnings that appear in the console right before the crash. Perhaps they'll help!
WARNING: NS_ENSURE_TRUE(mDOMNode) failed: file c:/Users/Marco/entwicklung/mozilla/accessible/src/base/nsDocAccessible.cpp, line 1794
WARNING: NS_ENSURE_TRUE(mDOMNode) failed: file c:/Users/Marco/entwicklung/mozilla/accessible/src/base/nsDocAccessible.cpp, line 1794
WARNING: NS_ENSURE_TRUE(mDOMNode) failed: file c:/Users/Marco/entwicklung/mozilla/accessible/src/base/nsDocAccessible.cpp, line 1794
WARNING: NS_ENSURE_TRUE(mDOMNode) failed: file c:/Users/Marco/entwicklung/mozilla/accessible/src/base/nsDocAccessible.cpp, line 1794
WARNING: NS_ENSURE_TRUE(selPrivate) failed: file c:/Users/Marco/entwicklung/mozilla/accessible/src/base/nsCaretAccessible.cpp, line 82
WARNING: NS_ENSURE_TRUE(selPrivate) failed: file c:/Users/Marco/entwicklung/mozilla/accessible/src/base/nsCaretAccessible.cpp, line 82 
Attached patch Patch for Marco to try (obsolete) — Splinter Review
Attachment #305523 - Flags: review?(marco.zehe)
Attachment #305524 - Flags: review?(marco.zehe)
Attachment #305523 - Attachment is obsolete: true
Attachment #305523 - Flags: review?(marco.zehe)
Fixed via checkin to bug 417249.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Verified fixed using version 3.0a1pre (2008022603).
Status: RESOLVED → VERIFIED
Attachment #305524 - Flags: review?(marco.zehe)
You need to log in before you can comment on or make changes to this bug.