Closed Bug 434644 Opened 16 years ago Closed 16 years ago

Crash [@ nsMsgImapHdrXferInfo::GetHeader] when accessing IMAP folder

Categories

(MailNews Core :: Networking: IMAP, defect)

defect
Not set
blocker

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.9

People

(Reporter: gkw, Assigned: standard8)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(3 files, 7 obsolete files)

Attached file stacktrace
Bug 434493 got checked in and subsequently broke accessing IMAP folders to download mail.

Regression range:
2008-05-19-03 didn't crash.
version 3.0a2pre (2008051916) hourly crashed though.

http://crash-stats.mozilla.com/report/index/dd247ffa-25fc-11dd-8236-001cc45a2ce4?p=1 is the Breakpad report but I'm doubtful about its relevance.

STR: (fresh profile)

1: Create a new account, entering a Gmail IMAP account setting, ensuring that you head to account settings to enable SSL.
2: Password dialog comes out, enter password.
3: Thunderbird crashes.
Attached patch proposed fix (obsolete) — Splinter Review
we're clearing out the data in m_hdrInfo's elements, so we'd better tell it there are no more elements :-)

I'm not exactly sure why this broke, but this does fix the problem...
Attachment #321679 - Flags: superreview?(dmose)
Attachment #321679 - Flags: review?(dmose)
I verify that the patch fixes the crash. The IMAP messages download properly with the patch.
The reason this broke has something to do with the change here: 

https://bugzilla.mozilla.org/attachment.cgi?id=321590&action=diff#mozilla/mailnews/imap/src/nsImapProtocol.cpp_sec2

What's less clear to me is whether this fix is the right one.  The Clear call added actually causes the underlying objects in the array to be released.  It's not clear to me why this is desirable, at least in part because nsImapHeaderXferInfo.idl doesn't contain any documentation about how that interface is supposed to work.  Can you at least add IDL dox to that file so that the next person who has to touch this could has some chance of understanding what's going on in more detail?  Bonus points if you add an xpcshell unit test!
Attachment #321679 - Flags: superreview?(dmose)
Attachment #321679 - Flags: review?(dmose)
Attachment #321679 - Flags: review-
(In reply to comment #1)
> Created an attachment (id=321679) [details]
> proposed fix
> 
> we're clearing out the data in m_hdrInfo's elements, so we'd better tell it
> there are no more elements :-)

That doesn't make sense. ResetAll is called after we downloaded a header:

http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/mailnews/imap/src/nsImapProtocol.cpp&rev=1.681&mark=3548,3564#3548

Whereas ReleaseAll is called after we've finished a set of downloads:

http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/mailnews/imap/src/nsImapProtocol.cpp&rev=1.681&mark=3907,3911#3907

So as I understand it, ResetAll is allowing us to minimise allocations whilst downloading a group of headers. If we now clear the elements out of that during ResetAll, we're going to loose this optimisation.

> I'm not exactly sure why this broke, but this does fix the problem...

ditto. I wonder if Neil has any ideas.

My only current thought is to put a nsCOMPtr wrapper around hdrInfo, but I don't think that would work as nsCOMArray should be holding onto the object.
Flags: blocking-thunderbird3.0a1?
Peter, 3.0a1 has already been released (yes we should remove that flag now), so moving request to 3.0a2
Flags: blocking-thunderbird3.0a1? → blocking-thunderbird3.0a2?
Attached patch Correct the reference counting (obsolete) — Splinter Review
I think this is the right fix. I'd changed GetFreeHeaderInfo to do SafeObjectAt, SafeObjectAt doesn't refcount on the way out, therefore we need to addref it manually.

This would also explain why David's Clear() was working in ResetAll - it meant we weren't activating the broken code.
Assignee: bienvenu → bugzilla
Attachment #321679 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #321738 - Flags: superreview?(bienvenu)
Attachment #321738 - Flags: review?(dmose)
Attached patch Correct the reference counting (obsolete) — Splinter Review
Opps, that was an old version, this one compiles and works.
Attachment #321738 - Attachment is obsolete: true
Attachment #321739 - Flags: superreview?(bienvenu)
Attachment #321739 - Flags: review?(dmose)
Attachment #321738 - Flags: superreview?(bienvenu)
Attachment #321738 - Flags: review?(dmose)
Comment on attachment 321739 [details] [diff] [review]
Correct the reference counting

>+  if (aResult)
>+  {
>+    NS_ADDREF(*aResult);
You fixed this one but not the one two lines above ;-)
Attachment #321739 - Attachment is obsolete: true
Attachment #321739 - Flags: superreview?(bienvenu)
Attachment #321739 - Flags: review?(dmose)
Attached patch Alternate method (obsolete) — Splinter Review
Dan suggested unit tests, so here's a compiled code test (there's no other way as nsImapHdrXferInfo isn't registered as a component), and some other tidy up (apologies for that, but it made writing the test easier, and fixed a couple of other potential problems).

As regards to the crasher problem, StartNewHdr (was GetFreeHeaderInfo) now returns the pointer to the new header structure. This is fine as we're storing a ref-counted local copy (in the nsCOMArray) and when we call the function we assign to an nsCOMPtr which will do ref counting on the way in. Hence we can just drop NS_ADDREFs from the function.

I've also included Neil's suggested optimisation for pre-assigning the size of the array, ensured that a caller of GetHeader can't get one we officially haven't got, dropped GetCurrentHdrInfo which wasn't defined anywhere, and renamed GetFreeHeaderInfo to StartNewHdr to avoid just calling a function and passing it on for the sake of it.

I'm not sure the test case really covers the crash case, I couldn't really see a way of ensuring the memory/refcounts weren't bad or anything, however I think it covers the rest of the class pretty well.
Attachment #321758 - Flags: superreview?(bienvenu)
Attachment #321758 - Flags: review?(dmose)
Do tests run in debug or release builds? If debug builds, then the crash case could be covered by setting some values on the header info and then verifying that the header info still has the same values after manipulating the objects in the same way as the code does - since debug builds will overwrite the memory after releasing the objects.

FYI, the only reason these objects use XPCOM and idl at all is because they needed to get proxied over to the UI thread - they're not meant to be used by any other piece of code, which is why they're not registered as components.
(In reply to comment #10)
> Do tests run in debug or release builds? If debug builds, then the crash case
> could be covered by setting some values on the header info and then verifying
> that the header info still has the same values after manipulating the objects
> in the same way as the code does - since debug builds will overwrite the memory
> after releasing the objects.

Unfortunately, the tinderboxes run in release mode.

> FYI, the only reason these objects use XPCOM and idl at all is because they
> needed to get proxied over to the UI thread - they're not meant to be used by
> any other piece of code, which is why they're not registered as components.
> 

Yeah I gathered that, I wasn't complaining, just pointing it out.
Comment on attachment 321758 [details] [diff] [review]
Alternate method

some unrelated mailnews/base changes snuck into the patch...

+  // If the header index is more than (or equal to) our next pointer, then its
+  // a header we haven't really got and the caller has done something wrong.


maybe "next free"?

I'll try running with this patch ASAP.

compiling the test directory fails for me on windows with nsMsgImapHdrXferInfo::ReleaseAll, ResetAll, StartNewHdr, and the constructor undefined. Perhaps I did something wrong, or perhaps it's a problem with non-static builds.
The patch definitely fixes the problem, and seems to work fine. No luck getting the tests to build, however. It doesn't seem to be trying to link either msgbsutil or imap. I'm not doing an objdir build either, though I don't think that should matter.
I changed imap/test/Makefile.in locally to look like this and got it to build. I doubt it's the right way to fix it, but at least it builds for me now.

ifdef MOZ_STATIC_MAIL_BUILD
LIBS +=		\
		../../base/util/$(LIB_PREFIX)msgbsutl_s.$(LIB_SUFFIX) \
		../src/$(LIB_PREFIX)msgimap_s.$(LIB_SUFFIX) \
		$(NULL)
else

ifeq ($(USE_SHORT_LIBNAME), 1)
LIBS +=		\
#		../../base/util/$(LIB_PREFIX)msgbsutl.$(LIB_SUFFIX) \
		../src/$(LIB_PREFIX)msgimap_s.$(LIB_SUFFIX) \
		$(NULL)
else
LIBS +=		\
#		../../base/util/$(LIB_PREFIX)msgbaseutil.$(LIB_SUFFIX) \
		../src/$(LIB_PREFIX)msgimap_s.$(LIB_SUFFIX) \
		$(NULL)
endif
endif
Attached patch Alternate method v2 (obsolete) — Splinter Review
Updated method to fix comment and hopefully fix non-static build problems.
Attachment #321758 - Attachment is obsolete: true
Attachment #321806 - Flags: superreview?(bienvenu)
Attachment #321806 - Flags: review?(dmose)
Attachment #321758 - Flags: superreview?(bienvenu)
Attachment #321758 - Flags: review?(dmose)
Attached patch Alternate method v3 (obsolete) — Splinter Review
This should be better - use EXPAND_LIBNAME to get the library right on the linker line.
Attachment #321806 - Attachment is obsolete: true
Attachment #321813 - Flags: superreview?(bienvenu)
Attachment #321813 - Flags: review?(dmose)
Attachment #321806 - Flags: superreview?(bienvenu)
Attachment #321806 - Flags: review?(dmose)
Attached patch Alternate method v4 (obsolete) — Splinter Review
This one works for me and David now :-)
Attachment #321813 - Attachment is obsolete: true
Attachment #321816 - Flags: superreview?(bienvenu)
Attachment #321816 - Flags: review?(dmose)
Attachment #321813 - Flags: superreview?(bienvenu)
Attachment #321813 - Flags: review?(dmose)
Comment on attachment 321816 [details] [diff] [review]
Alternate method v4

looks good
Attachment #321816 - Flags: superreview?(bienvenu) → superreview+
Comment on attachment 321816 [details] [diff] [review]
Alternate method v4

Looks good; r=dmose with a few tweaks:

>@@ -144,6 +145,12 @@ NS_IMETHODIMP nsMsgImapHdrXferInfo::GetN
> 
> NS_IMETHODIMP nsMsgImapHdrXferInfo::GetHeader(PRInt32 hdrIndex, nsIImapHeaderInfo **aResult)
> {
>+  // If the header index is more than (or equal to) our next free pointer, then
>+  // its a header we haven't really got and the caller has done something
>+  // wrong.
>+  if (hdrIndex >= m_nextFreeHdrInfo)
>+    return NS_ERROR_NULL_POINTER;

If the caller has made a mistake here, we should assert as well as returning an error in release builds; otherwise this may never get noticed.

>diff -r 91bbb8489ff6 mailnews/imap/src/nsImapProtocol.h
>--- a/mailnews/imap/src/nsImapProtocol.h	Tue May 20 16:17:33 2008 +0100
>+++ b/mailnews/imap/src/nsImapProtocol.h	Tue May 20 21:14:22 2008 +0100
>@@ -126,16 +126,14 @@ public:
>   NS_DECL_NSIIMAPHEADERXFERINFO
>   nsMsgImapHdrXferInfo();
>   virtual ~nsMsgImapHdrXferInfo();
>+  void    ResetAll(); // reset HeaderInfo's for re-use

Might as well get rid of the spurious apostrophe as long as you're touching this line.

>
>+DEPTH		= ../../..
>+topsrcdir	= @top_srcdir@
>+srcdir		= @srcdir@
>+VPATH		= @srcdir@

Indentation issue in the srcdir line.

>+LOCAL_INCLUDES = \
>+		-I$(topsrcdir)/xpcom/tests \
>+		-I$(topsrcdir)/mailnews/base/util \
>+		-I$(srcdir)/../src \
>+		$(NULL)
>+
>+ifndef MOZILLA_INTERAL_API

s/INTERAL/INTERNAL/

Might want to re-test after fixing this bit.

>+// Matches definition in nsImapProtocol.cpp
>+static const PRInt32 kNumHdrsToXfer = 10;

You could consider moving the original definition from being a static const in the .cpp file to just being a #define (or non-static const) in the .h file; then you wouldn't need to replicate it here.

>+// hdrArrayCheck:
>+// 0 = don't check, just add hdr to hdrArray
>+// 1 = check hdrArray and hdr are the same
>+// 2 = check hdrArray and hdr are different

This wants to be an |enum| for code readability.
Attachment #321816 - Flags: review?(dmose) → review+
This is what I've actually checked in so that it would be fixed for today's nightlies.

I didn't alter the tabs in the Makefile.in as they are exactly the same as mozilla/Makefile.in and display fine on mxr and emacs, just look a bit strange on plain text.

At the moment, I've put an NS_WARNING into the code rather than an NS_ASSERTION. I've also added an fprintf to the test file to say that the warning is expected.

The way the unit tests are run, are such that if an assertion is raised it will cause the test to fail.

So if we keep the test in that checks this case, we can't have the message as an assertion.

I'm happy to keep it as it is or to remove the test and make it an assertion. I think whilst we may be able to disable assertion checking on the unit test side that it is not really a good idea to do that.
Attachment #321816 - Attachment is obsolete: true
Flags: blocking-thunderbird3.0a2? → in-testsuite+
OS: Mac OS X → All
Hardware: PC → All
Target Milestone: --- → mozilla1.9
Ok, so including what I couldn't do in comment 20, I had to comment out this part of the code:

+    case eCheckDifferent:
+      if (hdrArray[i] == hdr)
+        return 2;
+      break;

This was failing on the SeaMonkey Unit test tinderboxes (in release mode). I just realised it wasn't failing on the Linux cb-sea-linux-tbox Dep Nightly box which is weird.

I'm going to build Linux in the morning and see if I can replicate any of this, it might be we just have to allocate an extra hdr so that if there's reuse of the memory the pointers get offset.
(In reply to comment #20)
> At the moment, I've put an NS_WARNING into the code rather than an
> NS_ASSERTION.
> 
> The way the unit tests are run, are such that if an assertion is raised it will
> cause the test to fail.
Doesn't NS_ENSURE_TRUE combine a warning with an error return?
A trunk build 3.0a2pre from yesterday's code doesn't crash anymore. Is this bug waiting on the unit tests to get marked resolved fixed?
(In reply to comment #23)
> A trunk build 3.0a2pre from yesterday's code doesn't crash anymore. Is this bug
> waiting on the unit tests to get marked resolved fixed?
> 
Yes, I'm currently coming up with the fix for those.
Follow up to this bug. I've changed the NS_WARNING into an NS_ENSURE_TRUE as suggested by Neil.

I've completely removed the check for different pointers after a ReleaseAll. Given the potential for memory reuse in different ways, I'm not sure there is a sensible way of testing this. I'm also not convinced it we should expect different pointers each time anyway.
Attachment #323542 - Flags: superreview?(bienvenu)
Attachment #323542 - Flags: review?(bienvenu)
Comment on attachment 323542 [details] [diff] [review]
[checked in] Follow-up

no, I don't think we should expect different pointers
Attachment #323542 - Flags: superreview?(bienvenu)
Attachment #323542 - Flags: superreview+
Attachment #323542 - Flags: review?(bienvenu)
Attachment #323542 - Flags: review+
Attachment #323542 - Attachment description: Follow-up → [checked in] Follow-up
All issues resolved, so this bug is fixed now.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Product: Core → MailNews Core
Crash Signature: [@ nsMsgImapHdrXferInfo::GetHeader]
You need to log in before you can comment on or make changes to this bug.