Last Comment Bug 856577 - crash in nsMovemailService::GetNewMail
: crash in nsMovemailService::GetNewMail
Status: RESOLVED FIXED
: crash
Product: MailNews Core
Classification: Components
Component: Movemail (show other bugs)
: Trunk
: All Linux
: -- critical (vote)
: Thunderbird 24.0
Assigned To: :aceman
:
Mentors:
Depends on: 858238
Blocks:
  Show dependency treegraph
 
Reported: 2013-04-01 05:39 PDT by Wayne Mery (:wsmwk, NI for questions)
Modified: 2013-05-16 13:51 PDT (History)
5 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
patch (15.24 KB, patch)
2013-04-02 15:43 PDT, :aceman
no flags Details | Diff | Splinter Review
patch v2 (15.27 KB, patch)
2013-04-03 12:47 PDT, :aceman
Pidgeot18: review-
Details | Diff | Splinter Review
patch v3 (21.96 KB, patch)
2013-04-04 13:46 PDT, :aceman
Pidgeot18: review-
Details | Diff | Splinter Review
patch v4 (23.14 KB, patch)
2013-04-05 11:13 PDT, :aceman
Pidgeot18: review+
Details | Diff | Splinter Review
patch v5 (23.05 KB, patch)
2013-04-15 10:22 PDT, :aceman
acelists: review+
Details | Diff | Splinter Review
patch v6 (23.56 KB, patch)
2013-05-16 13:40 PDT, :aceman
acelists: review+
Details | Diff | Splinter Review

Description Wayne Mery (:wsmwk, NI for questions) 2013-04-01 05:39:28 PDT
There are two stack variations of this signature, varying by frame 1
nsMovemailIncomingServer::PerformBiff [1] ~90% ?
nsMovemailIncomingServer::GetNewMail [2]  ~10% ?

Not a regression afaict - crashes go back several versions, eg cited in bug 787605 comment 12. 

[1] bp-456afaf3-ff2d-4a5a-a4ef-945342121218 .
0	libxul.so	nsMovemailService::GetNewMail	nsMovemailService.cpp:465
1	libxul.so	nsMovemailIncomingServer::PerformBiff	nsMovemailIncomingServer.cpp:74
2	libxul.so	NS_InvokeByIndex_P	xptcinvoke_x86_64_unix.cpp:164
3	libxul.so	XPCWrappedNative::CallMethod	XPCWrappedNative.cpp:3106
4	libxul.so	XPC_WN_CallMethod	XPCWrappedNativeJSOps.cpp:1478
5	libxul.so	js::InvokeKernel	jscntxtinlines.h:372
6	libxul.so	js::Interpret	jsinterp.cpp:2414
7	libxul.so	js::RunScript	jsinterp.cpp:309 

[2] nsMovemailIncomingServer::GetNewMail
bp-b691bbd7-5419-445f-9f8b-f10992130318
0	libxul.so	nsMovemailService::GetNewMail	nsMovemailService.cpp:465
1	libxul.so	nsMovemailIncomingServer::GetNewMail	nsMovemailIncomingServer.cpp:202
2	libxul.so	nsMsgLocalMailFolder::GetNewMessages	nsLocalMailFolder.cpp:1904
3	libxul.so	NS_InvokeByIndex_P	xptcinvoke_x86_64_unix.cpp:164
4	libxul.so	XPCWrappedNative::CallMethod	XPCWrappedNative.cpp:3067
Comment 1 Dave Wolfe 2013-04-01 06:00:40 PDT
(In reply to Wayne Mery (:wsmwk) from comment #0)
> There are two stack variations of this signature, varying by frame 1
> nsMovemailIncomingServer::PerformBiff [1] ~90% ?
> nsMovemailIncomingServer::GetNewMail [2]  ~10% ?

Since I received the notification shortly after my Thunderbird crashed and I let it send the notification, I assume this is in response to that crash, so I'll follow-up. This crash is a long standing problem that may not actually be Thunderbird's fault since it results from the first character missing in the mailbox being collected, i.e. the first line of the first message starts "rom [...]" instead of "From [...]". Whether the culprit is Thunderbird (unlikely IMHO), procmail, postfix, fetchmail, or something else in the custody chain is the culprit is unknown, at least by me, but it happens from time to time, probably due to some race condition between two or more of these programs while delivering mail. That said, Thunderbird could be more defensive in handling the situation by recognizing the invalid mbox format and reporting it as such rather than crashing, and possibly even attempting to correct the problem.
Comment 2 :aceman 2013-04-02 08:11:34 PDT
Do you have a self-compiled TB or something? The crash report does not have links to source code lines. Can you test a C++ patch if we create one?
Comment 3 Dave Wolfe 2013-04-02 08:23:16 PDT
No, standard Ubuntu 12.04 release TB. I have gcc and headers installed but I doubt that I have everything needed to build TB and have never done it. Note that the crash is simple to reproduce, just edit the source mbox to remove the 1st character from a message (the 'F' in the From header).
Comment 4 :aceman 2013-04-02 08:25:29 PDT
Can you write steps how to create such a broken movemail account? I have never used one before. But I could create the patch (depending on the cause) so bear with me:)
Comment 5 Dave Wolfe 2013-04-02 08:41:28 PDT
Well, for Ubuntu, do the following:
1) Set up a movemail server account if you don't already have one.
2) Stop TB and wait for mail to arrive in /var/mail/<username> where "<username>" is the movemail account name. Copying a TB mbox file (the file without an extension, not the .msf file) will work too.
3) Edit /var/mail/<username> to remove the 1st character from the 1st line.
4) Start TB and it'll crash.
Comment 6 :aceman 2013-04-02 08:43:56 PDT
Thanks, I'll try that out. Does the size of the /var/mail/account file matter?
Comment 7 Dave Wolfe 2013-04-02 08:47:44 PDT
Not that I've noticed, just needs to be a valid email in every other respect. Try starting TB with the message before editing to be sure it accepts it.
Comment 8 :aceman 2013-04-02 13:42:11 PDT
OK, confirming the crash, when following the steps from comment 4:
#2  0xb406f06c in ah_crap_handler (signum=11) at /var/SSD/TB-hg/mozilla/toolkit/xre/nsSigHandlers.cpp:88
#3  0xb4073b36 in nsProfileLock::FatalSignalHandler (signo=11, info=0xbf89d00c, context=0xbf89d08c) at /var/SSD/TB-hg/tbird-bin/mozilla/toolkit/profile/nsProfileLock.cpp:190
#4  <signal handler called>
#5  mozalloc_abort (msg=0xbf89d3d4 "###!!! ABORT: You can't dereference a NULL nsCOMPtr with operator->().: 'mRawPtr != 0', file ../../../dist/include/nsCOMPtr.h, line 783")
    at /var/SSD/TB-hg/mozilla/memory/mozalloc/mozalloc_abort.cpp:30
#6  0xb55f593b in Abort (aMsg=<optimized out>) at /var/SSD/TB-hg/mozilla/xpcom/base/nsDebugImpl.cpp:430
#7  NS_DebugBreak (aSeverity=3, aStr=0xb5974139 "You can't dereference a NULL nsCOMPtr with operator->().", aExpr=0xb597412c "mRawPtr != 0", aFile=
    0xb5a81fd8 "../../../dist/include/nsCOMPtr.h", aLine=783) at /var/SSD/TB-hg/mozilla/xpcom/base/nsDebugImpl.cpp:417
#8  0xb4095bda in nsCOMPtr<nsIOutputStream>::operator-> (this=0xbf89d86c) at ../../../dist/include/nsCOMPtr.h:783
#9  0xb5036c5b in nsMovemailService::GetNewMail (this=0xa5ca2680, aMsgWindow=0xa6d8d5b0, aUrlListener=0x0, aMsgFolder=0xa6b44264, movemailServer=0xaa14ed64, aURL=0x0)
    at /var/SSD/TB-hg/mailnews/local/src/nsMovemailService.cpp:465
#10 0xb50376e3 in nsMovemailIncomingServer::GetNewMail (this=0xaa14ece0, aMsgWindow=0xa6d8d5b0, aUrlListener=0x0, aMsgFolder=0xa6b44264, aResult=0x0)
    at /var/SSD/TB-hg/mailnews/local/src/nsMovemailIncomingServer.cpp:194
#11 0xb500f8f5 in nsMsgLocalMailFolder::GetNewMessages (this=0xa6b44240, aWindow=0xa6d8d5b0, aListener=0x0) at /var/SSD/TB-hg/mailnews/local/src/nsLocalMailFolder.cpp:1882
#12 0xb4f3952a in nsMsgIncomingServer::GetNewMessages (this=0xaa14ece0, aFolder=0xa6b44264, aMsgWindow=0xa6d8d5b0, aUrlListener=0x0)
    at /var/SSD/TB-hg/mailnews/base/util/nsMsgIncomingServer.cpp:178

We probably do not handle the case the file is invalid. What do you propose to do in such case? In case of success, we copy the messages to our inbox and clear the /var/mail/file . What if we fail? Do we just report an error and leave the file as is?
Comment 9 Dave Wolfe 2013-04-02 13:58:45 PDT
(In reply to :aceman from comment #8)

> We probably do not handle the case the file is invalid. What do you propose
> to do in such case? In case of success, we copy the messages to our inbox
> and clear the /var/mail/file . What if we fail? Do we just report an error
> and leave the file as is?

That would be acceptable to me. It's probably a lot to ask to detect and correct this degenerate case of the missing 'F'. Otherwise, if the file is invalid, what can TB do? Is there any point in continuing? At least terminating gracefully is preferable to crashing. Maybe ask what to do, terminate or continue in case there are other accounts.
Comment 10 :aceman 2013-04-02 15:43:55 PDT
Created attachment 732571 [details] [diff] [review]
patch

Ok, so throw up a nice error to the user (the infrastructure in the movemail service is good for this), clean up locks but leave the file untouched.
Add some more error checking where useful.
Comment 11 neil@parkwaycc.co.uk 2013-04-03 05:26:21 PDT
Comment on attachment 732571 [details] [diff] [review]
patch

>+4053=Unable to parse spool file %S. The file format may not be valid.
How about "The file may be corrupt."

>+  const PRUnichar *spoolPathString[] = {
>+    NS_ConvertUTF8toUTF16(spoolPath).get()
>+  };
This is unsafe as NS_ConvertUTF8toUTF16's destructor will run, freeing the memory pointed to by get(). The code you copied from probably got away with it by not having any allocations before it was next used. The safe way is to write something like this:
NS_ConvertUTF8toUTF16 wideSpoolPath(spoolPath);
const PRUnichar* spoolPathString[] = { wideSpoolPath.get(); }

>+      // If we do not have outputStream not, something bad happened.
Typo or something?
Comment 12 :aceman 2013-04-03 12:47:52 PDT
Created attachment 732981 [details] [diff] [review]
patch v2

Thanks.
Comment 13 Joshua Cranmer [:jcranmer] 2013-04-03 17:36:06 PDT
Comment on attachment 732981 [details] [diff] [review]
patch v2

Review of attachment 732981 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/locales/en-US/chrome/messenger/localMsgs.properties
@@ +158,5 @@
>  4037=Unable to locate mail spool file.
>  
> +## @name MOVEMAIL_SPOOL_FILE_NOT_VALID
> +## @loc %S is file name
> +4053=Unable to parse spool file %S. The file may be corrupt or not valid.

Ew, ew, ew, ew. These number-based localization schemes are crap. At least file a bug on moving us away from this stuff...

::: mailnews/local/src/nsMovemailService.cpp
@@ +450,2 @@
>  
> +    if (isMore && StringBeginsWith(buffer, msgHeader)) {

StringBeginsWith(buffer, NS_LITERAL_CSTRING("From ")) perhaps?

At the very least, 'msgHeader' is completely the wrong name for the Berkeley mailbox delimiter.

@@ +454,5 @@
>          outputStream->Flush();
>          newMailParser->PublishMsgHeader(nullptr);
> +        rv = msgStore->FinishNewMessage(outputStream, newHdr);
> +        if (NS_FAILED(rv))
> +          break;

I don't like this structure of handling errors in the main while loop--early returns would be preferable than a break, so that subsequent code could assume it parsed correctly after exiting the loop.

Could you try refactoring the ObtainSpoolLock/YieldSpoolLock into a RAII helper?

@@ +485,4 @@
>      newMailParser->HandleLine(buffer.BeginWriting(), buffer.Length());
> +    rv = outputStream->Write(buffer.get(), buffer.Length(), &bytesWritten);
> +    if (NS_FAILED(rv) || (bytesWritten < buffer.Length()))
> +      break;

Under your new logic, if the write doesn't fully complete, the user's mailbox will be irrevocably truncated (rv == success but not all bytes written). Just having the NS_ENSURE_SUCCESS(rv, rv) here (assuming the RAII helper) would cause data loss only of potentially the last line of a message.
Comment 14 :aceman 2013-04-03 23:32:52 PDT
(In reply to Joshua Cranmer [:jcranmer] from comment #13)
> @@ +454,5 @@
> >          outputStream->Flush();
> >          newMailParser->PublishMsgHeader(nullptr);
> > +        rv = msgStore->FinishNewMessage(outputStream, newHdr);
> > +        if (NS_FAILED(rv))
> > +          break;
> 
> I don't like this structure of handling errors in the main while loop--early
> returns would be preferable than a break, so that subsequent code could
> assume it parsed correctly after exiting the loop.
I just break the loop to get to the lock removing code.
 
> Could you try refactoring the ObtainSpoolLock/YieldSpoolLock into a RAII
> helper?
I don't know what that is.
Comment 15 :aceman 2013-04-04 07:30:19 PDT
Should the lock be converted to some object that automatically releases in its destructor?
Comment 16 Joshua Cranmer [:jcranmer] 2013-04-04 07:57:58 PDT
(In reply to :aceman from comment #14)
> > Could you try refactoring the ObtainSpoolLock/YieldSpoolLock into a RAII
> > helper?
> I don't know what that is.

RAII (resource acquisition is initialization) is a C++ idiom where an object's constructor does work to acquire a resource and the destructor releases it.

<http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/Mutex.h#137> is an example of such a class.
Comment 17 :aceman 2013-04-04 13:46:06 PDT
Created attachment 733532 [details] [diff] [review]
patch v3

OK, so something like this?
Comment 18 Joshua Cranmer [:jcranmer] 2013-04-04 15:27:47 PDT
Comment on attachment 733532 [details] [diff] [review]
patch v3

Review of attachment 733532 [details] [diff] [review]:
-----------------------------------------------------------------

I'd like another look over the code after these changes.

::: mailnews/local/src/nsLocalStrings.h
@@ +30,5 @@
>  #define MOVEMAIL_CANT_CREATE_LOCK                             4034
>  #define MOVEMAIL_CANT_DELETE_LOCK                             4035
>  #define MOVEMAIL_CANT_TRUNCATE_SPOOL_FILE                     4036
>  #define MOVEMAIL_SPOOL_FILE_NOT_FOUND                         4037
> +#define MOVEMAIL_SPOOL_FILE_NOT_VALID                         4053

Move this to the end of the list, please to keep the numbers in order.

::: mailnews/local/src/nsMovemailService.cpp
@@ +147,3 @@
>  
> +SpoolLock::~SpoolLock() {
> +  if (!YieldSpoolLock()) {

if (mLocked && !YieldSpoolLock())

@@ +431,5 @@
>    rv = in_server->GetRootFolder(getter_AddRefs(serverFolder));
>    NS_ENSURE_SUCCESS(rv, rv);
>  
>    rootMsgFolder = do_QueryInterface(serverFolder, &rv);
> +  NS_ENSURE_TRUE(NS_SUCCEEDED(rv) && rootMsgFolder, rv);

Er, if you're already changing this, just remove the rootMsgFolder variable and replace it with the serverFolder. It's silly to be QI'ing nsIMsgFolder to nsIMsgFolder.

@@ +512,4 @@
>        buffer.AssignLiteral("X-Mozilla-Status2: 00000000" MSG_LINEBREAK);
>        newMailParser->HandleLine(buffer.BeginWriting(), buffer.Length());
> +      rv = outputStream->Write(buffer.get(), buffer.Length(), &bytesWritten);
> +      NS_ENSURE_TRUE(NS_SUCCEEDED(rv) && (bytesWritten == buffer.Length()), NS_ERROR_FAILURE);

Nit: 80-char line width violations.

::: mailnews/local/src/nsMovemailService.h
@@ +20,5 @@
>  class nsIMsgFolder;
>  
> +class nsMovemailService;
> +
> +class SpoolLock

Move the class definition into the .cpp file, and wrap it in an anonymous namespace:

namespace {
class SpoolLock
{
};
}

SpoolLock::SpoolLock() { /* ... */ }

@@ +55,5 @@
>    NS_DECL_NSIMSGPROTOCOLINFO
> +
> +  friend class SpoolLock;
> +protected:
> +  void Error(int32_t errorCode, const PRUnichar **params, uint32_t length);

Might as well make this public and drop the friend declaration.
Comment 19 :aceman 2013-04-05 11:13:41 PDT
Created attachment 733951 [details] [diff] [review]
patch v4
Comment 20 Joshua Cranmer [:jcranmer] 2013-04-14 13:13:30 PDT
Comment on attachment 733951 [details] [diff] [review]
patch v4

Review of attachment 733951 [details] [diff] [review]:
-----------------------------------------------------------------

Tested this with my local movemail file and it works. Just a lot of style nits:

::: mailnews/local/src/nsMovemailService.cpp
@@ -35,5 @@
>  
>  #include "nsILineInputStream.h"
>  #include "nsISeekableStream.h"
>  #include "nsNetUtil.h"
> -#include "nsAutoPtr.h"

Nit: keep this one too.

@@ +64,5 @@
>  };
>  #define NUM_DEFAULT_SPOOL_PATHS (sizeof(gDefaultSpoolPaths)/sizeof(gDefaultSpoolPaths[0]))
>  
> +namespace {
> +class SpoolLock

Nit:
class MOZ_STACK_CLASS SpoolLock

(I didn't mention this earlier because the MOZ_STACK_CLASS hadn't been checked in yet).

@@ +156,5 @@
> + * @param aSpoolName  The path to the spool file.
> + * @param aSeconds    The number of seconds to retry the locking.
> + * @param aMovemail   The movemail service requesting the lock.
> + * @param aServer
> + */

Nit: move documentation to the declaration.

@@ +160,5 @@
> + */
> +SpoolLock::SpoolLock(nsACString *aSpoolName, int aSeconds,
> +                     nsMovemailService &aMovemail,
> +                     nsIMsgIncomingServer *aServer):
> +  mLocked(false)

The preferred style for these constructs are:
Foo::Foo(args)
: mMember1(init),
  mMember2(bar)
{
  // Code goes here
}

@@ +164,5 @@
> +  mLocked(false)
> +{
> +  mSpoolName = *aSpoolName;
> +  mOwningService = &aMovemail;
> +  mServer = aServer;

Also, move these to the C++ initializer format as well.

@@ +183,5 @@
> +    lockFile.AppendLiteral(LOCK_SUFFIX);
> +    const PRUnichar* params[] = { lockFile.get() };
> +    mOwningService->Error(MOVEMAIL_CANT_DELETE_LOCK, params, 1);
> +  } else {
> +    mLocked = false;

This else branch is unneeded.

@@ -482,3 @@
>      outputStream->Close();
>    }
> -

Nit: don't delete the extra line here.

::: mailnews/local/src/nsMovemailService.h
@@ -6,5 @@
>  #ifndef nsMovemailService_h___
>  #define nsMovemailService_h___
>  
>  #include "nscore.h"
> -#include "nsCOMPtr.h"

Don't remove this, you're using nsCOMPtr in the header.

@@ +21,5 @@
>    NS_DECL_ISUPPORTS
>    NS_DECL_NSIMOVEMAILSERVICE
>    NS_DECL_NSIMSGPROTOCOLINFO
> +
> +  void Error(int32_t errorCode, const PRUnichar **params, uint32_t length);

Nit: empty line afterwards.
Comment 21 :aceman 2013-04-15 10:22:17 PDT
Created attachment 737566 [details] [diff] [review]
patch v5

OK, thanks.
Comment 22 :aceman 2013-04-15 13:16:12 PDT
Let's hold this off until bug 858238 lands.
Comment 23 :aceman 2013-05-16 13:40:14 PDT
Created attachment 750656 [details] [diff] [review]
patch v6

Unbitrotted after the blocking bug has landed.
Comment 24 Ryan VanderMeulen [:RyanVM] 2013-05-16 13:51:31 PDT
https://hg.mozilla.org/comm-central/rev/98c3da4179af

Note You need to log in before you can comment on or make changes to this bug.