Segv in morkNode::SlotStrongNode | morkAtomRowMap::SlotStrongAtomRowMap - crash (debian)

NEW
Unassigned

Status

MailNews Core
Database
--
critical
2 years ago
6 months ago

People

(Reporter: Guido Günther, Unassigned, NeedInfo)

Tracking

({crash})

x86_64
Linux
crash

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
Build ID: 20160507231935

Steps to reproduce:

Random, various actions, but always the same crash. Details are in this Debian bug report:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=827267


Actual results:

Thunderbird crashed. The backtrace is here:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=827267;filename=gdb.txt;msg=10
(Reporter)

Updated

2 years ago
OS: Unspecified → Linux
Hardware: Unspecified → x86_64

Comment 1

2 years ago
#0  0x0000001400000000 in ?? ()
#1  0x00007ffff202e16b in morkNode::SlotStrongNode (me=me@entry=0x0, ev=ev@entry=0x7fff77d3d380, ioSlot=ioSlot@entry=0x7fffc376c980) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:424
#2  0x00007ffff2034623 in morkAtomRowMap::SlotStrongAtomRowMap (ioSlot=0x7fffc376c980, ev=0x7fff77d3d380, me=0x0) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkAtomMap.h:331
#3  morkRowSpace::CloseRowSpace (this=0x7fffc376c800, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkRowSpace.cpp:133
#4  0x00007ffff203468d in morkRowSpace::CloseMorkNode (this=0x7fffc376c800, ev=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkRowSpace.cpp:77
#5  0x00007ffff202e1f4 in morkNode::cut_use_count (this=0x7fffc376c800, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:509
#6  0x00007ffff202e366 in morkNode::CutStrongRef (this=0x7fffc376c800, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:530
#7  0x00007ffff202e5b7 in morkNodeMap::CutAllNodes (this=this@entry=0x7fff8d37b098, ev=ev@entry=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNodeMap.cpp:152
#8  0x00007ffff202e619 in morkNodeMap::CloseNodeMap (this=0x7fff8d37b098, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNodeMap.cpp:73
#9  0x00007ffff202e64d in morkNodeMap::CloseMorkNode (this=this@entry=0x7fff8d37b098, ev=ev@entry=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNodeMap.cpp:45
#10 0x00007ffff2037480 in morkStore::CloseStore (this=0x7fff8d37b000, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:227
#11 0x00007ffff203752d in morkStore::CloseMorkNode (this=0x7fff8d37b000, ev=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:109
#12 0x00007ffff2037567 in morkStore::~morkStore (this=0x7fff8d37b000, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:138
#13 0x00007ffff2037689 in morkStore::~morkStore (this=0x7fff8d37b000, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:150
#14 0x00007ffff202e81e in morkObject::Release (this=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkObject.cpp:35
#15 0x00007ffff2153253 in nsMsgDatabase::~nsMsgDatabase (this=0x7fffafdd5350, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/mailnews/db/msgdb/src/nsMsgDatabase.cpp:1164
#16 0x00007ffff21462b1 in nsImapMailDatabase::~nsImapMailDatabase (this=0x7fffafdd5350, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/mailnews/db/msgdb/src/nsImapMailDatabase.cpp:23
#17 0x00007ffff2148979 in nsMsgDatabase::Release (this=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/mailnews/db/msgdb/src/nsMsgDatabase.cpp:1174
#18 0x00007ffff22c775b in ReleaseObjects (aArray=...) at /build/icedove-SXJ_d3/icedove-45.1.0/mozilla/xpcom/glue/nsCOMArray.cpp:267
#19 0x00007ffff22cc40b in nsCOMArray_base::Clear (this=this@entry=0x7fff775d5248) at /build/icedove-SXJ_d3/icedove-45.1.0/mozilla/xpcom/glue/nsCOMArray.cpp:276


None of the other 4 crashes I sampled have morkAtomRowMap::SlotStrongAtomRowMap as the next frame. Most, like bp-b53ca152-1812-4d6f-b023-7e3612160705, have:
 0 	libxul.so	morkNode::SlotStrongNode	/build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkNode.cpp:424
1 	libxul.so	morkRowCellCursor::CloseRowCellCursor	/build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowObject.h:195
2 	libxul.so	morkRowCellCursor::CloseMorkNode	/build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowCellCursor.cpp:53
3 	libxul.so	morkRowCellCursor::~morkRowCellCursor	/build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowCellCursor.cpp:61
4 	libxul.so	morkRowCellCursor::~morkRowCellCursor	/build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowCellCursor.cpp:63
5 	libxul.so	morkObject::Release	/build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkObject.cpp:35
Crash Signature: [@ morkNode::SlotStrongNode]
Component: Untriaged → Database
Product: Thunderbird → MailNews Core
Version: 45 Branch → 45

Updated

a year ago
Severity: normal → critical
Keywords: crash
Summary: Segv in morkNode::SlotStrongNode → Segv in morkNode::SlotStrongNode - crash

Comment 2

a year ago
There was another backtrace reported.

https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=834392;filename=icedove-gdb-45.2.0-3_2016-08-16_09%3A04%3A34.log;msg=15
Do the users seeing this problem also get a crash when using a mozilla supplied build?
And if so, please post mozilla crash ID.
Flags: needinfo?(c.schoenert)
Flags: needinfo?(agx)

Updated

a year ago
Whiteboard: [closeme 2016-12-01]

Comment 4

a year ago
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #3)
> Do the users seeing this problem also get a crash when using a mozilla
> supplied build?
> And if so, please post mozilla crash ID.

We will probably never get an answer to this question. Some reporters are happy to provide a GDB log with the help of the instructions on the Debian Wiki, but they won't install a package from Mozilla, either because they don't know how or because they simply don't want to.

In the past I have installed a version from Mozilla and also saw a few crashes, though fewer than with a Debian Icedove package; that was quite some time ago, so don't hold me to it. Mozilla is using GCC 4.7.3, while in Debian we use GCC 6.2.0, so I believe the newer GCC is optimizing more aggressively than the old version Mozilla is using. My personal feeling is that most of the crashes happen in the JS garbage collection, but that is just an impression.

Currently we cannot provide more than the GDB logs in the bug reports. The following URL lists various reports, probably a mix of different problems:
https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=id-crash-45.1.0;users=c.schoenert@t-online.de

Regards
Carsten
perhaps this is a manifestation of Bug 1278795?

Comment 6

a year ago
No, bug 1278795 is about morkAtom::AliasYarn() being called with a null atom. This one is different.
As a general statement, it seems questionable to me that it's worth reporting bug upstream if problem cannot be reproduced with mozilla source. Otherwise we have bug reports that simply go unsolved.

Guido states that the problem is reproducible and supplies patches to nspr, so surely he is capable? (but doesn't reply)
Flags: needinfo?(c.schoenert)
Ah, I see Guido is not the original reporter of https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=827267


> none of the other 4 crashes I sampled have morkAtomRowMap::SlotStrongAtomRowMap as the next frame.

I took a good-sized sample again from [1] and came up with nothing: no matching stacks.

Here is an Ubuntu user with somewhat frequent crashes, but again with different stacks:
bp-065feea0-6ff7-4e0a-9d7f-25d5b2161202	2016-12-02 20:02:01	morkNode::SlotStrongNode
bp-4eacf768-2a88-46a4-a1b8-f2aed2161120	2016-11-20 21:26:09	MimeMultipart_parse_eof
bp-3202dcd9-2dab-4196-923f-24d572161120	2016-11-20 21:19:41	MimeMultipart_parse_eof
bp-cde5f036-f4c9-4c63-a322-38fc92161019	2016-10-19 19:36:44	morkNode::SlotStrongNode

[1] https://crash-stats.mozilla.com/signature/?signature=morkNode%3A%3ASlotStrongNode&date=%3E%3D2016-11-04T13%3A05%3A59.000Z&date=%3C2016-12-04T13%3A05%3A59.000Z&_columns=date&_columns=version&_columns=platform&_columns=reason&_columns=address&_columns=user_comments&_sort=-email&_sort=platform&_sort=-date&page=1
Flags: needinfo?(agx)
Summary: Segv in morkNode::SlotStrongNode - crash → Segv in morkNode::SlotStrongNode - crash (debian)
Whiteboard: [closeme 2016-12-01]

Comment 9

a year ago
It seems to me all those crashes are from the same machine. Is that expected?
(In reply to :aceman from comment #9)
> It seems to me all those crashes are from the same machine. Is that expected?

Yes, all from the same user. I searched on that person's email address (it's a restricted capability).
But there are more users than that.

Comment 11

a year ago
hi,

For info, i'm also seeing this on debian stable. I'm collecting backtraces here:

  https://people.debian.org/~piem/icedove/

Let me know if there is anything else I could try.

cheers, piem

Comment 12

a year ago
Jorg, can you check from the stack whether this may be another case of running methods on a this=null object in Mork, as we already had lately? Comment 4 mentions the build being done with gcc 6.
Flags: needinfo?(jorgk)

Comment 13

a year ago
According to the reports this consistently crashes on
/build/thunderbird-C2X7lT/thunderbird-45.4.0+build1/db/mork/src/morkNode.cpp:424
for example:
#1  0x00007ffff202e16b in morkNode::SlotStrongNode (me=me@entry=0x0, ev=ev@entry=0x7fff77d3d380, ioSlot=ioSlot@entry=0x7fffc376c980) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:424

Looking at the code:
morkNode::SlotStrongNode(morkNode* me, morkEnv* ev, morkNode** ioSlot)
[comment deleted]
{
  morkNode* node = *ioSlot;
  if ( me != node )
  {
    if ( node )
    {
      // what if this nulls out the ev and causes asserts?
      // can we move this after the CutStrongRef()?
      *ioSlot = 0;
      node->CutStrongRef(ev); <=== line 424

I can't work out why that crashes. It dereferences 'node', but it tests |if (node)| before doing so.

I'd hope that the compiler takes a copy of the content of '*ioSlot' so that |*ioSlot = 0;| doesn't also set 'node' to null. Hmm. I don't quite understand the cryptic comment, but perhaps it would be better to move |*ioSlot = 0;| after |node->CutStrongRef(ev);|.
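On the copy question: in standard C++, `morkNode* node = *ioSlot;` copies the pointer value into a local variable, so a later `*ioSlot = 0;` cannot change `node`. A minimal sketch of that, using a hypothetical `Node` type rather than the real mork classes (the add-ref of `me` at the end is an assumption made only to complete the example):

```cpp
#include <cassert>

// Hypothetical stand-in for morkNode; not the real mork class.
struct Node {
  int refs = 1;
  void AddStrongRef() { ++refs; }
  void CutStrongRef() { --refs; }
};

// Same shape as the quoted function: replace the strong ref held in *ioSlot.
void SlotStrongNode(Node* me, Node** ioSlot) {
  assert(ioSlot != nullptr);
  Node* node = *ioSlot;      // local copy of the pointer value
  if (me != node) {
    if (node) {
      *ioSlot = nullptr;     // clears the slot only; 'node' keeps its value
      node->CutStrongRef();  // still operates on the old object
    }
    if (me) {
      me->AddStrongRef();    // assumed: the slot takes a ref on the new node
      *ioSlot = me;
    }
  }
}

// Returns true when clearing the slot left the local copy usable.
bool SlotClearLeavesCopyIntact() {
  Node old_node;
  Node* slot = &old_node;
  SlotStrongNode(nullptr, &slot);
  return slot == nullptr && old_node.refs == 0;
}
```

If `SlotClearLeavesCopyIntact()` returns true, the order of the two statements is not what turns `node` into null; that would point towards the object itself already being invalid, which this sketch cannot show.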

Kent, with your sharp eye, can you see anything here?
Flags: needinfo?(jorgk) → needinfo?(rkent)

Comment 14

11 months ago
Normally I would say that this was a case where *ioSlot was a dangling pointer to an object that had already been released. At that point a crash would happen even if *ioSlot (and hence node) was non-zero.

But bp-065feea0-6ff7-4e0a-9d7f-25d5b2161202 from comment 8 contradicts that. Checking that report, I see a crash address of 0x0. My understanding is that this indicates node was truly zero at the point where the crash occurs, which is hard to understand if it was non-zero earlier.

So I do not have a theory for this.

The mork code is extremely old and idiosyncratic in its handling of pointers and XPCOM. It would really, really be good to either replace it or modernize it (heaven forbid!).
Flags: needinfo?(rkent)

Comment 15

9 months ago
Guido, does this crash for you when using a mozilla supplied beta?
http://www.mozilla.org/en-US/thunderbird/channel/
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(agx)
Summary: Segv in morkNode::SlotStrongNode - crash (debian) → Segv in morkNode::SlotStrongNode | morkAtomRowMap::SlotStrongAtomRowMap - crash (debian)

Comment 16

9 months ago
Can you confirm this was on a build (from Debian) built with GCC 6.x? The debian report has discussions as if gcc was not at 6.x yet.

(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #7)
> As a general statement, it seems questionable to me that it's worth
> reporting bug upstream if problem cannot be reproduced with mozilla source.
> Otherwise we have bug reports that simply go unsolved.

I wouldn't underestimate this. We have made changes in mork that removed some dubious code, like 'if (this != nullptr)'. That backfired on us, e.g. in bug 1278795, which showed that with gcc 6+ that code may have had its reason.

So this may be another manifestation of those removals, in another place than what bug 1278795 fixed.
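For context on why such checks interact badly with newer compilers: calling a member function through a null pointer is undefined behaviour in C++, so a compiler may assume `this` is never null and drop an `if (this != nullptr)` test inside the method; the GCC 6 release notes describe this assumption. A minimal sketch of a well-defined alternative, where the null test happens in a static helper before any member call (`Node` and `CutStrongRefOrNull` are hypothetical names, not mork API):

```cpp
#include <cassert>

// Hypothetical stand-in for a ref-counted mork object.
struct Node {
  int refs = 1;

  // Fragile pattern, shown as a comment only: a member function that tests
  //   if (this != nullptr) { ... }
  // depends on undefined behaviour when invoked through a null pointer,
  // and an optimizer is allowed to delete the test.

  void CutStrongRef() {
    assert(refs > 0);
    --refs;
  }

  // Well-defined variant: the pointer is an ordinary argument, so the
  // null check cannot be assumed away.
  static bool CutStrongRefOrNull(Node* node) {
    if (node == nullptr) return false;
    node->CutStrongRef();
    return true;
  }
};
```

Whether any remaining call site in mork still relies on the fragile pattern is not established here; the sketch only illustrates the difference between the two forms.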

IF this can consistently be seen with gcc 6, it is not a distro problem, but will become ours (upstream) when mozilla switches the servers to gcc 6+ (now at 4.9). Then all our official linux builds will be unusable.

Let's observe these reports closely please.

Comment 17

9 months ago
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #15)
> Guido, does this crash for you when using a mozilla supplied beta?
> http://www.mozilla.org/en-US/thunderbird/channel/

I'm answering for Guido, as I'm one of the two people who do most of the packaging for Debian.

As far as I remember, all reporters are complaining only about the Debian packages; installed upstream versions from Mozilla have been working so far. That's why I came to the conclusion that the issue may be related to the use of the relatively recent GCC in Debian unstable/testing. Where possible, we have already cherry-picked upstream changes for null-pointer checks, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1273020, in the latest upload.

Unfortunately we couldn't prepare up-to-date beta versions of Thunderbird because of the de-branding of Icedove to Thunderbird in the experimental release. There was too little time for this, so I can't say much about recent beta versions built in the Debian ecosystem. Right now I'm unable to build a version 52.0 for some time.

I'm not a specialist in working with GDB, but whenever I see unresolved or optimized-out symbols in the log I think of parts we are not really responsible for, e.g. libnspr or libnss. But I could be wrong here. It would be really good to sort out where the fault or faults are: in Thunderbird, in the Mozilla/Firefox codebase, or somewhere else.

I'm quite sure the problem is GCC 6 related, and it will continue when GCC 7 hits the Debian archive. A first rebuild with GCC 7 is failing (of course :-) ).
https://bugs.debian.org/853449

BTW: DebConf this year will be held in Montreal; maybe some of the Thunderbird people will or can be there? It's probably a great opportunity to discuss issues like the one above. Maybe you can give a talk about the future of Thunderbird?

https://debconf17.debconf.org//

Regards
Carsten

Comment 18

9 months ago
(In reply to Carsten Schoenert from comment #17)
> I'm quite sure the problem is GCC 6 related, and it will continue when GCC 7
> hits the Debian archive. A first rebuild with GCC 7 is failing (of
> course :-) ).
> https://bugs.debian.org/853449

Yeah, when m-c enabled warnings-as-errors, this is to be expected when new warnings appear. Anyway, you could try a new build, e.g. the nsEditor.cpp file seems to no longer exist.

Comment 19

6 months ago
My Debian-packaged Thunderbird crashes every 2-3 days on the same line. I tried to debug the problem with gdb, but running p *node results in "value has been optimized out". After recompiling the Debian package (45.8.0-3~deb8u1) with -O1, I got what are hopefully some more details about the problem:

Crash happens here:
http://sources.debian.net/src/icedove/1:45.8.0-3~deb8u1/db/mork/src/morkNode.cpp/#L424

Backtrace (the full backtrace is in the attached file):
> #0  0x0000000000000000 in ?? ()
> No symbol table info available.
> #1  0x00007fa0ee2257ca in morkNode::SlotStrongNode (me=0x7fa08200f978, 
>     me@entry=0x0, ev=ev@entry=0x7fa06d7e8b00, ioSlot=0x7fa07a488978, 
>     ioSlot@entry=0x7fa07a488980)
>     at /tmp/buildd/icedove-45.8.0/db/mork/src/morkNode.cpp:424
>         node = 0x7fa08200f978
> #2  0x00007fa0ee22cbc1 in SlotStrongAtomRowMap (ioSlot=0x7fa07a488980, 
>     ev=0x7fa06d7e8b00, me=0x0)
>     at /tmp/buildd/icedove-45.8.0/db/mork/src/morkAtomMap.h:331
> No locals.
> #3  morkRowSpace::CloseRowSpace (this=0x7fa07a488800, ev=0x7fa06d7e8b00)
>     at /tmp/buildd/icedove-45.8.0/db/mork/src/morkRowSpace.cpp:133
>         cache = 0x7fa07a488980
>         cacheEnd = 0x7fa07a4889c0
>         store = <optimized out>
> #4  0x00007fa0ee22cc3d in morkRowSpace::CloseMorkNode (this=0x7fa08200f978, 
>     ev=0x7fa06d7e8b00)
>     at /tmp/buildd/icedove-45.8.0/db/mork/src/morkRowSpace.cpp:77
> No locals.


Variables:
> (gdb) p ev
> $4 = (morkEnv *) 0x7fa06d7e8b00
> (gdb) p *ev
> $5 = {<morkObject> = {<morkBead> = {<morkNode> = {
>         _vptr.morkNode = 0x7fa0f31c02d0 <vtable for morkEnv+16>, 
>         mNode_Heap = 0x7fa073f47a30, mNode_Base = 20068, 
>         mNode_Derived = 17782, mNode_Access = 111 'o', mNode_Usage = 104 'h', 
>         mNode_Mutable = 85 'U', mNode_Load = 34 '"', mNode_Uses = 1, 
>         mNode_Refs = 1}, mBead_Color = 0}, <nsIMdbObject> = {<nsISupports> = {
>         _vptr.nsISupports = 0x7fa0f31c03e8 <vtable for morkEnv+296>}, <No data fields>}, mObject_Handle = 0x0, mMorkEnv = 0x0, mRefCnt = {
>       static isThreadSafe = false, 
>       mValue = 1}}, <nsIMdbEnv> = {<nsISupports> = {
>       _vptr.nsISupports = 0x7fa0f31c0460 <vtable for morkEnv+416>}, <No data fields>}, mEnv_Factory = 0x7fa0bcec9d40, mEnv_Heap = 0x7fa073f47a30, 
>   mEnv_SelfAsMdbEnv = 0x7fa06d7e8b40, mEnv_ErrorHook = 0x0, mEnv_HandlePool = 
>     0x7fa078441580, mEnv_ErrorCount = 1, mEnv_WarningCount = 0, 
>   mEnv_ErrorCode = NS_ERROR_FAILURE, mEnv_DoTrace = 0 '\000', 
>   mEnv_AutoClear = 85 'U', mEnv_ShouldAbort = 0 '\000', 
>   mEnv_BeVerbose = 0 '\000', mEnv_OwnsHeap = 1 '\001'}
> (gdb) p node
> $6 = (morkNode *) 0x7fa08200f978
> (gdb) p *node
> $7 = {_vptr.morkNode = 0x7fa07a488978, mNode_Heap = 0x7fa08200fd78, 
>   mNode_Base = 58853, mNode_Derived = 58853, mNode_Access = 229 '\345', 
>   mNode_Usage = 229 '\345', mNode_Mutable = 229 '\345', 
>   mNode_Load = 229 '\345', mNode_Uses = 0, mNode_Refs = 0}
> (gdb) p me
> $8 = (morkNode *) 0x7fa08200f978
> (gdb) p *me
> $9 = {_vptr.morkNode = 0x7fa07a488978, mNode_Heap = 0x7fa08200fd78, 
>   mNode_Base = 58853, mNode_Derived = 58853, mNode_Access = 229 '\345', 
>   mNode_Usage = 229 '\345', mNode_Mutable = 229 '\345', 
>   mNode_Load = 229 '\345', mNode_Uses = 0, mNode_Refs = 0}

Running on debian jessie with gcc 4.9.

I am not a C++ pro, but it seems a little strange that "me != node" was true (morkNode.cpp:417).

If more debug info is required, just ask and I'll try to provide it :)
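One observation on the dump above: gdb prints the one-byte `mNode_*` fields of `*node` as 229 and the two-byte fields as 58853, which are 0xe5 and 0xe5e5, i.e. the same byte repeated. A uniform fill like this is what one would expect if the object's memory had been overwritten wholesale, for example by an allocator that poisons freed blocks; that reading is an assumption, not something the log proves. A small sketch to check the arithmetic (`IsRepeatedByte` and `CheckDumpValues` are hypothetical helpers):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when each of the low 'width' bytes of 'value' equals 'fill'.
bool IsRepeatedByte(std::uint64_t value, std::size_t width, std::uint8_t fill) {
  for (std::size_t i = 0; i < width; ++i) {
    if (((value >> (8 * i)) & 0xffu) != fill) return false;
  }
  return true;
}

// Values copied from the gdb output of *node and *ev.
void CheckDumpValues() {
  assert(IsRepeatedByte(229, 1, 0xe5));     // mNode_Access and friends in *node
  assert(IsRepeatedByte(58853, 2, 0xe5));   // mNode_Base / mNode_Derived in *node
  assert(!IsRepeatedByte(20068, 2, 0xe5));  // mNode_Base in *ev, not a fill pattern
}
```

By contrast, the `*ev` fields do not show the pattern, which fits `ev` still being a live object while `node` may not be.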

Comment 20

6 months ago
Created attachment 8877962 [details]
full backtrace

Comment 21

6 months ago
To reproduce: I'm not 100% sure, but it happens when browsing websites from Atom feeds (not really when reading email).