Closed Bug 1284596 Opened 9 years ago Closed 7 years ago

Segv in morkNode::SlotStrongNode | morkAtomRowMap::SlotStrongAtomRowMap - crash (debian)

Categories

(MailNews Core :: Database, defect)

x86_64
Linux
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: agx, Unassigned)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
Build ID: 20160507231935

Steps to reproduce:
Random, various actions but always same crash. Details at this Debian bugreport: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=827267

Actual results:
Thunderbird crashed. The backtrace is here: https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=827267;filename=gdb.txt;msg=10
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
#0 0x0000001400000000 in ?? ()
#1 0x00007ffff202e16b in morkNode::SlotStrongNode (me=me@entry=0x0, ev=ev@entry=0x7fff77d3d380, ioSlot=ioSlot@entry=0x7fffc376c980) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:424
#2 0x00007ffff2034623 in morkAtomRowMap::SlotStrongAtomRowMap (ioSlot=0x7fffc376c980, ev=0x7fff77d3d380, me=0x0) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkAtomMap.h:331
#3 morkRowSpace::CloseRowSpace (this=0x7fffc376c800, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkRowSpace.cpp:133
#4 0x00007ffff203468d in morkRowSpace::CloseMorkNode (this=0x7fffc376c800, ev=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkRowSpace.cpp:77
#5 0x00007ffff202e1f4 in morkNode::cut_use_count (this=0x7fffc376c800, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:509
#6 0x00007ffff202e366 in morkNode::CutStrongRef (this=0x7fffc376c800, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:530
#7 0x00007ffff202e5b7 in morkNodeMap::CutAllNodes (this=this@entry=0x7fff8d37b098, ev=ev@entry=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNodeMap.cpp:152
#8 0x00007ffff202e619 in morkNodeMap::CloseNodeMap (this=0x7fff8d37b098, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNodeMap.cpp:73
#9 0x00007ffff202e64d in morkNodeMap::CloseMorkNode (this=this@entry=0x7fff8d37b098, ev=ev@entry=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNodeMap.cpp:45
#10 0x00007ffff2037480 in morkStore::CloseStore (this=0x7fff8d37b000, ev=0x7fff77d3d380) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:227
#11 0x00007ffff203752d in morkStore::CloseMorkNode (this=0x7fff8d37b000, ev=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:109
#12 0x00007ffff2037567 in morkStore::~morkStore (this=0x7fff8d37b000, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:138
#13 0x00007ffff2037689 in morkStore::~morkStore (this=0x7fff8d37b000, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkStore.cpp:150
#14 0x00007ffff202e81e in morkObject::Release (this=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkObject.cpp:35
#15 0x00007ffff2153253 in nsMsgDatabase::~nsMsgDatabase (this=0x7fffafdd5350, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/mailnews/db/msgdb/src/nsMsgDatabase.cpp:1164
#16 0x00007ffff21462b1 in nsImapMailDatabase::~nsImapMailDatabase (this=0x7fffafdd5350, __in_chrg=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/mailnews/db/msgdb/src/nsImapMailDatabase.cpp:23
#17 0x00007ffff2148979 in nsMsgDatabase::Release (this=<optimized out>) at /build/icedove-SXJ_d3/icedove-45.1.0/mailnews/db/msgdb/src/nsMsgDatabase.cpp:1174
#18 0x00007ffff22c775b in ReleaseObjects (aArray=...) at /build/icedove-SXJ_d3/icedove-45.1.0/mozilla/xpcom/glue/nsCOMArray.cpp:267
#19 0x00007ffff22cc40b in nsCOMArray_base::Clear (this=this@entry=0x7fff775d5248) at /build/icedove-SXJ_d3/icedove-45.1.0/mozilla/xpcom/glue/nsCOMArray.cpp:276

none of the other 4 crashes I sampled have morkAtomRowMap::SlotStrongAtomRowMap as the next frame.
Most, like bp-b53ca152-1812-4d6f-b023-7e3612160705, have
0 libxul.so morkNode::SlotStrongNode /build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkNode.cpp:424
1 libxul.so morkRowCellCursor::CloseRowCellCursor /build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowObject.h:195
2 libxul.so morkRowCellCursor::CloseMorkNode /build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowCellCursor.cpp:53
3 libxul.so morkRowCellCursor::~morkRowCellCursor /build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowCellCursor.cpp:61
4 libxul.so morkRowCellCursor::~morkRowCellCursor /build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkRowCellCursor.cpp:63
5 libxul.so morkObject::Release /build/thunderbird-Fdwp3q/thunderbird-38.8.0+build1/db/mork/src/morkObject.cpp:35
Crash Signature: [@ morkNode::SlotStrongNode]
Component: Untriaged → Database
Product: Thunderbird → MailNews Core
Version: 45 Branch → 45
Severity: normal → critical
Keywords: crash
Summary: Segv in morkNode::SlotStrongNode → Segv in morkNode::SlotStrongNode - crash
Do the users seeing this problem also get a crash when using a mozilla supplied build? And if so, please post mozilla crash ID.
Flags: needinfo?(c.schoenert)
Flags: needinfo?(agx)
Whiteboard: [closeme 2016-12-01]
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #3)
> Do the users seeing this problem also get a crash when using a mozilla
> supplied build?
> And if so, please post mozilla crash ID.

We will probably never get an answer to this question. Some reporters are happy that they can provide a GDB log with the help of the instructions on the Debian Wiki, but they won't install a package from Mozilla because they don't know how or simply don't want to. In the past I installed a version from Mozilla and also saw a few crashes, though fewer than with a Debian Icedove package; that was quite some time ago, so don't hold me to it.

Mozilla is using GCC 4.7.3; in Debian we use GCC 6.2.0, so I believe the newer GCC optimizes more aggressively than the old version Mozilla is using. My personal feeling is that most of the crashes happen in the JS garbage collection, but that is just an impression.

Currently we cannot provide more than the GDB logs in the bug reports. The following URL holds various reports, probably covering a mix of different problems:
https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=id-crash-45.1.0;users=c.schoenert@t-online.de

Regards
Carsten
perhaps this is a manifestation of Bug 1278795?
No, bug 1278795 is about morkAtom::AliasYarn() being called with a null atom. This one is different.
As a general statement, it seems questionable to me that it's worth reporting a bug upstream if the problem cannot be reproduced with Mozilla source. Otherwise we have bug reports that simply go unsolved. Guido states that the problem is reproducible and supplies patches to nspr, so surely he is capable? (but doesn't reply)
Flags: needinfo?(c.schoenert)
Ah, I see Guido is not the original reporter of https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=827267

> none of the other 4 crashes I sampled have morkAtomRowMap::SlotStrongAtomRowMap as the next frame.

I did a good size sample again from [1] and come up with nothing - no matching stacks.

Here is an ubuntu user with somewhat frequent crashes - but again different stacks
bp-065feea0-6ff7-4e0a-9d7f-25d5b2161202 2016-12-02 20:02:01 morkNode::SlotStrongNode
bp-4eacf768-2a88-46a4-a1b8-f2aed2161120 2016-11-20 21:26:09 MimeMultipart_parse_eof
bp-3202dcd9-2dab-4196-923f-24d572161120 2016-11-20 21:19:41 MimeMultipart_parse_eof
bp-cde5f036-f4c9-4c63-a322-38fc92161019 2016-10-19 19:36:44 morkNode::SlotStrongNode

[1] https://crash-stats.mozilla.com/signature/?signature=morkNode%3A%3ASlotStrongNode&date=%3E%3D2016-11-04T13%3A05%3A59.000Z&date=%3C2016-12-04T13%3A05%3A59.000Z&_columns=date&_columns=version&_columns=platform&_columns=reason&_columns=address&_columns=user_comments&_sort=-email&_sort=platform&_sort=-date&page=1
Flags: needinfo?(agx)
Summary: Segv in morkNode::SlotStrongNode - crash → Segv in morkNode::SlotStrongNode - crash (debian)
Whiteboard: [closeme 2016-12-01]
It seems to me all those crashes are from the same machine. Is that expected?
(In reply to :aceman from comment #9)
> It seems to me all those crashes are from the same machine. Is that expected?

Yes - all from the same user. I searched on that person's email address (it's a restricted capability). But there are more users than that.
Hi, for info, I'm also seeing this on Debian stable. I'm collecting backtraces here: https://people.debian.org/~piem/icedove/ Let me know if there is anything else I could try. Cheers, piem
Jorg, can you check the stack to see whether this may be another case of running methods on a this=null object in Mork, as we already had lately? Comment 4 mentions the build being with gcc 6.
Flags: needinfo?(jorgk)
According to the reports this consistently crashes on
/build/thunderbird-C2X7lT/thunderbird-45.4.0+build1/db/mork/src/morkNode.cpp:424
for example:
#1 0x00007ffff202e16b in morkNode::SlotStrongNode (me=me@entry=0x0, ev=ev@entry=0x7fff77d3d380, ioSlot=ioSlot@entry=0x7fffc376c980) at /build/icedove-SXJ_d3/icedove-45.1.0/db/mork/src/morkNode.cpp:424

Looking at the code:

morkNode::SlotStrongNode(morkNode* me, morkEnv* ev, morkNode** ioSlot)
[comment deleted]
{
  morkNode* node = *ioSlot;
  if ( me != node )
  {
    if ( node )
    {
      // what if this nulls out the ev and causes asserts?
      // can we move this after the CutStrongRef()?
      *ioSlot = 0;
      node->CutStrongRef(ev); <=== line 424

I can't work out why that crashes. It dereferences 'node', but before that it tests |if (node)|. I'd hope that the compiler takes a copy of the content of '*ioSlot' so that |*ioSlot = 0;| doesn't also set 'node' to null. Hmm. I don't quite understand the cryptic comment, but perhaps it would be better to move |*ioSlot = 0;| after |node->CutStrongRef(ev);|.

Kent, with your sharp eye, can you see anything here?
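To make the question about the local copy concrete, here is a minimal sketch of the same control flow. It uses a hypothetical stand-in `Node` type rather than the real morkNode, drops the `ev` argument, and assumes an `if (me)` tail because the snippet quoted above stops at line 424. In standard C++ the local `node` holds its own copy of the pointer value, so clearing `*ioSlot` cannot set `node` to null.

```cpp
#include <cassert>

// Hypothetical stand-in for morkNode: only a strong-reference counter.
struct Node {
    int strongRefs = 1;
    void AddStrongRef() { ++strongRefs; }
    void CutStrongRef() { --strongRefs; }
};

// Same shape as the quoted function: replace the node held in *ioSlot by `me`.
void SlotStrongNode(Node* me, Node** ioSlot) {
    Node* node = *ioSlot;          // local copy of the pointer value
    if (me != node) {
        if (node) {
            *ioSlot = nullptr;     // clears the slot only; `node` is unchanged
            assert(node != nullptr);
            node->CutStrongRef();  // corresponds to line 424 in the report
        }
        if (me) {                  // assumed tail, not shown in the quote
            me->AddStrongRef();
            *ioSlot = me;
        }
    }
}
```

With a valid node in the slot, calling `SlotStrongNode(nullptr, &slot)` clears the slot and drops one reference regardless of the order of the two statements, which suggests that a crash at this line points to the state of the object behind `node` rather than to the assignment itself.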
Flags: needinfo?(jorgk) → needinfo?(rkent)
Normally I would say that this was a case where *ioSlot was a dangling pointer to an object that had already been released. At that point a crash would happen even if *ioSlot (and hence node) was non-zero. But bp-065feea0-6ff7-4e0a-9d7f-25d5b2161202 from comment 8 contradicts that. Checking that report, I see a crash address of 0x0. My understanding is that this indicates node was truly zero at the point where the crash occurs, which is hard to understand if it was non-zero earlier. So I do not have a theory for this. The mork code is extremely old and idiosyncratic in its handling of pointers and XPCOM. It would really, really be good to either replace it or modernize it (heaven forbid!).
Flags: needinfo?(rkent)
Guido, does this crash for you when using a mozilla supplied beta? http://www.mozilla.org/en-US/thunderbird/channel/
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(agx)
Summary: Segv in morkNode::SlotStrongNode - crash (debian) → Segv in morkNode::SlotStrongNode | morkAtomRowMap::SlotStrongAtomRowMap - crash (debian)
Can you confirm this was on a build (from Debian) built with GCC 6.x? The Debian report has discussions as if gcc was not at 6.x yet.

(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #7)
> As a general statement, it seems questionable to me that it's worth
> reporting bug upstream if problem cannot be reproduced with mozilla source.
> Otherwise we have bug reports that simply go unsolved.

I wouldn't underestimate this. We have made changes in mork removing some dubious code, like 'if (this != nullptr)'. That backfired on us, e.g. in bug 1278795, which suggested that on gcc 6+ that code may have had its reason. So this may be another manifestation of those removals, in another place than what bug 1278795 fixed. If this can consistently be seen with gcc 6, it is not a distro problem, but will become ours (upstream) when Mozilla switches the servers to gcc 6+ (now at 4.9). Then all our official Linux builds will be unusable. Let's observe these reports closely please.
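For context on the 'if (this != nullptr)' pattern: calling a non-static member function through a null pointer is undefined behaviour in C++, and the GCC 6 release notes state that value range propagation now assumes `this` is non-null, so such checks can be optimized away. Below is a minimal sketch of a well-defined alternative. The `Atom` type and `SizeOrZero` helper are hypothetical names chosen for illustration and are not mork code.

```cpp
#include <cassert>

// Hypothetical type standing in for any object that callers may hold as null.
struct Atom {
    int size;
    // Fragile pattern, shown only as a comment: a member function such as
    //   int SizeOrZero() const { return this ? size : 0; }
    // relies on being called through a null pointer, which is undefined
    // behaviour, so the compiler may remove the `this` check.
};

// Well-defined alternative: the null check is done on an ordinary pointer
// argument before any member is accessed.
int SizeOrZero(const Atom* atom) {
    return atom ? atom->size : 0;
}
```

The same release notes mention `-fno-delete-null-pointer-checks` as a temporary workaround for code bases that still rely on the old pattern.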
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #15)
> Guido, does this crash for you when using a mozilla supplied beta?
> http://www.mozilla.org/en-US/thunderbird/channel/

I'm answering for Guido, as I'm one of the two people who do most of the packaging for Debian. As far as I remember, all reporters complain only about the Debian packages; upstream versions installed from Mozilla have worked so far. That's why I came to the conclusion that the issue may be related to the use of the relatively recent GCC in Debian unstable/testing. Where possible, we have already cherry-picked upstream changes for null pointer checks, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1273020, in the latest upload.

Unfortunately we couldn't produce up-to-date beta versions of Thunderbird because of the de-branding of Icedove to Thunderbird in the experimental release. There was too little time for this, so I can't say much about recent beta versions built in the Debian ecosystem. I have been unable to build version 52.0 for some time now.

I'm not a specialist in working with GDB, but whenever I see unresolved or optimized-out symbols in the log I think of parts we are not really responsible for, e.g. libnspr or libnss. But I could be wrong here. It would be really good to sort out where the fault or faults lie: in Thunderbird, in the Mozilla/Firefox codebase, or somewhere else.

I'm quite sure the problem is GCC 6 related and will continue when GCC 7 hits the Debian archive. A first rebuild with GCC 7 is failing (of course :-) ).
https://bugs.debian.org/853449

BTW: DebConf will be held in Montreal this year; maybe some of the Thunderbird people will or can be there? It's probably a great opportunity to discuss issues like the above. Maybe you can give a talk about the future of Thunderbird?
https://debconf17.debconf.org//

Regards
Carsten
(In reply to Carsten Schoenert from comment #17)
> I'm quite sure the problem is GCC6 related, and will be continuing if GCC7
> is hitting the Debian archive. A first rebuild with GCC7 i failing (of
> course :-) ).
> https://bugs.debian.org/853449

Yeah, when m-c enabled warnings-as-errors, this is to be expected when new warnings appear. Anyway, you could try a new build, e.g. the nsEditor.cpp file seems to no longer exist.
My Debian-packaged Thunderbird crashes every 2-3 days on the same line... I tried to debug the problem with gdb, but running p *node results in "value has been optimized out". After recompiling the Debian package (45.8.0-3~deb8u1) with -O1, I hopefully got some more details about the problem.

Crash happens here: http://sources.debian.net/src/icedove/1:45.8.0-3~deb8u1/db/mork/src/morkNode.cpp/#L424

Backtrace (the full backtrace can be found at the top, in the attached file):
> #0 0x0000000000000000 in ?? ()
> No symbol table info available.
> #1 0x00007fa0ee2257ca in morkNode::SlotStrongNode (me=0x7fa08200f978,
> me@entry=0x0, ev=ev@entry=0x7fa06d7e8b00, ioSlot=0x7fa07a488978,
> ioSlot@entry=0x7fa07a488980)
> at /tmp/buildd/icedove-45.8.0/db/mork/src/morkNode.cpp:424
> node = 0x7fa08200f978
> #2 0x00007fa0ee22cbc1 in SlotStrongAtomRowMap (ioSlot=0x7fa07a488980,
> ev=0x7fa06d7e8b00, me=0x0)
> at /tmp/buildd/icedove-45.8.0/db/mork/src/morkAtomMap.h:331
> No locals.
> #3 morkRowSpace::CloseRowSpace (this=0x7fa07a488800, ev=0x7fa06d7e8b00)
> at /tmp/buildd/icedove-45.8.0/db/mork/src/morkRowSpace.cpp:133
> cache = 0x7fa07a488980
> cacheEnd = 0x7fa07a4889c0
> store = <optimized out>
> #4 0x00007fa0ee22cc3d in morkRowSpace::CloseMorkNode (this=0x7fa08200f978,
> ev=0x7fa06d7e8b00)
> at /tmp/buildd/icedove-45.8.0/db/mork/src/morkRowSpace.cpp:77
> No locals.

Variables:
> (gdb) p ev
> $4 = (morkEnv *) 0x7fa06d7e8b00
> (gdb) p *ev
> $5 = {<morkObject> = {<morkBead> = {<morkNode> = {
> _vptr.morkNode = 0x7fa0f31c02d0 <vtable for morkEnv+16>,
> mNode_Heap = 0x7fa073f47a30, mNode_Base = 20068,
> mNode_Derived = 17782, mNode_Access = 111 'o', mNode_Usage = 104 'h',
> mNode_Mutable = 85 'U', mNode_Load = 34 '"', mNode_Uses = 1,
> mNode_Refs = 1}, mBead_Color = 0}, <nsIMdbObject> = {<nsISupports> = {
> _vptr.nsISupports = 0x7fa0f31c03e8 <vtable for morkEnv+296>}, <No data fields>}, mObject_Handle = 0x0, mMorkEnv = 0x0, mRefCnt = {
> static isThreadSafe = false,
> mValue = 1}}, <nsIMdbEnv> = {<nsISupports> = {
> _vptr.nsISupports = 0x7fa0f31c0460 <vtable for morkEnv+416>}, <No data fields>}, mEnv_Factory = 0x7fa0bcec9d40, mEnv_Heap = 0x7fa073f47a30,
> mEnv_SelfAsMdbEnv = 0x7fa06d7e8b40, mEnv_ErrorHook = 0x0, mEnv_HandlePool =
> 0x7fa078441580, mEnv_ErrorCount = 1, mEnv_WarningCount = 0,
> mEnv_ErrorCode = NS_ERROR_FAILURE, mEnv_DoTrace = 0 '\000',
> mEnv_AutoClear = 85 'U', mEnv_ShouldAbort = 0 '\000',
> mEnv_BeVerbose = 0 '\000', mEnv_OwnsHeap = 1 '\001'}
> (gdb) p node
> $6 = (morkNode *) 0x7fa08200f978
> (gdb) p *node
> $7 = {_vptr.morkNode = 0x7fa07a488978, mNode_Heap = 0x7fa08200fd78,
> mNode_Base = 58853, mNode_Derived = 58853, mNode_Access = 229 '\345',
> mNode_Usage = 229 '\345', mNode_Mutable = 229 '\345',
> mNode_Load = 229 '\345', mNode_Uses = 0, mNode_Refs = 0}
> (gdb) p me
> $8 = (morkNode *) 0x7fa08200f978
> (gdb) p *me
> $9 = {_vptr.morkNode = 0x7fa07a488978, mNode_Heap = 0x7fa08200fd78,
> mNode_Base = 58853, mNode_Derived = 58853, mNode_Access = 229 '\345',
> mNode_Usage = 229 '\345', mNode_Mutable = 229 '\345',
> mNode_Load = 229 '\345', mNode_Uses = 0, mNode_Refs = 0}

Running on Debian jessie with gcc 4.9. I am not a C++ pro, but it seems a little strange that "me != node" was true (morkNode.cpp:417). If more debug info is required, just ask and I'll try to provide it :)
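One detail in the `p *node` output may be worth decoding: 58853 is 0xE5E5 and 229 is 0xE5, so several fields of the node consist entirely of the byte 0xE5. The sketch below only checks that arithmetic. Reading the pattern as freed, poisoned memory is an assumption: it would be consistent with an allocator that fills freed memory with 0xE5, as mozjemalloc does, but the report does not say which allocator this Debian build used. The helper name `AllBytesAre` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when every byte of `value` equals `fill`.
template <typename T>
bool AllBytesAre(T value, std::uint8_t fill) {
    for (unsigned i = 0; i < sizeof(T); ++i) {
        // Shift the i-th byte into the low position and truncate to 8 bits.
        if (static_cast<std::uint8_t>(value >> (8 * i)) != fill) {
            return false;
        }
    }
    return true;
}
```

For example, `AllBytesAre<std::uint16_t>(58853, 0xE5)` holds for the `mNode_Base` value printed for `node`, while the `mNode_Base` value of `ev` (20068) does not match. If the poisoning assumption applies, this would fit the earlier dangling-pointer hypothesis for the slot contents.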
Attached file full backtrace
Reproduce: I'm not 100% sure, but it happens when browsing websites from Atom feeds (not really when reading email).
Sascha, does the crash go away if you disable gcc7 compiler optimization?
Flags: needinfo?(agx) → needinfo?(mozilla.dev)
Wayne, the crash has disappeared since some version / Debian package upgrade. Everything is fine with the current version in Debian stretch (52.8). The bug can be closed from my side.
Flags: needinfo?(mozilla.dev)
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME