309.26 KB, application/x-bzip2
4.32 KB, application/octet-stream
25 bytes, application/octet-stream
178.34 KB, application/rfc822
25 bytes, application/octet-stream
4.08 KB, application/zip
2.19 KB, patch
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; cs-CZ; rv:22.214.171.124) Gecko/20091105 Fedora/3.5.5-1.fc12 Firefox/3.5.5
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:126.96.36.199) Gecko/20091119 Fedora/3.0-3.11.rc1.fc12 Lightning/1.0pre Thunderbird/3.0
Apparently I've got a corrupted index on the newsgroup gmane.linux.redhat.fedora.devel, and whenever I enter the newsgroup TB crashes. Abrt (the automatic crash reporting tool in Fedora) falls to pieces, however, because it generates a 45+ MB backtrace (approx. 1M lines of text). According to Martin Stránský it is a stack overflow bug. Attached is the bzip2ed backtrace.
Reproducible: Always
Steps to Reproduce: 1. see above
Actual Results: entering the newsgroup leads to a crash
Expected Results: it shouldn't ... whatever corruption happened in the NG index (and it shouldn't get corrupted in the first place), it shouldn't bring TB to its knees.
Matej, can you save the .msf for that newsgroup too? Did you get the backtrace with gdb?
Whilst this may not be the issue, I see that you have enigmail 0.97a installed. Please can you either uninstall that or run in safe mode. There are known issues with enigmail 0.97a that cause crashes or strange effects. You should definitely update it to a latest nightly build of enigmail.
Here's the stack trace in ultra-condensed form:
nsMsgQuickSearchDBView::ListIdsInThreadOrder
nsMsgQuickSearchDBView::ListIdsInThreadOrder
[ etc. ]
parentKey is 124112, then 124141, then 124112, then repeat again. Yet another coder who assumed that threads could never have cycles! :-)
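To make the cycle concrete, here is a minimal sketch of how a parent chain such as 124112 -> 124141 -> 124112 could be detected before recursing. It is not the Thunderbird code: `MsgKey`, `ParentMap` and `HasParentCycle` are hypothetical names, and the map is only a stand-in for whatever the .msf thread table actually stores.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Hypothetical key type; stands in for a message key in the thread table.
using MsgKey = std::uint32_t;

// Hypothetical stand-in for the thread table: message key -> parent key.
// A key with no entry is treated as the thread root.
using ParentMap = std::unordered_map<MsgKey, MsgKey>;

// Returns true if following parent links from `start` revisits any key,
// i.e. the chain never reaches a root.
bool HasParentCycle(const ParentMap& parents, MsgKey start) {
  std::unordered_set<MsgKey> seen;
  MsgKey current = start;
  while (true) {
    if (!seen.insert(current).second) {
      return true;  // this key was already visited: the chain loops
    }
    auto it = parents.find(current);
    if (it == parents.end()) {
      return false;  // no parent recorded: reached a root
    }
    current = it->second;
  }
}
```

With a map holding 124112 -> 124141 and 124141 -> 124112, `HasParentCycle(map, 124112)` reports a cycle, whereas an ordinary chain that ends at a root does not.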
(In reply to comment #2) > Matej can you save the .msf for that newsgroup too ? Yes, I can, but I am afraid it would be useless, because I have reindexed the group already and TB doesn't crash on it anymore. > You got the backtrace with gdb ? yes, this was generated by abrt which uses gdb (and Fedora -debuginfo packages).
Created attachment 413630 [details] gmane.linux.redhat.fedora.devel.msf Fortunately Thunderbird got the same folder corrupted again and it crashes every time I go to this group. This is the *.msf file.
Created attachment 413632 [details] nntp.gmane.org/gmane.linux.redhat.fedora.devel.msf
Is there a public news server I can access this newsgroup from?
(In reply to comment #9)
> Is there a public news server I can access this newsgroup from?
just plain news.gmane.org (see http://dir.gmane.org/gmane.linux.redhat.fedora.devel and http://gmane.org/about.php) and this is overview of all files attached as seen from /home/matej/.thunderbird/vq2fybjd.default/News:
bradford:News$ ls -l */*fedora.devel*
-rw-------. 1 matej matej 182620 20. lis 12.15 news.gmane.org/gmane.linux.redhat.fedora.devel
-rw-r--r--. 1 matej matej 25 13. říj 14.21 news.gmane.org/gmane.linux.redhat.fedora.devel.dat
-rw-rw-r--. 1 matej matej 51295625 20. lis 18.32 news.gmane.org/gmane.linux.redhat.fedora.devel.msf
-rw-r--r--. 1 matej matej 25 19. zář 23.23 nntp.gmane.org/gmane.linux.redhat.fedora.devel.dat
-rw-r--r--. 1 matej matej 11230 19. zář 23.23 nntp.gmane.org/gmane.linux.redhat.fedora.devel.msf
bradford:News$
nominating for blocking, though ride-along is much more likely, if I can find a simple fix.
Drivers don't think this is significant enough to block on, but we'd take a ride-along patch later or possibly something for a dot release.
The attached .msf file doesn't crash for me with a 3.0 build - since you were crashing in quick search code, you must have had a view selected, or done a quick search. Do you know what that might have been? Is Fedora using the about-to-ship 3.0 TB code?
(In reply to comment #15)
> The attached .msf file doesn't crash for me with a 3.0 build - since you were
> crashing in quick search code, you must have had a view selected, or done a
> quick search. Do you know what that might have been?
Threaded view with unread messages
> Is Fedora using the about to ship 3.0 TB code?
Sorry, I don't understand. Yes, this is a Fedora build of TB (available at http://koji.fedoraproject.org/koji/buildinfo?buildID=141955). And now I don't think this is the final TB 3.0 package for Fedora. Of course we will have at least one more build if/when you release it. I guess we may do even one more build for the real RC1.
(In reply to comment #16) > > Is Fedora using the about to ship 3.0 TB code? I just meant how closely are you tracking the comm central 1.9.1 branch. If you're within a day or two, then you have all the fixes. Oh, darn, I think I needed your .newsrc file (or at least the line for this newsgroup) in order to see what you're seeing with view | unread messages.
(In reply to comment #17) > I just meant how closely are you tracking the comm central 1.9.1 branch. If > you're within a day or two, then you have all the fixes. Adding actual maintainer of Thunderbird in Red Hat to the CC list of this bug. > Oh, darn, I think I needed your .newsrc file (or at least the line for this > newsgroup) in order to see what you're seeing with view | unread messages. Unfortunately, I am not at my computer ATM, so will attach when I am back at home.
Created attachment 413810 [details] newsrc files
bradford:~$ locate News/newsrc-
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.cs.felk.cvut.cz
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.eclipse.org
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.felk.cvut.cz
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.gmane.org
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.grc-1.com
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.grc-2.com
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.grc.com
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.mozilla-1.org
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.mozilla.org
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-nntp.gmane.org
/home/matej/.thunderbird/vq2fybjd.default/News/newsrc-post-office.corp.redhat.com
bradford:~$ zip -9rT newsrc.zip $(locate News/newsrc-)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.cs.felk.cvut.cz (deflated 9%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.eclipse.org (deflated 15%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.felk.cvut.cz (deflated 9%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.gmane.org (deflated 59%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.grc-1.com (stored 0%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.grc-2.com (deflated 9%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.grc.com (deflated 35%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.mozilla-1.org (deflated 43%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-news.mozilla.org (deflated 35%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-nntp.gmane.org (deflated 67%)
adding: home/matej/.thunderbird/vq2fybjd.default/News/newsrc-post-office.corp.redhat.com (deflated 35%)
test of newsrc.zip OK
bradford:~$
(In reply to comment #3) > Whilst this may not be the issue, I see that you have enigmail 0.97a installed. > Please can you either uninstall that or run in safe mode. > > There are known issues with enigmail 0.97a that cause crashes or strange > effects. You should definitely update it to a latest nightly build of enigmail. I have uninstalled enigmail and I still get crashes like https://bugzilla.redhat.com/attachment.cgi?id=373277 (from the closed bug https://bugzilla.redhat.com/show_bug.cgi?id=540694).
currently #10 crasher for 3.0, 1.2% of crashes
to see if all arena_malloc_small crashes were this bug, I checked in 2 week period for 3.0b4. All contain nsMsgQuickSearchDBView::ListIdsInThreadOrder. I spot checked some other signatures containing arena_malloc_small, and none contain nsMsgQuickSearchDBView::ListIdsInThreadOrder. (nsMsgQuickSearchDBView::ListIdsInThreadOrder isn't in top 10 frames of any crashes of 3.0b4 and 3.0 in the last two months. frame 12 and higher)
example stack bp-f606d26c-a838-40d9-baf0-caf2b2091118
0 mozcrt19.dll arena_malloc_small objdir-tb/mozilla/memory/jemalloc/src/jemalloc.c:4055
1 mozcrt19.dll malloc objdir-tb/mozilla/memory/jemalloc/src/jemalloc.c:6177
2 mozcrt19.dll operator new objdir-tb/mozilla/memory/jemalloc/src/new.cpp:54
3 thunderbird.exe orkinHeap::Alloc db/mork/src/orkinHeap.cpp:90
4 thunderbird.exe morkNext::MakeNewNext db/mork/src/morkNode.cpp:182
5 thunderbird.exe morkTable::NewTableRowCursor db/mork/src/morkTable.cpp:1540
6 thunderbird.exe morkTable::GetTableRowCursor db/mork/src/morkTable.cpp:458
7 thunderbird.exe nsMsgThread::GetChildHdrAt mailnews/db/msgdb/src/nsMsgThread.cpp:533
8 thunderbird.exe nsMsgThread::GetChildHdrForKey mailnews/db/msgdb/src/nsMsgThread.cpp:1069
9 thunderbird.exe nsMsgThread::GetRootHdr mailnews/db/msgdb/src/nsMsgThread.cpp:956
10 thunderbird.exe nsMsgThreadEnumerator::nsMsgThreadEnumerator mailnews/db/msgdb/src/nsMsgThread.cpp:699
11 thunderbird.exe nsMsgThread::EnumerateMessages mailnews/db/msgdb/src/nsMsgThread.cpp:905
12 thunderbird.exe nsMsgQuickSearchDBView::ListIdsInThreadOrder mailnews/base/src/nsMsgQuickSearchDBView.cpp:641
(repeats)
16176 thunderbird.exe nsMsgQuickSearchDBView::ListIdsInThreadOrder mailnews/base/src/nsMsgQuickSearchDBView.cpp:667
16177 thunderbird.exe nsMsgQuickSearchDBView::ListIdsInThreadOrder mailnews/base/src/nsMsgQuickSearchDBView.cpp:667
16178 thunderbird.exe nsMsgQuickSearchDBView::ListIdsInThreadOrder mailnews/base/src/nsMsgQuickSearchDBView.cpp:680
16179 thunderbird.exe nsMsgQuickSearchDBView::SortThreads mailnews/base/src/nsMsgQuickSearchDBView.cpp:531
16180 thunderbird.exe nsMsgThreadedDBView::Sort mailnews/base/src/nsMsgThreadedDBView.cpp:361
16181 thunderbird.exe nsMsgQuickSearchDBView::OnSearchDone mailnews/base/src/nsMsgQuickSearchDBView.cpp:335
16182 thunderbird.exe nsMsgSearchSession::NotifyListenersDone mailnews/base/search/src/nsMsgSearchSession.cpp:598
currently #33 crash for 3.0 and dropping. rare in nightlies (like only a couple a month) bp-085fc527-d2c3-4462-9dca-c03fb2090924 mentions ... rapidly clicking between newsgroups/messages on a secure server bp-ea7198a5-ebd7-4c63-8438-6e4f12091107 changed the threading in the moz seamonkey NG
Created attachment 416825 [details] [diff] [review] proposed fix this is analogous to what we do in normal threaded views, and should fix the stack overflow. I don't have a reproducible case, however.
(In reply to comment #25) > Created an attachment (id=416825) > > this is analogous to what we do in normal threaded views Normal threaded views compare *pNumListed to numChildren, is there a good reason not to do that (i.e. copy lines 5155-5166 of nsMsgDBView.cpp) here?
Created attachment 417705 [details] [diff] [review] slightly different check This is more like the code in nsMsgDBView.cpp, except that I don't want to blow away the db for this situation. I'd like to try to repair the corruption, but I don't have a test-case to reproduce the bug...
It happened to me today. Crash Id 8eab8e40-0728-4a34-a7e5-3086b2091216 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20091215 Lightning/1.1a1pre Shredder/3.1a1pre ID:20091215062043
Comment on attachment 417705 [details] [diff] [review] slightly different check
>+ // If we discover depths of more than numChildren,
Nit: comment out of date ;-)
>+ // Technically, this is an error, but forcing a database rebuild
>+ // is too destructive so we just return.
>+ if (*pNumListed > numChildren)
In fact, it takes two more children than we were expecting to trigger this. Fortunately this is still using InsertMsgHdrAt rather than SetMsgHdrAt so it doesn't matter yet.
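The kind of guard being reviewed can be illustrated with a small self-contained sketch: a recursive thread listing that gives up once it has listed more entries than the thread holds, instead of recursing until the stack overflows. This is an illustration under assumptions, not the patch itself: `Msg` and `ListInThreadOrder` are hypothetical, and the vector stands in for the real thread and view objects.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified message header: its own key and its parent's key.
struct Msg {
  std::uint32_t key;
  std::uint32_t parent;
};

// Appends the descendants of `parentKey` to `out` in depth-first (thread)
// order. If the parent links are cyclic, the recursion would never end, so
// we stop once more entries have been listed than the thread contains.
// As in the reviewed check, this is treated as "just return", not as a
// reason to rebuild anything.
void ListInThreadOrder(const std::vector<Msg>& thread,
                       std::uint32_t parentKey,
                       std::vector<std::uint32_t>& out) {
  for (const Msg& m : thread) {
    if (m.parent != parentKey) {
      continue;  // not a direct child of parentKey
    }
    if (out.size() > thread.size()) {
      return;  // listed more than the thread holds: corrupt, bail out
    }
    out.push_back(m.key);
    ListInThreadOrder(thread, m.key, out);
  }
}
```

Because the comparison is strict, this sketch can list one entry more than the thread holds before it stops; the exact off-by-N behaviour of the real patch depends on where its counter is incremented, which is not shown here.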
fixed on trunk: changeset: 4543:27b6c6e10fd2
Not blocking on this as it has gone down in the rankings, we'll probably take the patch anyway.
Comment on attachment 417705 [details] [diff] [review] slightly different check a=Standard8
fixed for 3.0.1
(In reply to comment #34) > fixed for 3.0.1 Shouldn't this have landed on the 'default' hg branch (too)?
Yes, I see what you're trying to say - you mean the 1.9.1 branch, not the trunk...thx for catching this.
(in the process of verifying 3.0.1 fixes) I suspect this is not gone so being conservative and reopening. 4 crashes in 3.0.1pre in the past 5 days, and crash rate is too low to say the problem diminished at all after checkin. bp-49a254dc-220b-4abe-9455-7071a2100103 bp-4ea0bd79-d909-4f07-bded-e51e72100103 Aureliano, was this (comment 28) crash reproducible for you?
(In reply to comment #37) > (in the process of verifying 3.0.1 fixes) > I suspect this is not gone so being conservative and reopening. Wayne, as we have tried to fix something that's included in a stable release, can we file a new bug rather than reopening? That way we get to keep track of what's been included in which release. (Ludovic agreed this was the best to do, even if we end up with 10 reports/fixes for one stack).
(In reply to comment #38) > (In reply to comment #37) > > (in the process of verifying 3.0.1 fixes) > > I suspect this is not gone so being conservative and reopening. > > Wayne, as we have tried to fix something that's included in a stable release, > can we file a new bug rather than reopening? That way we get to keep track of > what's been included in which release. (Ludovic agreed this was the best to do, > even if we end up with 10 reports/fixes for one stack). I forgot there was a testcase here. Yes, makes sense to keep this FIXED. pinged reporters of Bug 532093 and bug 536070 in their respective bugs, as unfortunately they were never asked to test 3.0.1pre (somehow we missed that) and they never commented here. If both report their problem is gone then we'll open a new bug. Matej, is there a bug#/link that can be added to the "See also" link? Also, perhaps your reporter of the linux testcase can verify for us that this patch works.
This bug is not fixed since I am seeing frequent crashes even on newly subscribed news groups.
(In reply to comment #42) > This bug is not fixed since I am seeing frequent crashes even on newly > subscribed news groups. What version are you using ?
I am using 3.1b2Pre and have automatic updates enabled so I am always running the latest. My test environment has ~100 NGs configured across ~15 NNTP servers. The crash seems to happen mainly when showing unread items in a threaded NG view. Once this starts I can usually get the NG back in working order by selecting it for offline use, downloading all messages and rebuilding the index.
Still seeing crashes that are reportedly duplicates of 530044, which is marked as fixed. This crash happens with great frequency while browsing or searching NG posts. http://crash-stats.mozilla.com/report/index/bp-05575b27-999f-40d0-ae88-598602100312
David, if you have a reproducible crash on a particular newsgroup, if you send me the .msf file for that newsgroup, along with the newsrc file for the server (assuming you're viewing unread only as your quick search), I can try it out.
I will see what I can do about getting you a file for testing. Unfortunately, the news groups where I have encountered this most frequently are private and contain content under NDA. I have also seen this happen on public news groups and will be sure to send you a repeatable example as soon as possible. One thing I have noticed is that sometimes duplicate posts show in the affected news groups. By duplicate I mean the exact date/time/subject etc. Rebuilding the index often corrects the duplicate entries, which suggests there is a corruption problem. Perhaps the root cause of the crash is an infinite loop caused by this corruption. The crashes also only seem to happen when the NG display is threaded. Thanks, David
Suggest we wait on bienvenu's analysis before we reopen (any bugs) vs creating a new bug. But there are crash reports with email addresses, so perhaps we'd be better served treating arena_malloc_small and malloc | operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**) separately?
Notes:
* bug 536070 malloc | operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)
** reporter is MIA but benb seems to think that case was fixed by the patch in this bug.
** two crashes with email addresses, taylor's and someone at ibm bp-381b1977-3a45-4504-9efb-2aaee2100219
* bug 549105 arena_malloc_small reporter peter says it was fixed for him in 3.0.3 - quite unclear why it went away between v3.0.1 and 3.0.3
* arena_malloc_small i.e. bug 531029 and this bug
** 5 email addresses in crash reports
** perhaps a candidate for reopening
** my sense is v3.0.3 crash rate is same as v3.0.1 and v3.0 but I can't say for sure without spending lots of time on this (longer search period than 4 weeks sure would be nice to have, but I wouldn't give any appendages for it)
*** http://crash-stats.mozilla.com/query/query?product=Thunderbird&version=ALL%3AALL&date=&range_value=4&range_unit=weeks&query_search=signature&query_type=exact&query=malloc+|+operator+new%28unsigned+int%29+|+orkinHeap%3A%3AAlloc%28nsIMdbEnv*%2C+unsigned+int%2C+void**%29&build_id=&process_type=all&do_query=1
*** https://crash-stats.mozilla.com/report/list?product=Thunderbird&build_id=&query_search=signature&query_type=exact&query=arena_malloc_small&date=2%2F15%2F2010&range_value=4&range_unit=weeks&process_type=all&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&signature=arena_malloc_small&missing_sig=&page=1
This bug is still an issue. Crash-stats shows 214 occurrences of that bug during the last 4 weeks (96 of those from TB 3.1.2).
(In reply to comment #49) > This bug is still an issue. Crash-stats shows 214 occurrences of that bug > during the last 4 weeks (96 of those from TB 3.1.2). Given how long this has been fixed for and it was fixed on a branch, please file a new bug as it is easier to track for getting on branches etc.
(In reply to comment #49) > This bug is still an issue. Crash-stats shows 214 occurrences of that bug > during the last 4 weeks (96 of those from TB 3.1.2). Aqualon, do you have a new bug for this?