Closed
Bug 857678
Opened 12 years ago
Closed 11 years ago
crash in nsMsgDatabase::RowCellColumnToCollationKey
Categories
(MailNews Core :: Database, defect)
Tracking
(thunderbird24+ fixed, thunderbird25 fixed, seamonkey2.19 fixed, seamonkey2.20 fixed, seamonkey2.21 fixed)
RESOLVED
FIXED
Thunderbird 26.0
People
(Reporter: wsmwk, Assigned: mkmelin)
References
Details
(Keywords: crash, regression, topcrash, Whiteboard: [tbird betatopcrash][regression:TB22])
Crash Data
Attachments
(4 files, 2 obsolete files)
2.16 KB,
patch
|
jcranmer
:
review+
InvisibleSmiley
:
feedback-
|
Details | Diff | Splinter Review |
629 bytes,
patch
|
jcranmer
:
review+
standard8
:
approval-comm-aurora+
standard8
:
approval-comm-beta+
standard8
:
approval-comm-release+
|
Details | Diff | Splinter Review |
14.18 KB,
patch
|
neil
:
review-
InvisibleSmiley
:
feedback+
|
Details | Diff | Splinter Review |
1009 bytes,
patch
|
neil
:
review+
standard8
:
approval-comm-aurora+
standard8
:
approval-comm-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-ff9b71a1-3473-432e-892c-42b6a2130323 .
=============================================================
unknown reporter "Crashing when accessing a newsgroup."
This bug was filed from the Socorro interface and is
report bp-ff9b71a1-3473-432e-892c-42b6a2130323 .
=============================================================
0 libxul.so nsMsgDatabase::RowCellColumnToCollationKey objdir-tb/mozilla/dist/include/nsCharTraits.h:514
1 libxul.so morkBookAtom::EqualFormAndBody const db/mork/src/morkAtom.cpp:450
2 libxul.so morkMap::find const db/mork/src/morkMap.cpp:268
3 libxul.so morkMap::Get db/mork/src/morkMap.cpp:698
4 libxul.so nsMsgDBView::GetCollationKey mailnews/base/src/nsMsgDBView.cpp:4067
5 libxul.so nsMsgDatabase::YarnTonsString objdir-tb/mozilla/dist/include/nsTSubstring.h:503
6 libxul.so nsMsgDatabase::RowCellColumnTonsString mailnews/db/msgdb/src/nsMsgDatabase.cpp:3483
7 libxul.so nsMsgDatabase::GetPropertyAsNSString mailnews/db/msgdb/src/nsMsgDatabase.cpp:3933
8 libxul.so nsMsgDBView::GetCurCustomColumn objdir-tb/mozilla/dist/include/nsCOMPtr.h:453
9 libxul.so nsMsgDBView::GetColumnHandler objdir-tb/mozilla/dist/include/nsTSubstring.h:85
10 libxul.so nsMsgDBView::GetCurColumnHandlerFromDBInfo objdir-tb/mozilla/dist/include/nsTSubstring.h:85
11 libxul.so nsMsgDatabase::GetMsgHdrForKey mailnews/db/msgdb/src/nsMsgDatabase.cpp:1813
Reporter | ||
Comment 1•12 years ago
|
||
perhaps regression from bug 834757.
first crash 2013-03-21 bp-a15d794c-ae78-4fea-8974-b2b262130321 22.0a1
0 msvcr100.dll zzz_AsmCodeRange_Begin f:\dd\vctools\crt_bld\SELF_X86\crt\src\INTEL\strlen.asm:81
1 xul.dll nsMsgDatabase::RowCellColumnToCollationKey mailnews/db/msgdb/src/nsMsgDatabase.cpp:3645
2 xul.dll nsMsgHdr::GetSubjectCollationKey mailnews/db/msgdb/src/nsMsgHdr.cpp:694
3 xul.dll nsMsgDBView::GetCollationKey mailnews/base/src/nsMsgDBView.cpp:4067
4 xul.dll nsMsgDBView::GetIndexForThread mailnews/base/src/nsMsgDBView.cpp:5072
5 xul.dll nsMsgSearchDBView::AddHdrFromFolder mailnews/base/src/nsMsgSearchDBView.cpp:444
6 xul.dll nsMsgSearchDBView::InsertHdrFromFolder mailnews/base/src/nsMsgSearchDBView.cpp:658
7 xul.dll nsMsgXFVirtualFolderDBView::OnSearchHit mailnews/base/src/nsMsgXFVirtualFolderDBView.cpp:281
Pidgeot18@12125 3643 err = m_mimeConverter->DecodeMimeHeaderToUTF8(
Pidgeot18@12125 3644 nsDependentCString(nakedString), charSet.get(), characterSetOverride,
Pidgeot18@12125 3645 true, decodedStr);
Assignee: nobody → Pidgeot18
Blocks: 834757
Crash Signature: [@ nsMsgDatabase::RowCellColumnToCollationKey(nsIMdbRow*, unsigned int, unsigned char**, unsigned int*)] → [@ nsMsgDatabase::RowCellColumnToCollationKey(nsIMdbRow*, unsigned int, unsigned char**, unsigned int*)]
[@ zzz_AsmCodeRange_Begin | nsMsgDatabase::RowCellColumnToCollationKey(nsIMdbRow*, unsigned int, unsigned char**, unsigned int*)]
Keywords: regression,
topcrash
OS: Linux → All
Comment 2•12 years ago
|
||
I'm going to guess that nakedString here is null, and it looks like nsDependentCString constructors aren't happy about getting NULL pointers as a value.
Comment 3•12 years ago
|
||
(In reply to Joshua Cranmer [:jcranmer] from comment #2)
> I'm going to guess that nakedString here is null, and it looks like
> nsDependentCString constructors aren't happy about getting NULL pointers as
> a value.
Yep, that's a definite
tracking-thunderbird22:
--- → +
Reporter | ||
Comment 4•12 years ago
|
||
datapoint - a few dozen individuals are crashing in beta in roughly one week of usage, which is a significant rate for our betas. not a plague, but significant. I'm still assessing impact but for now I'd say it would be best to have patched for TB23 beta lest it potentially hinder wider testing.
Is a testcase needed?
Flags: needinfo?(Pidgeot18)
Reporter | ||
Comment 5•12 years ago
|
||
bp-655f20b4-4f04-41a4-9a19-fe7282130611 rkrug is a frequent crasher
Whiteboard: [startupcrash][regression:TB22]
Assignee | ||
Comment 6•12 years ago
|
||
RowCellColumnToConstCharPtr shouldn't return NS_OK when it doesn't set the outparam.
Assignee: Pidgeot18 → mkmelin+mozilla
Status: NEW → ASSIGNED
Comment 7•12 years ago
|
||
I don't like this patch, since it makes things throw errors that didn't use to happen.
I'd prefer a more minimal fix, which I haven't had time to prepare yet.
Flags: needinfo?(Pidgeot18)
Assignee | ||
Comment 8•12 years ago
|
||
Well, it's certainly against conventions to allow ok when you pass out null - if you can't trust that you end up null-checking everything. Now would give it some time to bake before the ESR, and if there's problems we can always back it out and just null-check nakedString instead, but there might be other cases the current patch also fixes.
Updated•12 years ago
|
tracking-thunderbird22:
+ → ---
tracking-thunderbird24:
--- → +
So any other potential reviewers?
Flags: needinfo?(mozilla)
Flags: needinfo?(mbanner)
Comment 10•12 years ago
|
||
Mozilla/5.0 (Windows NT 5.1; rv:23.0) Gecko/20100101 Firefox/23.0 SeaMonkey/2.20a2
Build identifier: 20130624013001
I was crashing with this bug with this repro:
1. I created view and saved it as folder.
2. When saved view show several e-mails, I moved them all to another folder.
3. In this moment saved view was still showing same count of unread e-mails.
4. When I click next time on saved view name in Folder Pane, SeaMonkey instantly
crashed.
I didn't try this reproduction steps on another "physical" folder.
Relevant crash report:
https://crash-stats.mozilla.com/report/index/bp-b2344e3a-a798-46ce-abfb-6f6652130625
Comment 11•12 years ago
|
||
I experience this crash 100% each time I click one of my search folders (which happens to be a "Status is New" search across multiple folders of multiple accounts) in SM 2.19b1. Is there any workaround (other than not clicking the search folder of course)? I already tried compacting all folders.
![]() |
||
Comment 12•12 years ago
|
||
The line in question (from the last crash report) was specifically changed by jcranmer in this patch http://hg.mozilla.org/releases/comm-aurora/diff/93e89494abfd/mailnews/db/msgdb/src/nsMsgDatabase.cpp (not sure which bug). Maybe some of the strings is null and it is not handled properly.
Assignee | ||
Comment 13•12 years ago
|
||
I think it's pretty sure it's nakedString being null... (The check that it's set was bitten by the broken conventions of RowCellColumnToConstCharPtr)
Comment 14•12 years ago
|
||
How about this for a workaround?
Attachment #768210 -
Flags: review?(Pidgeot18)
Comment 15•12 years ago
|
||
Looks like we now need feedback from Joshua.
Flags: needinfo?(mozilla)
Flags: needinfo?(mbanner)
Comment 16•12 years ago
|
||
joshua, for reference, urgency here is a SeaMonkey 2.19 final release due out on Tuesday, July 2. So if we can eliminate this fairly common/large crasher from our users, we'll be very happy. We just need a correctness/risk assessment by someone who knows this code in order for us to land it.
Updated•12 years ago
|
Attachment #768210 -
Flags: review?(Pidgeot18) → review+
Comment 17•12 years ago
|
||
Comment on attachment 768210 [details] [diff] [review]
Possible patch
a=me for comm-release (a SeaMonkey-only branch)
Comment 18•12 years ago
|
||
Comment on attachment 768210 [details] [diff] [review]
Possible patch
[Approval Request Comment]
Regression caused by (bug #): 834757
User impact if declined: crash
Risk to taking this patch (and alternatives if risky): low
Attachment #768210 -
Flags: approval-comm-release?
Attachment #768210 -
Flags: approval-comm-beta?
Attachment #768210 -
Flags: approval-comm-aurora?
Comment 19•12 years ago
|
||
http://hg.mozilla.org/comm-central/rev/c95e6993f4df
Dunno whether this counts as fixed or whether someone wants to write some better code.
Assignee | ||
Comment 20•12 years ago
|
||
Comment on attachment 762812 [details] [diff] [review]
proposed fix
I'd still like this to go in (for trunk). Passing to Joshua for a verdict, seems i never set the review flag for this.
Attachment #762812 -
Flags: review?(Pidgeot18)
Updated•12 years ago
|
Attachment #768210 -
Flags: approval-comm-release?
Attachment #768210 -
Flags: approval-comm-release+
Attachment #768210 -
Flags: approval-comm-beta?
Attachment #768210 -
Flags: approval-comm-beta+
Attachment #768210 -
Flags: approval-comm-aurora?
Attachment #768210 -
Flags: approval-comm-aurora+
Comment 21•12 years ago
|
||
https://hg.mozilla.org/releases/comm-aurora/rev/953ec24f5192
https://hg.mozilla.org/releases/comm-beta/rev/9e2d19b4883f
https://hg.mozilla.org/releases/comm-release/rev/586520179d8b
status-seamonkey2.19:
--- → fixed
status-seamonkey2.20:
--- → fixed
status-seamonkey2.21:
--- → fixed
Comment 22•12 years ago
|
||
I've noticed non-null-terminated string in nsParseMailbox.cpp might be the root cause. See bug 877831 comment 13.
Comment 23•12 years ago
|
||
Comment on attachment 762812 [details] [diff] [review]
proposed fix
The ramifications of this make me nervous, but I don't see anybody who would be negatively impacted by this as far as I could look...
Attachment #762812 -
Flags: review?(Pidgeot18) → review+
Comment 24•12 years ago
|
||
I'm still seeing this with SM 2.20b1:
https://crash-stats.mozilla.com/report/index/1e94edaa-e1d1-4220-b361-529cb2130712
![]() |
||
Comment 25•12 years ago
|
||
So cay you try the patch?
Comment 26•12 years ago
|
||
(In reply to :aceman from comment #25)
> So cay you try the patch?
You mean the one from comment 6? Only if someone provided me an SM build (ideally built from comm-beta code) containing the proposed fix within the next 5 hours. After that I'll be AFK for a week. When I'm back I'll be able to build/test myself again.
Comment 27•12 years ago
|
||
Here is another fix for this crash.
I am thinking the root cause of this crash is come from nsParseMailMessageState.cpp because nsParseMailMessageState passes non-null-terminated string to nsMsgHdr (then, to nsMsgDatabase).
This patch can be applied after the fix for bug 877831 is applied.
Attachment #776096 -
Flags: review?(mbanner)
Comment 28•12 years ago
|
||
Comment on attachment 762812 [details] [diff] [review]
proposed fix
This patch does not fix the crash I'm experiencing (just compiled two builds off comm-beta tip yesterday, one vanilla and one with just this patch). Will try to create a build with the other patch later today.
Attachment #762812 -
Flags: feedback-
Comment 29•12 years ago
|
||
I don't actually know what will happen once non-null-terminated string has stored in the db.
But anyway this patch might be solve the crash in case of NULL string.
Attachment #780258 -
Flags: review?(mbanner)
Comment 30•12 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #29)
> But anyway this patch might be solve the crash in case of NULL string.
this patch *will* solve, I meant.
Comment 31•12 years ago
|
||
I am sorry the previous patch has garbage.
Attachment #780258 -
Attachment is obsolete: true
Attachment #780258 -
Flags: review?(mbanner)
Attachment #780261 -
Flags: review?(mbanner)
Comment 32•12 years ago
|
||
Comment on attachment 776096 [details] [diff] [review]
Another possible fix
This fixed the crash for me. :-) I built from the same c-b/m-b changesets as with my previous try, but now with only the following applied:
* attachment 766955 [details] [diff] [review] from bug 877831 (adapted/applied manually)
* attachment 776096 [details] [diff] [review] (this patch)
* attachment 780261 [details] [diff] [review] (Add null check barrier)
After I had run this build and opened the search folder which triggered the crash with other builds, I can now open the affected search folder with the other builds, too, so for the time being I won't be able to reproduce the crash anymore.
Attachment #776096 -
Flags: feedback+
Comment 33•12 years ago
|
||
It's currently #1 top crasher in TB 23.0b1 that contains the possible patch fix.
Reporter | ||
Comment 34•12 years ago
|
||
(In reply to Scoobidiver from comment #33)
> It's currently #1 top crasher in TB 23.0b1 that contains the possible patch
> fix.
note those numbers are inflated due to 40-50% duplicate crash reports.
ditto version 24 https://crash-stats.mozilla.com/report/list?product=Thunderbird&version=Thunderbird:24.0a2&query_search=signature&query_type=contains&reason_type=contains&date=2013-08-08&range_value=14&range_unit=days&hang_type=any&process_type=any&signature=zzz_AsmCodeRange_Begin+|+nsMsgDatabase%3A%3ARowCellColumnToCollationKey%28nsIMdbRow*%2C+unsigned+int%2C+unsigned+char**%2C+unsigned+int*%29
Comment 35•11 years ago
|
||
Only 17% of crashes happen within one minute.
Whiteboard: [startupcrash][regression:TB22] → [tbird betatopcrash][regression:TB22]
Reporter | ||
Comment 36•11 years ago
|
||
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #32)
> Comment on attachment 776096 [details] [diff] [review]
> Another possible fix
>
> This fixed the crash for me. :-) I built from the same c-b/m-b changesets as
> with my previous try, but now with only the following applied:
> * attachment 766955 [details] [diff] [review] from bug 877831
> (adapted/applied manually)
> * attachment 776096 [details] [diff] [review] (this patch)
> * attachment 780261 [details] [diff] [review] (Add null check barrier)
>
> After I had run this build and opened the search folder which triggered the
> crash with other builds, I can now open the affected search folder with the
> other builds, too, so for the time being I won't be able to reproduce the
> crash anymore.
nice testing
(Mark sure has been stretched thin)
Sounds like we should get these landed for trunk testing so they have potential to get in TB24.
Flags: needinfo?(mbanner)
Comment 37•11 years ago
|
||
Comment on attachment 776096 [details] [diff] [review]
Another possible fix
I suggest Neil should have a look at these, as I'm going to be afk for a few days.
Attachment #776096 -
Flags: review?(mbanner) → review?(neil)
Updated•11 years ago
|
Attachment #780261 -
Flags: review?(mbanner) → review?(neil)
Comment 38•11 years ago
|
||
Comment on attachment 780261 [details] [diff] [review]
Add null check barrier
># HG changeset patch
># Parent 2827cd29313bb7118826c9a39a5613cd2ebe3596
># User Hiroyuki Ikezoe <hiikezoe@mozilla-japan.org>
>Bug 857678 - Add null check barrier in nsMsgDatabase::RowCellColumnToCollationKey
>
>diff --git a/mailnews/db/msgdb/src/nsMsgDatabase.cpp b/mailnews/db/msgdb/src/nsMsgDatabase.cpp
>--- a/mailnews/db/msgdb/src/nsMsgDatabase.cpp
>+++ b/mailnews/db/msgdb/src/nsMsgDatabase.cpp
>@@ -3750,17 +3750,17 @@ nsresult nsMsgDatabase::GetCollationKeyG
> }
>
> nsresult nsMsgDatabase::RowCellColumnToCollationKey(nsIMdbRow *row, mdb_token columnToken, uint8_t **result, uint32_t *len)
> {
> const char *nakedString = "";
> nsresult err;
>
> err = RowCellColumnToConstCharPtr(row, columnToken, &nakedString);
>- if (NS_SUCCEEDED(err))
>+ if (NS_SUCCEEDED(err) && nakedString)
I see from YarnTonsCString that mYarn_Buf could be null, but in that case I'd prefer to back out my original patch and replace null with "" after fetching the pointer.
Attachment #780261 -
Flags: review?(neil) → review-
Comment 39•11 years ago
|
||
Comment on attachment 776096 [details] [diff] [review]
Another possible fix
>- m_newMsgHdr->SetStringProperty("replyTo", replyTo->value);
>+ m_newMsgHdr->SetStringProperty("replyTo", nsAutoCString(replyTo->value, replyTo->length).get());
What are you hoping to achieve by this? If you're trying to avoid writing nulls then I guess it's too late for that now, there are people with existing summary files with null values in them.
Comment 40•11 years ago
|
||
(In reply to neil@parkwaycc.co.uk from comment #39)
> Comment on attachment 776096 [details] [diff] [review]
> Another possible fix
>
> >- m_newMsgHdr->SetStringProperty("replyTo", replyTo->value);
> >+ m_newMsgHdr->SetStringProperty("replyTo", nsAutoCString(replyTo->value, replyTo->length).get());
> What are you hoping to achieve by this? If you're trying to avoid writing
> nulls then I guess it's too late for that now, there are people with
> existing summary files with null values in them.
The purpose is to make the string passing to SetStringProperty null-terminated.
Comment 41•11 years ago
|
||
Comment on attachment 776096 [details] [diff] [review]
Another possible fix
I don't think this is the right approach, FinializeHeaders (and InternSubject which it calls) should be able to assume that the headers are null-terminated, and wallpapering over unterminated headers isn't a reasonable solution.
Attachment #776096 -
Flags: review?(neil) → review-
Comment 42•11 years ago
|
||
(In reply to neil@parkwaycc.co.uk from comment #41)
> Comment on attachment 776096 [details] [diff] [review]
> Another possible fix
>
> I don't think this is the right approach, FinializeHeaders (and
> InternSubject which it calls) should be able to assume that the headers are
> null-terminated, and wallpapering over unterminated headers isn't a
> reasonable solution.
Then we need to change ParseHeaders(), right? I do not think that such changes are good in the meantime.
Comment 43•11 years ago
|
||
Addressed the comment.
Attachment #780261 -
Attachment is obsolete: true
Attachment #792686 -
Flags: review?(neil)
Comment 44•11 years ago
|
||
Comment on attachment 792686 [details] [diff] [review]
Add null check barrier v2
This is OK but I would move the nakedString check to nearer where it is used because that would make it clearer that we need it because nsDependentCString doesn't accept null pointers.
Attachment #792686 -
Flags: review?(neil) → review+
Comment 45•11 years ago
|
||
(In reply to Hiroyuki Ikezoe from comment #42)
> (In reply to comment #41)
> > I don't think this is the right approach, FinializeHeaders (and
> > InternSubject which it calls) should be able to assume that the headers are
> > null-terminated, and wallpapering over unterminated headers isn't a
> > reasonable solution.
>
> Then we need to change ParseHeaders(), right? I do not think that such
> changes are good in the meantime.
If ParseHeaders has any bugs then they should be fixed. I would consider generating unterminated headers a bug.
Comment 46•11 years ago
|
||
I do not object to fix ParseHeaders in the future.
My thought of priority is that:
1) stop the crash
2) stop storing non null-terminated strings in db to avoid data corruption in the db
3) refactor ParseHeaders and related functions (and write more unit tests)
Assignee | ||
Comment 47•11 years ago
|
||
Comment on attachment 762812 [details] [diff] [review]
proposed fix
https://hg.mozilla.org/comm-central/rev/339bc9eac2fc
Reporter | ||
Comment 48•11 years ago
|
||
(In reply to Magnus Melin from comment #47)
> Comment on attachment 762812 [details] [diff] [review]
> proposed fix
>
> https://hg.mozilla.org/comm-central/rev/339bc9eac2fc
I see no obvious crash reports related to this checkin.
But 26.0a1 has an odd crash, most with mork on the stack, which does not coincide with any checkin from this bug, so I am mentioning it now so it's confusing in future
strlen | nsMsgDatabase::RowCellColumnToCollationKey(nsIMdbRow*, unsigned int, unsigned char**, unsigned int*)
first crash 20130802030203
last crash 20130823064649
https://crash-stats.mozilla.com/report/list?signature=strlen+|+nsMsgDatabase%3A%3ARowCellColumnToCollationKey%28nsIMdbRow*%2C+unsigned+int%2C+unsigned+char**%2C+unsigned+int*%29&product=Thunderbird&query_type=is_exactly&range_unit=weeks&process_type=any&hang_type=any&date=2013-08-28+16%3A00%3A00&range_value=4
Flags: needinfo?(mbanner)
Comment 49•11 years ago
|
||
Comment on attachment 792686 [details] [diff] [review]
Add null check barrier v2
[Triage Comment]
As trunk is a bit broken, but I wanted to land this for the upcoming beta, I've just pushed this to aurora and beta. As a reminder, we need to fix the review comment before this lands on trunk.
https://hg.mozilla.org/releases/comm-aurora/rev/50e64aa0beb7
https://hg.mozilla.org/releases/comm-beta/rev/bccdca1a2011
Attachment #792686 -
Flags: approval-comm-beta+
Attachment #792686 -
Flags: approval-comm-aurora+
Comment 50•11 years ago
|
||
I've just pushed this so that we can get this bug closed:
https://hg.mozilla.org/comm-central/rev/54fc7ea4e21d
If we want any more patches for or related to this bug, please do them as new bugs so that we can easily track them for branch landings, especially now that we're heading into the 24 end game.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
status-thunderbird24:
--- → fixed
status-thunderbird25:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 26.0
You need to log in
before you can comment on or make changes to this bug.
Description
•