Closed
Bug 1302478
Opened 8 years ago
Closed 8 years ago
Thunderbird 50 is crashing if selecting the first inbox [@ InvalidArrayIndex_CRASH | nsMsgKeyArray::GetArray ]
Categories
(MailNews Core :: Backend, defect)
Tracking
(thunderbird49 unaffected, thunderbird50+ fixed, thunderbird51 fixed, thunderbird52 fixed)
VERIFIED
FIXED
Thunderbird 52.0
Tracking | Status | |
---|---|---|
thunderbird49 | --- | unaffected |
thunderbird50 | + | fixed |
thunderbird51 | --- | fixed |
thunderbird52 | --- | fixed |
People
(Reporter: Nomis101, Assigned: squib)
References
Details
(4 keywords)
Crash Data
Attachments
(2 files)
117.15 KB,
text/plain
|
Details | |
722 bytes,
patch
|
rkent
:
review+
jorgk-bmo
:
approval-comm-aurora+
jorgk-bmo
:
approval-comm-beta+
|
Details | Diff | Splinter Review |
I have four mail accounts in Thunderbird. It is no problem to work with account 2 or 3 or 4. But if I try to select the first inbox (most upper one), than Thunderbird is crashing instantly. This is with Thunderbird 50.0a2. I'm on macOS 10.12 (16A320). I will try to add a lldb debugging output of this crash.
I can not test this with Thunderbird 51, because this version is crashing always immediately at startup.
I could not figure out yet how to debug this with lldb, so for now I'm attaching the Apple crash report. If I've figured out how this lldb is working, I will attach this output as well.
Comment 2•8 years ago
|
||
Can you please post the crash report ID from the Troubleshooting Information.
I couldn't find the one from 09-13, so I've reproduced the crash. This one is
Crash ID: bp-9b943ebb-e5c3-47ec-bd67-63c202160915
MozCrashReason: ElementAt(aIndex = 0, aLength = 0)
CrashTime: 1473962458
If somebody can guid me how to debug this crash with lldb, I could do this. Seems a lot different like it was with gdb.
Comment 5•8 years ago
|
||
The stack has no hints of mailnews code except for the final event, which is a crash when nsMsgKeyArray::GetArray has an invalid index.
I can't help you with OSX. If this was within Windows and reproducible, here is what I would try to do.
With a debugger on, I would try to see the Js Stack when the crash occurs. Get the debugger to gain control on the crash. Within Visual Studio I would then do Quick Watch with Dump JSStack() which will show the JS stack at the point of interruption in the DOS console. That might give some hints.
I would simplify. Can I move the message to a folder so that only has this message and reproduce the crash?
If you can get it simple enough, then you might me able to audit creations of the nsMsgKey object and try to figure out who is calling it that might affect the length. Maybe add printouts to anything that could affect the length, and see how it is changing.
None of this is easy, which is why we get paid the Big Bucks (or not).
Flags: needinfo?(rkent)
Comment 6•8 years ago
|
||
The crash first appears with 50.0a2 20160901004002. Unless nightly builds were broken in the preceeding week, judging by the crash rate and the build date being a Thursday I'd say the regression likely occurred one or two days prior.
https://crash-stats.mozilla.com/signature/?product=Thunderbird&date=%3E2016-06-01&signature=InvalidArrayIndex_CRASH%20%7C%20nsMsgKeyArray%3A%3AGetArray&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1
at least two other users are reporting this
bp-6000ce64-ad73-47fa-ae12-03f7d2160918 t7 using win7
bp-c6da5158-559b-441d-b932-332852160919 bp-05068310-85b1-4c07-9d64-8f8122160919 mxn on Mac
All that said, all the crashes I looked at have mailsummaries addon installed. But it has no recent updates - https://addons.mozilla.org/en-US/thunderbird/addon/mail-summaries/versions/
Severity: major → critical
Crash Signature: [@ InvalidArrayIndex_CRASH | nsMsgKeyArray::GetArray ]
Keywords: regression,
regressionwindow-wanted
Summary: Thunderbird 50 is crashing if selecting the first inbox → Thunderbird 50 is crashing if selecting the first inbox [@ InvalidArrayIndex_CRASH | nsMsgKeyArray::GetArray ]
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #6)
> All that said, all the crashes I looked at have mailsummaries addon
> installed. But it has no recent updates -
> https://addons.mozilla.org/en-US/thunderbird/addon/mail-summaries/versions/
Good point. Uninstalling mailsummaries fixed this crash for me and the crash on startup I had with TB 51 (but now opening an email takes for ages).
Assignee | ||
Comment 8•8 years ago
|
||
Yeah, I haven't touched Mail Summaries in a while, so this is probably a regression in a TB codepath that only Mail Summaries exercises.
Assignee | ||
Comment 9•8 years ago
|
||
For what it's worth, here's where it's probably crashing in Mail Summaries: https://bitbucket.org/squib/mail-summaries/src/9426db33ac6152df85987190a029da129efdb849/chrome/content/folderSummary.js?at=default&fileviewer=file-view-default#folderSummary.js-156
At least that way, you can see some context.
Comment 10•8 years ago
|
||
Turns out this signature arises from https://groups.google.com/forum/#!topic/mozilla.dev.platform/bqKWX4gYN14
Minh Nguyá»…n gets both this crash and Bug 1192778, so perhaps they are related
See Also: → 1192778
Comment 11•8 years ago
|
||
Chris is also seeing this in 50 beta.
InvalidArrayIndex_CRASH | nsTArray_Impl<T>::ElementAt | nsMsgKeyArray::GetArray
bp-6a0e18ce-8902-4021-a660-d2bff2161008
It's a somewhat popular addon. We definitely need to get a grip on this
Crash Signature: [@ InvalidArrayIndex_CRASH | nsMsgKeyArray::GetArray ] → [@ InvalidArrayIndex_CRASH | nsMsgKeyArray::GetArray ]
[@ InvalidArrayIndex_CRASH | nsTArray_Impl<T>::ElementAt | nsMsgKeyArray::GetArray ]
status-thunderbird49:
--- → affected
status-thunderbird50:
--- → affected
tracking-thunderbird50:
--- → ?
Keywords: topcrash-thunderbird
Updated•8 years ago
|
Comment 12•8 years ago
|
||
fyi, from the MailSummaries dev:
"In any case, if you're using a beta of Thunderbird, you'll probably need to use the latest version of the code on Bitbucket, not the released version here. In general, I try to time releases of Mail Summaries to major releases of Thunderbird, since I don't have much time to devote to this project."
Assignee | ||
Comment 13•8 years ago
|
||
That's me; I'm already following this bug.
We could really use a regression window on this.
Comment 14•8 years ago
|
||
(In reply to Jim Porter (:squib) from comment #13)
> That's me; I'm already following this bug.
>
> We could really use a regression window on this.
https://mozilla.github.io/mozregression/ is the tool to make this easy.
Can one of you get the regression range? It would be a great help
Flags: needinfo?(pgnet.dev+mozbugs)
Flags: needinfo?(mxn)
Flags: needinfo?(Nomis101)
Assignee | ||
Comment 15•8 years ago
|
||
Well, I've found one bug already, but I don't think it accounts for all the problems I've seen with Mail Summaries.
Assignee | ||
Comment 16•8 years ago
|
||
This fixes it. The issue is that `&array[0]` will trigger an assertion for an empty nsTArray. Instead, we should do the sane thing and just get the begin iterator (which, like std::vector, is a pointer to the beginning of the array).
Assignee: nobody → squibblyflabbetydoo
Status: NEW → ASSIGNED
Attachment #8802697 -
Flags: review?(rkent)
Comment 17•8 years ago
|
||
Comment on attachment 8802697 [details] [diff] [review]
Fix it
Review of attachment 8802697 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not convinced this will work, but I am not sure of the right thing to do either.
The clone() return for zero length array, as I understand it, is not well defined (derived from malloc which is not well defined). It may or may not be null. If it does return null, then the method will return an error, but I think what you want it to do is return an empty array.
I think what I would prefer is defined behavior. That is, if length is zero, set *aLength to zero, *aKeys to null, and return NS_OK. It would be good to write a very simple unit test (or tack this onto another one) to show that this method, going through XPCOM, will work OK for zero length array.
If you can convince me that the return will be well defined and expected, I could accept your change.
Attachment #8802697 -
Flags: review?(rkent) → review-
Assignee | ||
Comment 18•8 years ago
|
||
That doesn't appear to be the case, since NS_Alloc (which is what nsMemory::Clone uses under the hood) *used* to assert if you gave it a size of 0, but that was fixed a long time ago: see bug 322908.
Assignee | ||
Comment 19•8 years ago
|
||
In any case, I don't believe Gecko uses the C standard library's malloc at all; as far as I'm aware, we always use jemalloc, which returns a non-NULL pointer from malloc(0).
Comment 20•8 years ago
|
||
Comment on attachment 8802697 [details] [diff] [review]
Fix it
Review of attachment 8802697 [details] [diff] [review]:
-----------------------------------------------------------------
OK thanks, let's do this then.
Attachment #8802697 -
Flags: review- → review+
Reporter | ||
Comment 21•8 years ago
|
||
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #14)
> (In reply to Jim Porter (:squib) from comment #13)
> > That's me; I'm already following this bug.
> >
> > We could really use a regression window on this.
>
> https://mozilla.github.io/mozregression/ is the tool to make this easy.
>
> Can one of you get the regression range? It would be a great help
I could try that on the weekend. But if you now have a fix, is this still needed?
Flags: needinfo?(Nomis101)
Comment 22•8 years ago
|
||
(In reply to Nomis101 from comment #21)
> (In reply to Wayne Mery (:wsmwk, NI for questions) from comment #14)
> > (In reply to Jim Porter (:squib) from comment #13)
> > > That's me; I'm already following this bug.
> > >
> > > We could really use a regression window on this.
> >
> > https://mozilla.github.io/mozregression/ is the tool to make this easy.
> >
> > Can one of you get the regression range? It would be a great help
>
> I could try that on the weekend. But if you now have a fix, is this still
> needed?
Not required. But generally helpful, because it helps us understand *why* this happened so we can potentially avoid it in the future, and also identify other potential fallout caused by the regressing bug.
That said, are we hitting hitting https://groups.google.com/forum/#!topic/mozilla.dev.platform/bqKWX4gYN14 via bug 1159244 ?
Blocks: 1312136
Comment 23•8 years ago
|
||
When you land the patch on c-c, please also land it in aurora
Flags: needinfo?(squibblyflabbetydoo)
Assignee | ||
Comment 24•8 years ago
|
||
Marking checkin-needed since my HG privileges have lapsed and I don't want to delay things any more by waiting to get them back.
Flags: needinfo?(squibblyflabbetydoo)
Keywords: checkin-needed
Comment 25•8 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Flags: needinfo?(pgnet.dev+mozbugs)
Flags: needinfo?(mxn)
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 52.0
Comment 26•8 years ago
|
||
Comment on attachment 8802697 [details] [diff] [review]
Fix it
Most likely we don't do another TB 50 beta, so no approval for now.
Attachment #8802697 -
Flags: approval-comm-beta?
Attachment #8802697 -
Flags: approval-comm-aurora+
Comment 27•8 years ago
|
||
> Most likely we don't do another TB 50 beta, so no approval for now.
Actually, I'm thinking we should do another beta (I know that isn't part of the original plan) in part because we potentially will need to iterate many times over the next few months to deal with the fallout of bug 1159244. I'd be surprised if we didn't end up with at least half dozen patches, most of which will need uplift to beta to judge effectiveness, unless the patch has a reproducible testcase. n.b. kent's bug 1312136 comment 4 about bug 1159244 which landed in fx50
Comment 28•8 years ago
|
||
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #27)
> > Most likely we don't do another TB 50 beta, so no approval for now.
> Actually, I'm thinking we should do another beta ...
Let me know and I'll do the approvals and uplifts.
Comment 29•8 years ago
|
||
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #27)
> > Most likely we don't do another TB 50 beta, so no approval for now.
>
> Actually, I'm thinking we should do another beta (I know that isn't part of
> the original plan)
Just in case there is any doubt, though I may argue against a particular release, I assume that Wayne serves as the overall release planner, and can make decisions about what releases we do or do not do.
Comment 30•8 years ago
|
||
Jorg, please do uplift this to beta. And we'll assess on Thursdday when to build our third beta. (Note, Firefox just posted they extending their 50 beta by a week. So we will have a longer third beta for Thunderbird as well.)
Rationale - the patch seems to be effective for InvalidArrayIndex_CRASH | nsTArray_Impl<T>::ElementAt | nsMsgKeyArray::GetArray [1] which was the #2 crash for nightly (no crashes reported for 10-25 and 10-26 builds). But the crash rate for InvalidArrayIndex_CRASH | nsMsgKeyArray::GetArray and other "array" crashes (7 signatures in the top 100 for nightly) is to low to determine the impact of the patch [3]. So we need the patch on beta to better determine it's impact where there are 13 "array" signatures in the top 200[4]. Plus, we benifit by killing the #1 crash on beta.
[1] https://crash-stats.mozilla.com/signature/?product=Thunderbird&version=52.0a1&signature=InvalidArrayIndex_CRASH%20%7C%20nsTArray_Impl%3CT%3E%3A%3AElementAt%20%7C%20nsMsgKeyArray%3A%3AGetArray&date=%3E%3D2016-10-20T00%3A00%3A00.000Z&date=%3C2016-10-27T00%3A00%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports
[2] https://crash-stats.mozilla.com/signature/?product=Thunderbird&version=52.0a1&signature=InvalidArrayIndex_CRASH%20%7C%20nsTArray_Impl%3CT%3E%3A%3AElementAt%20%7C%20nsMsgKeyArray%3A%3AGetArray&date=%3E%3D2016-10-20T00%3A00%3A00.000Z&date=%3C2016-10-27T00%3A00%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports
[3] https://crash-stats.mozilla.com/topcrashers/?product=Thunderbird&_tcbs_mode=byday&_facets_size=100&_range_type=build&days=14&version=52.0a1
[4] https://crash-stats.mozilla.com/topcrashers/?product=Thunderbird&version=50.0b0&_facets_size=200&days=14
Flags: needinfo?(jorgk)
Comment 31•8 years ago
|
||
Aurora (TB 51):
https://hg.mozilla.org/releases/comm-aurora/rev/207f1e141e6936f74c92a6145ed61f847820fcd7
Beta coming up.
Flags: needinfo?(jorgk)
Updated•8 years ago
|
Keywords: checkin-needed
Comment 32•8 years ago
|
||
Comment on attachment 8802697 [details] [diff] [review]
Fix it
Beta (TB 50):
https://hg.mozilla.org/releases/comm-beta/rev/b6cebf7cba5ecc787639ef19208f5da812f2fa76
Attachment #8802697 -
Flags: approval-comm-beta? → approval-comm-beta+
Updated•8 years ago
|
Comment 33•8 years ago
|
||
PGN, nomis11, is your crash gone when using 50.0b3?
Flags: needinfo?(pgnet.dev+mozbugs)
Flags: needinfo?(Nomis101)
Comment 34•8 years ago
|
||
> PGN, nomis11, is your crash gone when using 50.0b3?
Yep, upgrade to TB 50.0b3 and re-enable of MailSummaries and ...
... so far, so good. No more crash while selecting 1st tab, or any tab manipulation.
Flags: needinfo?(pgnet.dev+mozbugs)
Comment 35•8 years ago
|
||
I had another user, Giacomo, also report this crash sig is gone when using 50.0b3. However, he is now crashing with nsTreeBodyFrame::PrefillPropertyArray https://bugzilla.mozilla.org/show_bug.cgi?id=615958 which he was experiencing prior to getting this array index crash.
Filed a couple followups for
#2 50.0b2 crash - bug 1314348
#4 50.0b2 crash - bug 1314347
It remains to be seen to want rank these signatures will rise for 50.0b3
Reporter | ||
Comment 36•8 years ago
|
||
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #33)
> PGN, nomis11, is your crash gone when using 50.0b3?
Yes, the crash described here seems fixed with this patch in TB 50.
Flags: needinfo?(Nomis101)
Comment 37•8 years ago
|
||
crash-stats confirms no crashes starting 50.0b3.
Thanks!
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•