Closed Bug 917955 Opened 11 years ago Closed 11 years ago

crash in PL_strncasecmp | nsCaseInsensitiveCStringComparator::operator()(char const*, char const*, unsigned int, unsigned int) during newsgroup subscribe

Categories

(MailNews Core :: Networking: NNTP, defect)

defect
Not set
critical

Tracking

(thunderbird25 fixed, thunderbird26 fixed, thunderbird_esr24+ fixed)

RESOLVED FIXED
Thunderbird 27.0
Tracking Status
thunderbird25 --- fixed
thunderbird26 --- fixed
thunderbird_esr24 + fixed

People

(Reporter: wsmwk, Assigned: infofrommozilla)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(3 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-e5104f83-9047-4a65-b1ca-2f9912130908. 
Appears to be regression in TB24.
#1 crash so far

=============================================================
"I tried to subscribe to another newsgroup and typed "comp" as the start of the name. Then I noticed that was incorrect, and used backspace to erase all those characters. It got down to "c" remaining, then crashed."  Robert

0	plc4.dll	PL_strncasecmp	nsprpub/lib/libc/src/strcase.c
1	xul.dll	nsCaseInsensitiveCStringComparator::operator()(char const *,char const *,unsigned int,unsigned int)	xpcom/string/src/nsStringComparator.cpp
2	xul.dll	Compare(nsACString_internal const &,nsACString_internal const &,nsCStringComparator const &)	xpcom/string/src/nsTStringComparator.cpp
3	xul.dll	nsCStringLowerCaseComparator::LessThan(nsCString const &,nsCString const &)	mailnews/news/src/nsNntpIncomingServer.cpp
4	xul.dll	nsTArray_Impl<nsCString,nsTArrayInfallibleAllocator>::Compare<nsCStringLowerCaseComparator>(void const *,void const *,void *)	objdir-tb/mozilla/dist/include/nsTArray.h
5	xul.dll	med3	objdir-tb/mozilla/xpcom/build/nsQuickSort.cpp
6	xul.dll	NS_QuickSort	objdir-tb/mozilla/xpcom/build/nsQuickSort.cpp
7	xul.dll	NS_QuickSort	objdir-tb/mozilla/xpcom/build/nsQuickSort.cpp
8	xul.dll	NS_QuickSort	objdir-tb/mozilla/xpcom/build/nsQuickSort.cpp
9	xul.dll	NS_QuickSort	objdir-tb/mozilla/xpcom/build/nsQuickSort.cpp
10	xul.dll	NS_QuickSort	objdir-tb/mozilla/xpcom/build/nsQuickSort.cpp 

bp-d3ea40bb-d71b-4805-a174-ae3212130918 another user, if we need someone to reproduce
#6 crash for Tb25 aurora
stack size of windows is 1MB, but Linux is 8MB....
Crashes also on Seamonkey 2.21

See crash report: 

ID: f1922010-b159-4398-bed1-9a1ed2130919
Signature: PL_strncasecmp | nsCaseInsensitiveCStringComparator::operator()(char const*, char const*, unsigned int, unsigned int)
Already in version 17.0.8 was ns_quicksort incorrect. The first element is out of place.
In version TB 24.0 ns_quicksort is totally broken. The elements are completely messed up. The recursion depth is thus obviously much too high.

With many results (~16000) that causes the stack overflow.
I suspect Bug 872497 ( O(n^2) performance in NS_QuickSort)
http://hg.mozilla.org/mozilla-central/rev/aad29aa89237
Blocks: 872497
(In reply to Philip Chee from comment #7)
> I suspect Bug 872497 ( O(n^2) performance in NS_QuickSort)
> http://hg.mozilla.org/mozilla-central/rev/aad29aa89237

Yes, I can confirm that. Without that patch the result looks like the TB17.0.8 result. :-)
And no crash!
(In reply to Alfred Peters from comment #5)
> The first element is out of place.
(In reply to Alfred Peters from comment #6)
> The elements are completely messed up.

QuickSort is non-stable sort. Patch of Bug 872497 simply removed "do nothing if data is already sorted in ascending order" part only, in order to resolve severe performance problem of NSQuickSort() when (a) first half of data is sorted in descending order and (b) second half of data is also sorted in descending order, and when (a) < (b). Because of "(a) < (b)", swap won't occur, then insertion sort was executed for entire data which is never sorted in ascending order(each half of data is already sorted in descending order).

If reason of crash is non-stable sort, it's simply bug of caller of NS_QuickSort(), isn't it?
Or sort result is incorrect after patch of Bug 872497?
If so, it's problem since initial of NS_QuickSort()?
Or incorrect sort result due to stack over flow after patch of Bug 872497?

(In reply to Alfred Peters from comment #6)
> The recursion depth is thus obviously much too high.

After patch of Bug 872497, NS_QuickSort() merelyγ€€applies same sort algorythm for random data to "already sorted data in ascending order" and to "already sorted data in descending order".
If big random data, NS_QuickSort() can always produce such deep recursion depth. 

Why Tb still doesn't use std::stable_sort() or std::sort() of C++?
Does problem occur with std::stable_sort() or std::sort() of C++?
(In reply to Alfred Peters from comment #5)
> Created attachment 808284 [details]
> News subscribe dialog TB 17.0.8 with filter
> Already in version 17.0.8 was ns_quicksort incorrect. The first element is out of place.

How many news groups is sorted? Only 27 newsgroups which are seen in the screen shot?
Is newsgroup name already sorted in ascending order when newsgroup names is passed from news server?

Incorrect position is not first element only. alabama.test, which should be first element if ascending order, is also placed at wrong position. If sort key is newsgroup name, this is never due to non-stable sort.
Phenomenon is perhaps next.
  de.mein.test3 is chosen as pivot, then alabama.test at position 0 and de.mein.test3
  at mid are swapped.
  After it, NS_QuickSort() does do sorting of first half(smaller than pivot),
  de.mein.test3, alaba..., ..., de.mein..., alabama.test,
  then NS_QuickSort() does do sorting of second half(larger than pivot),
  de.mein.test3 to netgame.test.
  Because of bug in logic of NS_QuickSort() since initial of NS_QuickSort(),
  sort result is incorrect.
  (when N == 2**n +/- 1, forget to move/swap data at pivot?)
  (when N == 2**n +/- 1, infinite recursive call after patch of Bug 872497?)

Is there any reason to continue using NS_QuickSort() which can surely produce wrong sort result?
If no crash by backout of Bug 872497, wrong sort result is preferable for Tb or acceptable for Tb?
(In reply to Alfred Peters from comment #5)
> Already in version 17.0.8 was ns_quicksort incorrect. The first element is out of place.

Another question.

Sorting of newsgroup name is based on hiearchy of newsgroup name, and order/structure is correctly shown as expected if "filter"(Show items that contain) is not requested.

Is newsgroup names sorted by newsgroup name using NS_QuickSort() after picking up filtered newsgroups when "filter"(Show items that contain) is requested at subscribe of news account?
If yes, is comparison function correctly passed to NS_QuickSort()?

If no, it's "sort entire newsgroups in ascending order" then "pickup filtered newsgroup"?
  If so, how many newsgroups are defined at the news server?

(In reply to Alfred Peters from comment #6)
> With many results (~16000) that causes the stack overflow.

If many newsgroups is sorted, and if data is already fully sorted in ascending order, difference of "recursive call depth" between "before patch" and "afer patch" can be explained.
  Before patch : Insertion sort is applied to long element part at early stage of
                 NS_QuickSort() process, then recursive call depth is 1 at most.
  After  parch : Insertion sort is not applied unles elments length is less than 7.
                 This is same algorithm/logic as one for "fully random data",
                 and is always applied to "fully pre-sorted data in ascending order or
                 descending order" too after the patch.
                 So, "split to two parts then do recursive call for a half part"
                 is repeated until element part length reaches "less than 7".

If crash is due to stack overflow by "too deep recursive call", such stack overflow should always occur also in sorting of fully random data of same element number by NS_QuickSort().
Does problem like "infinte recursive call" occur in NS_Quicksort()?
Crash occurs in StringComparator.
Crash caused by fully broken sort result, instead of stack overflow?
(In reply to WADA from comment #12)
> Crash occurs in StringComparator.
> Crash caused by fully broken sort result, instead of stack overflow?

No, the stacks on crash-stacks say EXCEPTION_STACK_OVERFLOW
The incorrect comparison function causes the worst case of the quicksort algorithm, which caused the stack overflow.
Attachment #811557 - Flags: review?(Pidgeot18)
Comment on attachment 811557 [details] [diff] [review]
Correction of the comparison function LessThen () in class nsCStringLowerCaseComparator

Review of attachment 811557 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/news/src/nsNntpIncomingServer.cpp
@@ +73,5 @@
>    }
>  
>    bool LessThan(const nsCString &a, const nsCString &b) const
>    {
> +    return (Compare(a, b, nsCaseInsensitiveCStringComparator()) < 0);

nit: Extraneous parentheses.
Attachment #811557 - Flags: review?(Pidgeot18) → review+
> > +    return (Compare(a, b, nsCaseInsensitiveCStringComparator()) < 0);
> 
> nit: Extraneous parentheses.

OK, corrected.
Attachment #811557 - Attachment is obsolete: true
Attachment #812134 - Flags: review+
(In reply to Alfred Peters from comment #16)
> Created attachment 812134 [details] [diff] [review]

> > nit: Extraneous parentheses.

> OK, corrected.

I'm not the real reviewer. Could this be a problem?

If not, could someone set the checkin-needed keyword. I do not have sufficient rights to do so.
That's ok. Especially when carrying a review forward like that for nits fixed, it's good to add the reviewer also in the hg commit message. Like r=jcranmer.
Assignee: nobody → infofrommozilla
Status: NEW → ASSIGNED
Keywords: checkin-needed
OS: Windows NT → All
Hardware: x86 → All
https://hg.mozilla.org/comm-central/rev/b84ca03296e3
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 27.0
When is Thunderbird 27.0 scheduled for release to end users?  This is a critical bug.
TB27 will never be released to end users. But this should go into TB24.0.1 that will go soon.
Comment on attachment 812134 [details] [diff] [review]
Correction of the comparison function LessThen() in class nsCStringLowerCaseComparator

[Triage Comment]
We want to take this onto esr for 24.0.1 to get it fixed there.
Attachment #812134 - Flags: approval-comm-esr24+
Attachment #812134 - Flags: approval-comm-beta+
Attachment #812134 - Flags: approval-comm-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: