Closed Bug 1491919 Opened 2 years ago Closed 2 years ago

TEST-UNEXPECTED-FAIL | comm/mailnews/extensions/bayesian-spam-filter/test/unit/test_customTokenization.js

Categories

(MailNews Core :: General, enhancement)

enhancement
Not set
normal

Tracking

(thunderbird_esr6062+ fixed, thunderbird63 fixed, thunderbird64 fixed)

RESOLVED FIXED
Thunderbird 64.0
Tracking Status
thunderbird_esr60 62+ fixed
thunderbird63 --- fixed
thunderbird64 --- fixed

People

(Reporter: jorgk-bmo, Assigned: benc)

Details

(Whiteboard: [Thunderbird-testfailure: X all debug])

Attachments

(1 file, 2 obsolete files)

TEST-UNEXPECTED-FAIL | comm/mailnews/extensions/bayesian-spam-filter/test/unit/test_customTokenization.js | onMessageTraitDetails - [onMessageTraitDetails : 111] false == true

fails in debug builds only:

M-C last good: 5165e750ffabb930deb846410ab62dcc6d
M-C first bad: 5ecae696c54f6240e520e9a1517e7a5fc2
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5165e750ffabb930deb846410ab62dcc6d&tochange=5ecae696c54f6240e520e9a1517e7a5fc2
OK, so ab213805a838 passed, 536dcedd614e failed. That brings us to this:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ab213805a838&tochange=536dcedd614e
9332515c425d passed, 2fa27517819d failed:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9332515c425d&tochange=2fa27517819d
How nice, the link doesn't work (bug 1466381), so let me paste it here:

2fa27517819d	Henri Sivonen — Bug 1490972 - Limit the number of bytes poisoned to avoid quadratic behavior. r=froydnj
0bb09351c1c1	Henri Sivonen — Bug 1487341 - Make Truncate(), SetLength() and Capacity() more efficient by keeping memcpying to the minimum. r=froydnj
9332515c425d	Thomas Nguyen — Bug 1330487 - Part 12: Update web-platform tests meta using ---manifest-update r=heycam

Henri, the evidence is quite clear that your two bugs broke our test. I just don't understand what happened. The test fails in debug mode locally and in automation, passes in opt builds in automation.

I have little idea of what the test does, from its name test_customTokenization.js I assume it exercises some code we have here:
https://dxr.mozilla.org/comm-central/source/comm/mailnews/extensions/fts3/src
No idea how that could be related to your changes. Where should I start looking?
Flags: needinfo?(hsivonen)
(In reply to Jorg K (GMT+2) from comment #4)
> No idea how that could be related to your changes.

Chances are the fix for bug 1486706 was incomplete after all.

> Where should I start looking?

Look for XPCOM string SetCapacity() followed BeginWriting() writing past the string's reported Length() (and chance SetCapacity() to SetLength()) or for creation of PromiseFlat[C]String(foo), followed by the mutation of foo followed by the use of the PromiseFlat[C]String (change PromiseFlat[C]String to ns[C]String).
Flags: needinfo?(hsivonen)
Oops, different type of capacity, that one is for arrays, not strings.
There aren't many SetCapacity() call sites left in C-C:
https://dxr.mozilla.org/comm-central/search?q=SetCapacity(+path%3Acomm&redirect=false
Most of them are for arrays, I can only see two for strings:
comm/mailnews/base/search/src/nsMsgFilter.cpp
537 buffer.SetCapacity(512);
comm/mailnews/base/src/nsSpamSettings.cpp
662 buffer.SetCapacity(512);

Inspecting all the "PromiseFlat" uses seems a bit of a daunting task:
https://dxr.mozilla.org/comm-central/search?q=PromiseFlat+path%3Acomm&redirect=false

Henri, is there another way to find the culprit? Maybe add some code to trap the case when it happens?
Flags: needinfo?(hsivonen)
I've whole-sale replaced PromiseFlatCString with nsCString and PromiseFlatString with nsString in all of mailnews/ and the problem persists.
(In reply to Jorg K (GMT+2) from comment #8)
> Henri, is there another way to find the culprit?

I suggest running the test case under rr and trying to see where the problem string comes from.

> Maybe add some code to trap
> the case when it happens?

Valgrind (https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Valgrind) should provide this, but I haven't actually tested.
Flags: needinfo?(hsivonen)
Attached patch change-all-to-nsString.patch (obsolete) — Splinter Review
Ben, as you know, all the people here work on "bustage fix" from time to time, so here's an item I'd like you to look at.

Mostly from comment #4: After landing of bug 1487341 and bug 1490972 one of our tests
  mach xpcshell-test comm/mailnews/extensions/bayesian-spam-filter/test/unit/test_customTokenization.js
fails in debug builds only.

Henri suggested in comment #5 to check use of SetCapacity() and PromiseFlat[C]String(). As per comment #8, there are only two harmless SetCapacity() call sites left, and they don't cause trouble, they can be commented out just making the code less efficient.

As per comment #9 I've changed all PromiseFlat[C]String() to ns[C]String(), see patch, and it doesn't make a difference. Next Henri suggested to use your compatriot's, that is Robert O'Callahan's "rr" tool, but I understand that that's for Linux.

The patch also contains a hunk in test_customTokenization.js:
  print(`=== ${aTokenString} -- ${value}`);

That prints:
=== to:careful reader <reader@example.org>,subject:live,mime-version:1.0,subject:long,subject:eat,subject:your,subject:vegetables,from:mom <mother@example.com>,date:tue, 30 apr 2008 00:12:17 -0700,received:host301,skip:t 70,received:1leegh-0003gn-rr,received:69),received:reader@example,received:(envelope-from,sender:bugzilla test setup <noreply@example.org>,received:(exim,received:hsd1,received:com>),received:c-1-2-3-4,received:helo=thecomputer),received:mon,,received:from,received:example,received:mar,received:net,received:esmtpa,received:<someone@example,received:13:24:06,received:with,received:for,received:([1,received:com,received:org; -- important

So basically, the test looks for "important" in a list of tokens an doesn't find it. I believe the word "important" should be found, since the test uses this test message:
https://searchfox.org/comm-central/source/mailnews/extensions/bayesian-spam-filter/test/unit/resources/ham1.eml#7
which in its body contains: vegetables are very important for your health and wealth.
You can see other tokens from the message, like "subject: eat your vegetables" in the printout above.

So it looks like the body is somewhat missing on not being tokenised.

Feel like some detective work?
Flags: needinfo?(benc)
Assignee: nobody → benc
Flags: needinfo?(benc)
try build:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=937af6fd67acee6d681c2252fcc0922545253b08

Turned out to be some naughty code writing directly to an uninitialised/unsized nsAutoCString.

I don't think valgrind would have caught it unless the string went over 64 bytes. Technically, until then it's writing into valid memory. It was the buffer-clobbering-in-debug in SetLength() which highlighted it.

I think an `assert(len>0)` in nsAutoCString::BeginWriting() would have caught it.
Think it's worth proposing such a patch?
Attachment #9011354 - Flags: review?(jorgk)
Hmm try build failed due to some other bustage. Suspect my c-c is out of sync, will try and sort it tomorrow.
Anyway, patch works for me locally (linux), YMMV :-)
change-all-to-nsString.patch looks unnecessary. I recommend against landing it.

fix_uninitialised_string_use.patch looks correct, though less performant than simply injecting one SetLength() without other changes:

 const char* readEnd = aCString.EndReading();
+result.SetLength(aCString.Length());
 char* writeStart = result.BeginWriting();

(In reply to Ben Campbell from comment #12)
> I think an `assert(len>0)` in nsAutoCString::BeginWriting() would have
> caught it.
> Think it's worth proposing such a patch?

That sort of thing might have false positives in cases where BeginWriting() is called but nothing is written because a comparison with Length() or EndWriting() causes a loop not to run.
Group: mail-core-security
This looks to me like a security bug even in releases that didn't have my patches that made the problem visible.
Comment on attachment 9010616 [details] [diff] [review]
change-all-to-nsString.patch

(In reply to Henri Sivonen (:hsivonen) from comment #14)
> change-all-to-nsString.patch looks unnecessary. I recommend against landing
> it.
Of course. I created the patch to rule out misuse of PromiseFlat[C]String. It was never intended for landing. Going back through the bug I see that I hadn't clearly said that.

Thanks for your help, Henri!
Attachment #9010616 - Attachment is obsolete: true
Comment on attachment 9011354 [details] [diff] [review]
fix_uninitialised_string_use.patch

Review of attachment 9011354 [details] [diff] [review]:
-----------------------------------------------------------------

Yes, we should do it the way Henri said. Looking the code, the un-escaped string will be shorter (of course).

::: mailnews/extensions/bayesian-spam-filter/src/nsBayesianFilter.cpp
@@ +844,3 @@
>  }
>  
>  // helper function to escape \n, \t, etc from a CString

that should read: un-escape.
Attachment #9011354 - Flags: review?(jorgk)
(In reply to Jorg K (GMT+2) from comment #17)
> Comment on attachment 9011354 [details] [diff] [review]
> fix_uninitialised_string_use.patch
> 
> Review of attachment 9011354 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Yes, we should do it the way Henri said. Looking the code, the un-escaped
> string will be shorter (of course).

result.SetLength(writeIter - writeStart);
is changed to
result.Truncate(writeIter - writeStart);
it will assert that the string didn't get longer.
OK, I did what Henri said ;-) - A little hard to attribute authorship here. I think the merit goes to Ben for finding the culprit, so I mentioned Henri in the commit message. OK?
Attachment #9011354 - Attachment is obsolete: true
Attachment #9011367 - Flags: review+
Attachment #9011367 - Flags: feedback?(hsivonen)
Comment on attachment 9011367 [details] [diff] [review]
fix_uninitialised_string_use.patch - Henri's ideas

Review of attachment 9011367 [details] [diff] [review]:
-----------------------------------------------------------------

> so I mentioned Henri in the commit message. OK?

OK.

Thanks.
Attachment #9011367 - Flags: feedback?(hsivonen) → feedback+
https://hg.mozilla.org/comm-central/rev/149dde1772d65e32046d704401cda8acc69c4c86
fix use of uninitialised string, ideas by Henri Sivonen. r=jorgk

I'll get this fixed in the ESR.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 64.0
Attachment #9011367 - Flags: approval-comm-esr60+
Attachment #9011367 - Flags: approval-comm-beta+
Group: mail-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.