Detect malformed messages in MboxMsgInputStream and collect telemetry
Categories
(MailNews Core :: General, task)
Tracking
(thunderbird_esr128? fixed, thunderbird132 fixed, thunderbird133 fixed)
People
(Reporter: benc, Assigned: benc)
References
Details
Attachments
(5 files, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
corey
:
approval-comm-beta+
corey
:
approval-comm-release+
corey
:
approval-comm-esr128+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
corey
:
approval-comm-beta+
corey
:
approval-comm-release+
corey
:
approval-comm-esr128+
|
Details | Review |
2.70 KB,
text/plain
|
sancus
:
data-review+
|
Details |
48 bytes,
text/x-phabricator-request
|
corey
:
approval-comm-beta+
corey
:
approval-comm-release+
corey
:
approval-comm-esr128-
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
corey
:
approval-comm-esr128+
|
Details | Review |
If a message in an mbox is bad (either due to damage to the mbox or a bad storeToken
in the msgDBHdr), the mbox reader (MboxMsgInputStream) should spot that the reading does not start at a "From " line and:
- log an error
- report via telemetry that badness has been detected
- fail (probably. The mbox code currently has a tolerant-while-reading, strict-while-writing policy. Reading should probably be less easy-going).
Comment 2•3 months ago
|
||
I suggest that also inform the user.
Since you have limited this bug to backend behavior and telemetry, I suggest we use bug 1926832 for the UI part.
Comment 3•3 months ago
|
||
(In reply to Ben Campbell from comment #0)
If a message in an mbox is bad (either due to damage to the mbox or a bad
storeToken
in the msgDBHdr), the mbox reader (MboxMsgInputStream) should spot that the reading does not start at a "From " line and:
I'm aware of one more scenario where we have corruption. (Maybe there are more?)
The beginning of the message might be correct, and pass the above test.
However, the end of the message might be truncated.
As I understand it, we cannot rely on stored messages sizes to detect whether a message is incomplete.
I think that means, only for MIME messages with the topmost content-type being any multipart variaction, we have a way to reliably detect truncation. If the message is missing the terminating multipart boundary, we know the message is corrupted.
(If the outermost part isn't multipart, but is e.g. base64 encoded, we could potentially detect some truncation. However, if the truncation happens after a valid 4-byte base64 block, we could falsely assume the message is complete.)
Comment 4•3 months ago
|
||
To check that a message starts with "From " at the storeToken offset, maybe nsMsgDBFolder::GetMsgInputStream is a good place to check that.
That function returns a stream beginning at that position. I believe we cannot peek into the stream.
Potential approach:
- construct the stream
- read the first line from the stream (or the first bytes)
- check that we get the expected "From " at the beginning
- if no match, pass that information on, or record it somewhere for later consumption
- destroy the stream
- construct the stream again
- return this second stream
Does that make sense?
Comment 5•3 months ago
|
||
I have investigated how to check for the MIME incompleteness.
I've learned that we're bootstrapping MIME processing in MimeMessage_close_headers(), mimemsg.cpp
The function will call mime_create to construct the right MIME structure handler, will execute its processing (parsing).
In the multipart implementation, in mimemult.cpp, we could track whether the terminator MimeMultipartBoundaryTypeTerminator is found or not. (And we only need to do that for the topmost MIME part.) That object could store that result somewhere. We could potentially add a new attribute "isTruncated" to MimeObject.
After mime part parsing/processing is done (and having returned to MimeMessage_close_headers, we can check whether the flag isTruncated was set.
The code in MimeMessage_close_headers already passes on other information to the message window. It does so by SetMailCharacterSetToMsgWindow, which sets a charset attribute in the associated nsMailChannel.
We could do the same, for the new isTruncated flag.
All of the above is the backend code that should be done in this bug here.
The message window code will eventually arrive in onDOMContentLoaded in msgHdrView.js. Today it already reads the channel.mailCharacterSet attribute. In the future, it could also read channel.isTruncated and trigger the desired type of notification in the window. That UI work should probably be done in a separate step in bug 1926832.
Assignee | ||
Comment 6•3 months ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #4)
That function returns a stream beginning at that position. I believe we cannot peek into the stream.
It can be much simpler. We're already parsing the stream (in order to parse the From line, header block, then handle transforming the body to remove any escaped lines).
So the already parser knows if there's a missing "From " line.
I wrote the parser to be as accepting as possible of dud data (i.e. be really strict about what you write, but accepting of what you read).
But really, it should just fail the Read() if the "From " line is missing.
So that's what I'll do now.
I'm thinking of adding a specific error code to return (I'm 90% sure we can do that), so that layers further up can turn it into a user warning without re-coupling UI to the low-level stuff.
But for a start I'm just aiming to make it fail, log a warning and add telemetry.
Assignee | ||
Comment 7•3 months ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #5)
In the multipart implementation, in mimemult.cpp, we could track whether the terminator MimeMultipartBoundaryTypeTerminator is found or not. (And we only need to do that for the topmost MIME part.) That object could store that result somewhere. We could potentially add a new attribute "isTruncated" to MimeObject.
Shouldn't misformed Mime structure just be a standard failure? That way it just falls out in the standard error handling (assuming mime errors are caught somewhere :-).
You can always define a specific error code (I think!) so that code further up can handle it (akin to your isTruncated attr) but I don't see why it should be treated any differently to other mime errors.
After all, you don't need any folder corruption to trigger this - you can easily craft a malicious message with crappy mime data.
That UI work should probably be done in a separate step in bug 1926832.
Makes sense to me!
Comment 8•3 months ago
|
||
(In reply to Ben Campbell from comment #7)
Shouldn't misformed Mime structure just be a standard failure? That way it just falls out in the standard error handling (assuming mime errors are caught somewhere :-).
The MIME processing uses a streaming approach and is lenient.
If you wanted to prevent any MIME structure with missing ending to be completely rejected, you'd require a two-phase approach. In phase one, you'd have to parse the data completely (and remember it), and only after you ensure correct structure (including the terminator), you allow rendering, and run through processing (again).
As I understand it, the current MIME implementation uses a single-pass streaming approach.
It will process and render contents as it's being processed. The missing terminator would be detected only after the output has already been passed on to consumers.
Assignee | ||
Comment 9•3 months ago
|
||
Assignee | ||
Comment 10•3 months ago
|
||
Assignee | ||
Updated•3 months ago
|
Assignee | ||
Updated•3 months ago
|
Updated•3 months ago
|
Comment 11•3 months ago
|
||
Pushed by john@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/49b86b626ce9
Modernise test_downloadOffline.js. r=mkmelin
Updated•3 months ago
|
Assignee | ||
Comment 12•3 months ago
|
||
Assignee | ||
Comment 13•3 months ago
|
||
Comment 14•3 months ago
|
||
Comment on attachment 9434697 [details]
1926810-data-request.md
Approved.
- Is there Documentation?
- This collection is documented in the Glean Dictionary at https://dictionary.telemetry.mozilla.org/
- https://hg.mozilla.org/comm-central/mail/metrics.yaml
- Is there a control mechanism that allows the user to turn this data collection off?
Yes: Standard telemetry preferences.
If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes, Ben Campbell.
Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 1: Technical Data.
Is the data collection request for default-on or default-off?
default-on
Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
No.
Is the data collection covered by the existing Thunderbird privacy notice? If unsure: escalate to legal if:
Yes.
Does the data collection use a third-party collection tool? If yes, escalate to legal.
No.
Assignee | ||
Updated•3 months ago
|
Comment 15•3 months ago
|
||
Pushed by john@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/e4fbff3c7f6c
Fail mbox reads if offset into mbox file doesn't start at a "From " separator. r=mkmelin
https://hg.mozilla.org/comm-central/rev/a4cd40367993
Add telemetry to report potential mbox corruption. r=mkmelin, sancus
Assignee | ||
Updated•3 months ago
|
Updated•3 months ago
|
Assignee | ||
Comment 16•3 months ago
|
||
Comment on attachment 9434154 [details]
Bug 1926810 - Modernise test_downloadOffline.js. r=#thunderbird-reviewers
This one just updates an xpcshell-test and so isn't critical, but the other patches build on it, so probably easiest to uplift them all.
[Approval Request Comment]
Regression caused by (bug #): Part of the push for Bug 1890230
User impact if declined: none
Testing completed (on c-c, etc.): just c-c
Risk to taking this patch (and alternatives if risky): low
Assignee | ||
Comment 17•3 months ago
|
||
Comment on attachment 9434159 [details]
Bug 1926810 - Fail mbox reads if offset into mbox file doesn't start at a "From " separator. r=#thunderbird-reviewers
[Approval Request Comment]
Regression caused by (bug #): Part of the push for Bug 1890230
User impact if declined: Continued propagation of existing damaged messages.
Testing completed (on c-c, etc.): just c-c
Risk to taking this patch (and alternatives if risky): low
Assignee | ||
Comment 18•3 months ago
|
||
Comment on attachment 9434699 [details]
Bug 1926810 - Add telemetry to report potential mbox corruption. r=#thunderbird-reviewers
[Approval Request Comment]
Regression caused by (bug #): Part of the push for Bug 1890230
User impact if declined: Lack of early-warning system for corruption in the wild, and inability to accurately gauge effectiveness of fixes.
Testing completed (on c-c, etc.): just c-c
Risk to taking this patch (and alternatives if risky): low
Assignee | ||
Updated•3 months ago
|
Assignee | ||
Updated•3 months ago
|
Assignee | ||
Updated•3 months ago
|
Assignee | ||
Comment 19•3 months ago
|
||
Doh. Sorry - didn't realise I could set uplift request flags for both versions at once.
Comment 20•3 months ago
|
||
Comment on attachment 9434154 [details]
Bug 1926810 - Modernise test_downloadOffline.js. r=#thunderbird-reviewers
[Triage Comment]
Approved for beta
Comment 21•3 months ago
|
||
Comment on attachment 9434159 [details]
Bug 1926810 - Fail mbox reads if offset into mbox file doesn't start at a "From " separator. r=#thunderbird-reviewers
[Triage Comment]
Approved for beta
Comment 22•3 months ago
|
||
Comment on attachment 9434699 [details]
Bug 1926810 - Add telemetry to report potential mbox corruption. r=#thunderbird-reviewers
[Triage Comment]
Approved for beta
Comment 23•3 months ago
|
||
bugherder uplift |
Thunderbird 133.0b3:
https://hg.mozilla.org/releases/comm-beta/rev/92a341eab4d2
https://hg.mozilla.org/releases/comm-beta/rev/fad2bd2c6f71
https://hg.mozilla.org/releases/comm-beta/rev/c708adecd173
Comment 24•3 months ago
|
||
Port of https://hg.mozilla.org/comm-central/rev/a4cd40367993075816de5667d5911ef7ebc3e286 but with Telemetry instead of Glean.
Comment 25•3 months ago
|
||
Comment on attachment 9434154 [details]
Bug 1926810 - Modernise test_downloadOffline.js. r=#thunderbird-reviewers
[Triage Comment]
Approved for esr128
Comment 26•3 months ago
|
||
Comment on attachment 9434159 [details]
Bug 1926810 - Fail mbox reads if offset into mbox file doesn't start at a "From " separator. r=#thunderbird-reviewers
[Triage Comment]
Approved for esr128
Comment 27•3 months ago
|
||
Comment on attachment 9436297 [details]
Bug 1926810 - [ESR128] Add telemetry to report potential mbox corruption. r=#thunderbird-reviewers
[Triage Comment]
Approved for esr128
Comment 28•3 months ago
|
||
Comment on attachment 9434699 [details]
Bug 1926810 - Add telemetry to report potential mbox corruption. r=#thunderbird-reviewers
[Triage Comment]
I've approved the esr128 specific patch. Thanks!
Comment 29•3 months ago
|
||
Comment on attachment 9434154 [details]
Bug 1926810 - Modernise test_downloadOffline.js. r=#thunderbird-reviewers
[Triage Comment]
Approved for release
Comment 30•3 months ago
|
||
Comment on attachment 9434159 [details]
Bug 1926810 - Fail mbox reads if offset into mbox file doesn't start at a "From " separator. r=#thunderbird-reviewers
[Triage Comment]
Approved for release
Comment 31•3 months ago
|
||
Comment on attachment 9434699 [details]
Bug 1926810 - Add telemetry to report potential mbox corruption. r=#thunderbird-reviewers
[Triage Comment]
Approved for release
Comment 32•3 months ago
|
||
bugherder uplift |
Comment 33•3 months ago
|
||
bugherder uplift |
Thunderbird 132.0.1:
https://hg.mozilla.org/releases/comm-release/rev/c5613672388a
https://hg.mozilla.org/releases/comm-release/rev/36f3d52faddd
https://hg.mozilla.org/releases/comm-release/rev/abbdcce25267
Assignee | ||
Updated•3 months ago
|
Assignee | ||
Comment 34•3 months ago
|
||
Just discovered an extra little quirk to address:
If you Seek()
past the end of a filestream, that's not an error.
And if you call Read()
while positioned past the end, that doesn't cause an error either. It just returns an EOF.
But we want the mbox to return an error for such cases.
Assignee | ||
Comment 35•3 months ago
|
||
nsFileInputStream will happily let you Seek() past the end, and Read() from there, all without causing an error.
The Read() just returns 0 bytes of data (indicating EOF).
So this patch just tweaks the mbox parser to require that there is a
message at the given position. Previously there was an exception to allow
nsMsgBrkMBoxStore::AsyncScan() to handle the corner case of an empty mbox
file. But AsyncScan() works fine with this patch, so we're OK.
Updated•3 months ago
|
Comment 36•3 months ago
|
||
Comment on attachment 9437637 [details]
Bug 1926810 - Make sure reading messages from beyond the end of an mbox file throws a read error. r=#thunderbird-reviewers
Revision D228941 was moved to bug 1931258. Setting attachment 9437637 [details] to obsolete.
Assignee | ||
Comment 37•3 months ago
|
||
Sorry - moving that to Bug 1931258 and reclosing this one.
Description
•