Deleting an attachment with \n is O(N^2)

RESOLVED FIXED

Status

MailNews Core
Backend
RESOLVED FIXED
11 years ago
9 years ago

People

(Reporter: Andrew Schultz, Assigned: Andrew Schultz)

Tracking

({perf})

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

11 years ago
Created attachment 250299 [details]
stacktrace

I tried to delete a 10MB attachment with current trunk and it took 5 or 10 minutes.  The code

1. looks through the whole attachment for \r (fails)
2. looks for a \n (succeeds)
3. advances past the \n
4. looks through the rest of the whole attach for \r (fails!)

It should just use PL_strnpbrk so it can look for both characters at once.
(Assignee)

Comment 1

11 years ago
Created attachment 250302 [details] [diff] [review]
use PL_strnpbrk once instead of memchr twice
Attachment #250302 - Flags: superreview?(bienvenu)
Attachment #250302 - Flags: review?(bienvenu)

Comment 2

11 years ago
Comment on attachment 250302 [details] [diff] [review]
use PL_strnpbrk once instead of memchr twice

looks good, thx for the patch!
Attachment #250302 - Flags: superreview?(bienvenu)
Attachment #250302 - Flags: superreview+
Attachment #250302 - Flags: review?(bienvenu)
Attachment #250302 - Flags: review+
(Assignee)

Comment 3

11 years ago
FIXED
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED

Comment 4

11 years ago
this caused a regression - bug 366151. My inclination is to back this out since it's an easy patch to reconstruct, and the regression is so severe...

Comment 5

11 years ago
re-opening - I backed this out. I don't have time to figure out what went wrong with the patch right this minute - my guess is that it didn't handle something like a line with just a linebreak, i.e., an empty line, because that's what triggers the parsing of the headers, and that parsing is never triggered.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 6

11 years ago
Comment on attachment 250302 [details] [diff] [review]
use PL_strnpbrk once instead of memchr twice

>+    end = PL_strnprbrk(start, "\r\n", endBuffer - start + 1);
For a start this needs to be PL_strnpbrk (both times!), but this function is truly awful, I feel like submitting it to http://thedailywtf.com/, apart from the wiry flow control the two bugs I spotted were a) >From lines are forced to end with MSG_LINEBREAK while the rest is copied literally b) no check for a CRLF split over two reads (although I can't see an easy cure for the latter though).
(Assignee)

Comment 7

11 years ago
> For a start this needs to be PL_strnpbrk (both times!)

right!  :(

> a) >From lines are forced to end with MSG_LINEBREAK while the rest is copied 
> literally

Why is >From special cased?  Is the sole purpose of that block to prepend ">" to the line?  

Why not just do
  mCopyState->m_fileStream->write(">", 1);
?

> b) no check for a CRLF split over two reads

Actually, this is easy to fix.  Just tell PL_strnpbrk to stop short of the last character.
Status: REOPENED → ASSIGNED
(Assignee)

Comment 8

11 years ago
Created attachment 251429 [details] [diff] [review]
patch v2

This
1. uses PL_strnpbrk instead of PL_strnrprbrk
2. uses memmove instead of memcpy since the memory regions could (potentially) be overlapping)
3. doesn't leak memory on realloc OOM
4. doesn't special-case "From " lines so much, using Neil's (clever but evil) *--start = '>'
5. ensures there are 2 extra bytes in the buffer... the old code ensured 0 and used 1.

It doesn't handle CRLF that split over multiple calls.  As it is, the method doesn't know if it will be called again.  My only idea would be to assume that the buffer has the same type of line breaks everywhere.
Attachment #250302 - Attachment is obsolete: true
Attachment #251429 - Flags: superreview?(bienvenu)
Attachment #251429 - Flags: review?(bienvenu)
(Assignee)

Comment 9

11 years ago
Created attachment 251430 [details] [diff] [review]
patch v2.1

forgot to save before attaching
Attachment #251429 - Attachment is obsolete: true
Attachment #251430 - Flags: superreview?(bienvenu)
Attachment #251430 - Flags: review?(bienvenu)
Attachment #251429 - Flags: superreview?(bienvenu)
Attachment #251429 - Flags: review?(bienvenu)

Comment 10

11 years ago
Did the old code handle CRLF's split over multiple calls?  And what happens with the new code when CRLF's span blocks? Will we corrupt messages, and/or attachments not getting stripped?
(Assignee)

Comment 11

11 years ago
The old code also did not handle that case.  With both new and old code, the data written to m_fileStream is correct.  The problem is that 
  mCopyState->m_parseMsgState->ParseAFolderLine(start, lineLength);
receives the following:

text textCRLF
text textCRLF
text textCR
LF
text textCRLF

It looks like nsParseMailMessageState::ParseFolderLine might
1. Think the headers were finished if the break happened while still reading headers (and ignore the rest of the headers).
2. Think there were 1 too many lines in the body if the headers had already been read.

And for the 10MB attachment I'm looking at now, the attachment doesn't get deleted anyway.  :(  That's bug 350803.  I'd guess that if the attachment deleting code was actually working, it wouldn't be invoking this method to copy the attachment.

Comment 12

11 years ago
OK, thx, Andrew, I'll give this patch a try

Comment 13

11 years ago
Comment on attachment 251430 [details] [diff] [review]
patch v2.1

OK, I've convinced myself that the *--start = '>' thing is OK - but a couple comments in the code might help:-) Something to the effect that if the start of the block is "From ", we've allowed space at the beginning of mCopyState->m_dataBuffer, and if the "From " is in the middle of the block, we've already written out the previous line, so it's OK that we're overwriting the end of the previous line.
Attachment #251430 - Flags: superreview?(bienvenu)
Attachment #251430 - Flags: superreview+
Attachment #251430 - Flags: review?(bienvenu)
Attachment #251430 - Flags: review+
(Assignee)

Comment 14

11 years ago
OK.  landed with an additional comment and actually doing #5 from comment 8 (the comment was correct in the attached patch, but the code only reserved 1 byte).
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago11 years ago
Resolution: --- → FIXED

Comment 15

11 years ago
(In reply to comment #8)
>5. ensures there are 2 extra bytes in the buffer... the old code ensured 0 and used 1.
Well, it wrote it, but I'm not sure it ever read it again ;-)
[My guess is that it's useful for debugging]
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.