Last Comment Bug 362512 - apply_rfc2047_encoding has issues with really long headers
: apply_rfc2047_encoding has issues with really long headers
: fixed1.8.0.9, fixed1.8.1.1
Product: MailNews Core
Classification: Components
Component: MIME (show other bugs)
: Trunk
: x86 Windows XP
-- normal (vote)
: ---
Assigned To: David :Bienvenu
Depends on:
  Show dependency treegraph
Reported: 2006-12-01 10:30 PST by David :Bienvenu
Modified: 2008-07-31 04:30 PDT (History)
3 users (show)
dveditz: blocking1.8.1.1+
dveditz: blocking1.8.0.9+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---

proposed fix (1.44 KB, patch)
2006-12-01 10:47 PST, David :Bienvenu
mscott: superreview+
dveditz: approval1.8.0.9+
Details | Diff | Splinter Review

Description User image David :Bienvenu 2006-12-01 10:30:14 PST
When generating rfc2047 encoded headers, we're not calculating the size of the buffer we'll need correctly, which produces the predictable nasty issues.
Comment 1 User image David :Bienvenu 2006-12-01 10:47:01 PST
Created attachment 247198 [details] [diff] [review]
proposed fix

This fixes the calculation to account for the multiple lines we'll generate. Given a utf8 decoded header, we'll generate something like this:

Subject: Re:

So I basically count up the per-line overhead, and the max chars per line, and calculate how big our buffer has to be. I think this calculation is a lot more conservative than the old one, but I'll have to run with it a bit to make sure.

The code downstream that crunches the passed in buffer also needs to be fixed.
Comment 2 User image Scott MacGregor 2006-12-01 10:54:20 PST
Comment on attachment 247198 [details] [diff] [review]
proposed fix

2 libmime bug patches in a week David, lucky you!
Comment 3 User image David :Bienvenu 2006-12-01 11:26:58 PST
fixed on trunk and branch.
Comment 4 User image David :Bienvenu 2006-12-01 13:38:31 PST
Comment on attachment 247198 [details] [diff] [review]
proposed fix

nominating for - might be useful to get users who get a lot of rfc 2047 encoded headers (e.g., mozilla Japan) to try this out.
Comment 5 User image David :Bienvenu 2006-12-01 13:58:32 PST
Masayuki-san and Kohei-san, I was wondering if you could test this fix on folders that contain a lot of rfc-2047 encoded headers. The particular test case I was fixing had to do with a message with a Re: subject which was rfc 2047 encoded - when parsing the headers for the db, we were crashing because the header was 6 lines long. I changed the logic for calculating the length, but I don't have a lot of test cases - I thought you might have more examples you could test this against, by deleting the .msf files for some folders and having them get regenerated.

I've checked this fix into the trunk and 1.8.1 branch, so tomorrow's builds should have the change. We're thinking of taking this into, which comes out soon.

thx for any help you can give!

- David
Comment 6 User image Daniel Veditz [:dveditz] 2006-12-01 14:25:35 PST
Please file a bug to investigate/fix the downstream code you mention in comment 1
Comment 7 User image David :Bienvenu 2006-12-01 14:30:43 PST
Bug 362542 filed to do follow-up investigation
Comment 8 User image Daniel Veditz [:dveditz] 2006-12-01 14:32:09 PST
CC'ing Georgi: he found the other recent MIME allocation error and this might give him more ideas of places to look -- happy hunting!
Comment 9 User image Daniel Veditz [:dveditz] 2006-12-01 14:38:16 PST
Comment on attachment 247198 [details] [diff] [review]
proposed fix

approved for 1.8.0 branch, a=dveditz for drivers
Comment 10 User image georgi - hopefully not receiving bugspam 2006-12-04 01:28:22 PST
this code seems complex.
PRInt32 perLineOverhead = strlen(charset) + 10;
if (perLineOverhead > foldlen) ... return;
PRInt32 charsPerLine = (foldlen - perLineOverhead) / 4;
[1] RInt32 maxNumLines = (strlen(src) / charsPerLine) + 1;

is |charsPerLine| guaranteed to be nonzero ( |foldlen == perLineOverhead|) causing division by 0 in [1] ?
Comment 11 User image David :Bienvenu 2006-12-04 08:16:35 PST
I tried to make the code clearer by making the calculations explicit. Does that make it seem more complicated? Or is it the calculation itself that seems complicated?

by default, foldlen  72, and the max charset string len is supposed to be 60, so in general, foldlen will be greater than the per-line overhead. But I don't know if there's any kind of enforcement of the max charset string (i.e., can a malicious message define an arbitrarily long charset for a header, and end up in this code?) I don't know enough about this code to know if that's possible.
Comment 12 User image Masayuki Nakano [:masayuki] 2006-12-06 22:58:19 PST

Thank you for the CCing, and sorry for the delay.
But I don't understand well. What should we test?
Comment 13 User image David :Bienvenu 2006-12-07 07:15:07 PST
Masayuki-san, the main things to test are generation of rfc 2047 headers when the header itself is very long, e.g., a very long subject. It turns out we also run through this code when you read a message with a re: subject, and it's rfc 2047 encoded.

Note You need to log in before you can comment on or make changes to this bug.