Closed
Bug 674488
Opened 13 years ago
Closed 13 years ago
libmime passes badly encoded char* into JS land as AUTF8Strings, results in complete garbage
Categories
(MailNews Core :: MIME, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird 8.0
People
(Reporter: protz, Assigned: protz)
References
Details
Attachments
(3 files, 2 obsolete files)
In mimemsg.cpp:156 we create a "munged subject", i.e. we record the subject of a message/rfc822 part in its MimeHeaders structure so that we can re-use it later to give a meaningful name to the attachment.
In mimemoz2.cpp:251, we use that information to assign a real_name to the nsMsgAttachmentData (e.g. if the attached message's subject was "Check out this lolcat", then the attachment name is "Check out this lolcat.eml"; smart, huh?).
In mimemoz2.cpp:752 we pass the real_name (which is a char*) as a AUTF8String into JS-land through XPConnect. This is the call to mimeEmitterStartAttachment. At this stage, doing printf("%s", tmp->real_name) yields (in my console):
N�o perca esta - confira a fidelidade dos mesmo antes de excluir - N�o � SPAM.eml
In jsmimeemitter.js:343, we obtain the real_name as the first parameter. At this stage, aName is the char* I've been talking about since the beginning.
let re = /[\u0000-\u0008\u000b-\u000c\u000e-\u001f]/g;
dump("\u001b[01;36mName : "+aName+"\u001b[00m\n\n");
dump("\u001b[01;36mName : "+aName.replace(re, "")+"\u001b[00m\n\n");
The first dump yields "N" only. Where it gets interesting is the second line displays:
N㵳灁汰ⵥ瑳汹ⵥ灳湡派牡敫楴杮嘠物污ⴠ朦㭴㰠⁁%40gmail.com@imap.googlemail.com/&Jw0- caml-list
Looks like someone is reading into some random memory location because the ASCII chars near the end of the string are the name of a folder that's not even the one the message is in.
I don't know if this is the regexp engine's fault, or XPConnect, or whatever. I guess on the Thunderbird side, we should just make sure munged_subject is properly encoded.
I'm CCing mrbkap because that does like something suspicious which shouldn't happen.
Assignee | ||
Comment 1•13 years ago
|
||
Assignee | ||
Comment 2•13 years ago
|
||
I'm trying to reproduce from a sample webpage, but I think the problem lies within XPConnect. Here's the result of
for (let i = 0; i < aName.length; ++i)
dump(aName.charCodeAt(i)+", ");
[78, 0, 15731, 28737, 27760, 11621, 29811, 27769, 11621, 28787,
28257, 27966, 29281, 25963, 26996, 26478, 22048, 29289, 27745, 11552, 26406,
15220, 15392, 8257, 10, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 58860, 0,
1200, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0];
Assignee | ||
Comment 3•13 years ago
|
||
David, here's a patch that I think should fix the issue. I'm being very conservative, and I'm refusing to trust the name if there's no charset provided. Given the effects that this entails, I'd tend to lean on the strict side. If there's a charset provided, I try to decode the munged_subject after we've parsed all the headers of the attached message. If the decoding fails, I'm also discarding the munged subject (this results in a "Forwarded Message.eml" attachment).
Let me know what you think about it!
Assignee: nobody → jonathan.protzenko
Status: NEW → ASSIGNED
Attachment #548825 -
Flags: review?(dbienvenu)
Assignee | ||
Comment 4•13 years ago
|
||
Sorry, uploaded the wrong file.
Attachment #548825 -
Attachment is obsolete: true
Attachment #548825 -
Flags: review?(dbienvenu)
Attachment #548827 -
Flags: review?(dbienvenu)
Comment 5•13 years ago
|
||
Comment on attachment 548827 [details] [diff] [review]
The right patch
Looks OK in general - code might be cleaner w.r.t. memory management if you could use nsCStrings and Adopt - it might be nice to do that, though libmime so far does not.
A unit test (mozmill or xpcshell test) would be helpful as well.
Assignee | ||
Comment 6•13 years ago
|
||
I'll write the test. Could you clarify what you have in mind concerning Adopt? I'm not sure where that would be helpful...
Comment 7•13 years ago
|
||
ns[C]String.Adopt assumes management of allocated memory, so you don't have to have explicit PR_Free's. So, this code:
+ char *charset;
+ char *contentType = MimeHeaders_get(msg->hdrs, HEADER_CONTENT_TYPE, PR_FALSE, PR_FALSE);
+ if (contentType) {
+ charset = MimeHeaders_get_parameter(contentType, "charset", nsnull, nsnull);
+ PR_Free(contentType);
+ }
+
+ // If we've got a charset, use nsMsgI18NConvertToUnicode to magically decode
+ // the munged subject.
+ if (charset) {
+ nsresult rv = nsMsgI18NConvertToUnicode(charset, orig, dest);
+ PR_Free(charset);
could be
nsCString charset;
nsCString contentType;
contentType.Adopt(MimeHeaders_get(msg->hdrs, HEADER_CONTENT_TYPE, PR_FALSE, PR_FALSE);
if (!contentType.IsEmpty())
charset.Adopt(MimeHeaders_get_parameter(contentType, "charset", nsnull, nsnull);
// If we've got a charset, use nsMsgI18NConvertToUnicode to magically decode
// the munged subject.
if (!charset.IsEmpty()) {
+ nsresult rv = nsMsgI18NConvertToUnicode(charset, orig, dest);
You could do something similar with munged_subject.
Comment 8•13 years ago
|
||
Comment on attachment 548827 [details] [diff] [review]
The right patch
clearing review request
Attachment #548827 -
Flags: review?(dbienvenu)
Assignee | ||
Comment 9•13 years ago
|
||
Here's an updated patch.
- I'm now using nsCStrings for strings that are local to the new chunk in mimemsg.cpp. Since I'm not managing munged_subject personally, I figured out it would be too much trouble to change all the uses to use the new string API.
- I also wrote a test. I believe it's correct, since dumping all the charCodes from the resulting JS strings returns garbage when the patch is not applied, and returns the correct charCodes when the patch is applied.
Ideally, people should use the MIME syntax for encoding subjects (e.g. =?windows-1252?q?S=E9minaire_des?= doctorants) but having an improperly encoded subject should not send us off the rails like this, so I still think it's worthwhile to give decoding a shot before discarding the subject for the attachment name.
Attachment #548827 -
Attachment is obsolete: true
Attachment #549349 -
Flags: review?(dbienvenu)
Updated•13 years ago
|
Attachment #549349 -
Flags: review?(dbienvenu) → review+
Assignee | ||
Comment 10•13 years ago
|
||
http://hg.mozilla.org/comm-central/rev/6e04b4cf944f
I'm still confused as to why, when printing out the string from js, it would go print some random parts of memory, but well...
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 8.0
You need to log in
before you can comment on or make changes to this bug.
Description
•