Open Bug 126896 Opened 23 years ago Updated 2 years ago

libmime plaintext converters: AssignWithConversion

Categories

(MailNews Core :: Internationalization, defect)

defect

Tracking

(Not tracked)

People

(Reporter: BenB, Unassigned)

Details

(Whiteboard: intl)

Attachments

(2 files)

in the TXT->HTML libmime converters, in the parse_line function, we do
   nsString lineSourceStr;
   lineSourceStr.AssignWithConversion(line, length);
but line is UTF8, not? Shouldn't that be
   nsString lineSourceStr(NS_ConvertUTF8ToUCS2(line, length));
or similar?
mailnews/mime/src/mimetp(la|fl).cpp is the source...
That is right. Does TXT->HTML currently handle characters above 255?
I think there is other bug filed for this code for a performance issue (because
of the nsString allocation). In future, I would like to change to pass
PRUnichar* to the function and delay the UTF-8 conversion after the TXT->HTML
conversion.
Status: NEW → ASSIGNED
> Does TXT->HTML currently handle characters above 255?

Yes, mozTXTToHTMLConv works with nsString / PRUnichar internally and is written
with unicode in mind. (In fact, it will probably break, if you do not use UCS2.)

> In future, I would like to change to pass PRUnichar* to the function and delay
> the UTF-8 conversion after the TXT->HTML conversion.

?
In parse_line, we get char* from libmime and that's what we have to output, too.
>In parse_line, we get char* from libmime and that's what we have to output, too.
What I meant was change that char* to PRUnichar* and convert to UTF-8 after the
TXT->HTML conversion.

Anyway, 
331 nsString lineSourceStr;
332 lineSourceStr.AssignWithConversion(line, length);

corresponds to

483 nsCAutoString lineResultCStr;
484 CopyUCS2toASCII(lineResultUnichar, lineResultCStr);

so both need the change.
Ignoring about the UTF-8 conversion, line 331 can change to use nsAutoString for
performance.
My testing result for MimeInlineTextPlain_parse_line() using Quantify shows that, 

nsString::nsString takes 9.06% of the function while
nsCAutoString::nsCAutoString 0.88%.






Done. Also fixed one unnecessary string copy.

nhotta, please review and assign to me (if you don't mind).

I tried some I18N testcases I got from Rich and I see no difference. All works
fine before (! - no idea why) and after the change.

What do we do with the following in the SaveAs codepath? Are the comments still
true?
 newcstr = ToNewCString(lineSource);
	  // lineSource uses nsString but the string is NOT unicode
[...}
 lineResult.AssignWithConversion(cstr.get());
	  // create nsString which contains NON unicode 
	  // as the following code expecting it
Keywords: mozilla1.0, review
>I tried some I18N testcases I got from Rich and I see no difference. All works
>fine before (! - no idea why) and after the change.
I think this is because the parser does not parse non ASCII characters.

I am going to check performance with the patch because that is important here.


The result of using two mails (bugzilla mail and the same body re-sent as flowed).

columns label:
num of calls, function time(microseconds), F+D time, F time(%), D time(%), Avg F
time, Min F time, Max F time 

1) without the patch
MimeInlineTextPlain_parse_line

13
5.06
5559.86
0.00
0.00
0.39
0.35
0.58


MimeInlineTextPlainFlowed_parse_line

20
9.77
15263.19
0.00
0.00
0.49
0.48
0.60



2) with the patch
MimeInlineTextPlain_parse_line

13
4.85
5333.32
0.00
0.00
0.37
0.33
0.56


MimeInlineTextPlainFlowed_parse_line

20
9.29
7681.36
0.00
0.00
0.46
0.45
0.57


With the patch, F+D time (function and its descendants time) decreased.

Cc to bienvenu since he was looking at the performance issue before.
>What do we do with the following in the SaveAs codepath? 
>Are the comments still true?
No, that part has to change because the input string in case of the SaveAs case
is not in UTF-8 but encoded in mail charset. So the patch would break non ASCII
charactgers for SaveAs.
We can remove that special code by doing either,
1) convert the input string from mail charset to UTF-8 in advance (by the
caller) for SaveAs
2) skip ScanTXT for SaveAs

Ben, could you check who is calling the parse functions for SaveAs?

Comment on attachment 70910 [details] [diff] [review]
Proposed Fix, Version 1

in mimeptfl.cpp, I don't think so it's a good idea to use an nsAutoString
as it's very likely that at least one of
the line of the message exceed 64 characters.  It's better to keep the actual
code which preset the buffer size to 100
characters
That's right. The i18n special case inflate/deflate for itself in mimeptfl.cpp,
so not affected by the patch.
Product: MailNews → Core
Product: Core → MailNews Core
QA Contact: ji → i18n
Assignee: nhottanscp → nobody
Status: ASSIGNED → NEW
Whiteboard: intl
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: