If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

libmime plaintext converters: AssignWithConversion

NEW
Unassigned

Status

MailNews Core
Internationalization
16 years ago
5 years ago

People

(Reporter: BenB, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: intl)

Attachments

(2 attachments)

(Reporter)

Description

16 years ago
in the TXT->HTML libmime converters, in the parse_line function, we do
   nsString lineSourceStr;
   lineSourceStr.AssignWithConversion(line, length);
but line is UTF8, not? Shouldn't that be
   nsString lineSourceStr(NS_ConvertUTF8ToUCS2(line, length));
or similar?
(Reporter)

Comment 1

16 years ago
mailnews/mime/src/mimetp(la|fl).cpp is the source...

Comment 2

16 years ago
That is right. Does TXT->HTML currently handle characters above 255?
I think there is other bug filed for this code for a performance issue (because
of the nsString allocation). In future, I would like to change to pass
PRUnichar* to the function and delay the UTF-8 conversion after the TXT->HTML
conversion.
Status: NEW → ASSIGNED
(Reporter)

Comment 3

16 years ago
> Does TXT->HTML currently handle characters above 255?

Yes, mozTXTToHTMLConv works with nsString / PRUnichar internally and is written
with unicode in mind. (In fact, it will probably break, if you do not use UCS2.)

> In future, I would like to change to pass PRUnichar* to the function and delay
> the UTF-8 conversion after the TXT->HTML conversion.

?
In parse_line, we get char* from libmime and that's what we have to output, too.

Comment 4

16 years ago
>In parse_line, we get char* from libmime and that's what we have to output, too.
What I meant was change that char* to PRUnichar* and convert to UTF-8 after the
TXT->HTML conversion.

Anyway, 
331 nsString lineSourceStr;
332 lineSourceStr.AssignWithConversion(line, length);

corresponds to

483 nsCAutoString lineResultCStr;
484 CopyUCS2toASCII(lineResultUnichar, lineResultCStr);

so both need the change.
Ignoring about the UTF-8 conversion, line 331 can change to use nsAutoString for
performance.
My testing result for MimeInlineTextPlain_parse_line() using Quantify shows that, 

nsString::nsString takes 9.06% of the function while
nsCAutoString::nsCAutoString 0.88%.






(Reporter)

Comment 5

16 years ago
Created attachment 70910 [details] [diff] [review]
Proposed Fix, Version 1

Done. Also fixed one unnecessary string copy.

nhotta, please review and assign to me (if you don't mind).

I tried some I18N testcases I got from Rich and I see no difference. All works
fine before (! - no idea why) and after the change.

What do we do with the following in the SaveAs codepath? Are the comments still
true?
 newcstr = ToNewCString(lineSource);
	  // lineSource uses nsString but the string is NOT unicode
[...}
 lineResult.AssignWithConversion(cstr.get());
	  // create nsString which contains NON unicode 
	  // as the following code expecting it
(Reporter)

Updated

16 years ago
Keywords: mozilla1.0, review

Comment 6

16 years ago
>I tried some I18N testcases I got from Rich and I see no difference. All works
>fine before (! - no idea why) and after the change.
I think this is because the parser does not parse non ASCII characters.

I am going to check performance with the patch because that is important here.


Comment 7

16 years ago
The result of using two mails (bugzilla mail and the same body re-sent as flowed).

columns label:
num of calls, function time(microseconds), F+D time, F time(%), D time(%), Avg F
time, Min F time, Max F time 

1) without the patch
MimeInlineTextPlain_parse_line

13
5.06
5559.86
0.00
0.00
0.39
0.35
0.58


MimeInlineTextPlainFlowed_parse_line

20
9.77
15263.19
0.00
0.00
0.49
0.48
0.60



2) with the patch
MimeInlineTextPlain_parse_line

13
4.85
5333.32
0.00
0.00
0.37
0.33
0.56


MimeInlineTextPlainFlowed_parse_line

20
9.29
7681.36
0.00
0.00
0.46
0.45
0.57


With the patch, F+D time (function and its descendants time) decreased.

Cc to bienvenu since he was looking at the performance issue before.

Comment 8

16 years ago
Created attachment 71367 [details]
Two plain text message used for the performance measurement.

Comment 9

16 years ago
>What do we do with the following in the SaveAs codepath? 
>Are the comments still true?
No, that part has to change because the input string in case of the SaveAs case
is not in UTF-8 but encoded in mail charset. So the patch would break non ASCII
charactgers for SaveAs.
We can remove that special code by doing either,
1) convert the input string from mail charset to UTF-8 in advance (by the
caller) for SaveAs
2) skip ScanTXT for SaveAs

Ben, could you check who is calling the parse functions for SaveAs?

Comment on attachment 70910 [details] [diff] [review]
Proposed Fix, Version 1

in mimeptfl.cpp, I don't think so it's a good idea to use an nsAutoString
as it's very likely that at least one of
the line of the message exceed 64 characters.  It's better to keep the actual
code which preset the buffer size to 100
characters

Comment 11

16 years ago
That's right. The i18n special case inflate/deflate for itself in mimeptfl.cpp,
so not affected by the patch.
Product: MailNews → Core
(Assignee)

Updated

9 years ago
Product: Core → MailNews Core
QA Contact: ji → i18n
Assignee: nhottanscp → nobody
Status: ASSIGNED → NEW
Whiteboard: intl
You need to log in before you can comment on or make changes to this bug.