insert linefeeds into messages so that saving as draft and sending will always succeed

VERIFIED FIXED in mozilla0.9.2

Status

MailNews Core
Backend
P1
normal
VERIFIED FIXED
17 years ago
10 years ago

People

(Reporter: (not reading, please use seth@sspitzer.org instead), Assigned: (not reading, please use seth@sspitzer.org instead))

Tracking

Trunk
mozilla0.9.2

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [nsbeta1+][PDT+])

Attachments

(6 attachments)

quoting sfraser:

"I think the content serializer should have a flag that enforces a line length 
(which, as seth should know, should be < 1000 octets so that NNTP works)."

with out this, long lines will cause saving as draft, copying to sent folder, 
posting emails and sending emails to fail.

I'll start investigating a fix.

Comment 1

17 years ago
putting into 0.9.2.  
Keywords: nsbeta1
Priority: -- → P1
Summary: content serializer should have a flag that enforces a line length → content serializer should have a flag that enforces a line length
Whiteboard: [nsbeta1+]
Target Milestone: --- → mozilla0.9.2
OS: Windows 2000 → All
QA Contact: esther → stephend
Hardware: PC → All

Comment 2

17 years ago
adding PDT+
Whiteboard: [nsbeta1+] → [nsbeta1+][PDT+]
spun off from bug #83381

investigating this now...
Status: NEW → ASSIGNED
after investigating, there are some issues that are making me lean towards 
putting the fix in the mailnews code and not changing the content serializer.

please comment.

in the mailnews code, after we've gotten the body from editor (by calling 
nsIEditorShell::GetContentsAs()], we tweak the body.

the tweaks I see are:

1) structure / link conversion.

this is everybody's favorite code that turns *foo* into <b>foo</b> and 
http://foo.com into <a href="http://foo.com">http://foo.com</a>

2) charset conversions

I'm not sure if the charset conversions can alter the body's length, but I know 
the first one will.  that's important because if I were to fix the html content 
serializer to have linebreaks every x bytes, the post processing we do could 
alter the body and break us.

here's what I propose to do:

I'll add some code that will check the "final" body, and insert linebreaks if 
necessary.  this code will only need to be called in two places in 
nsMsgCompose.cpp and nsMsgSend.cpp, after we've gotten and tweaked the body.

this approach is simpler (and less risky)  since I won't have to keep track of 
the number of chars since the linebreak in nsHTMLContentSerializer.  that gets 
a little hairy, for example, when the message has links.  we over cross into 
nsXMLContentSerializer::SerializeAttr().

the code I plan to add will scan the final body, making sure there is a 
linefeed every so many bytes.  in most cases, we should be ok and no action 
will be required.  when action is required, we'll pay the price with string 
copy and I'll insert the linebreak.

while I wait for feedback, I'll start working on a fix.

I think this fix will be low risk / high reward enough to take it for 0.9.2
I've attached a patch.

this patch will we'll insert CRLFs if necessary.  in the current test case 
(HTML reply to a message with a large <pre> block) we insert the CRLFs and I'm 
able to save (as draft) and send the message.

ducarroz, mscott please review.

I'm also looking for feedback about the decision to do this in mailnews, 
instead of in the HTML content serializer.

Comment 7

17 years ago
Yes, charset conversions alter the length. It also true for the HTML entities
like &nbsp;.

Comment 8

17 years ago
Comments:
1. This patch looks like it will always strdup the body (though I guess it was
   being copied before your patch). It's unforunate that this copy has to take
   place if no modifications are needed. If you have to insert CRLF, then
   you'll allocate and copy the entire string twice.

2. It looks like the old copy code did not assume a nul-terminated
   'attachment1 [details] [diff] [review]_body' string (since it passes the length), but your new PL_strdup
   call does.

3. +  // XXX TODO 
   +  // march backwards and determine the "best" place for the CRLF
   How important do you think it is to do this? Currently, your code
   may cause unexpected linebreaks.

4. Can you always assume that the string has CRLF breaks at this point?

5. In the
   +    if ((body[i] == nsCRT::CR) && (body[i+1] == nsCRT::LF)) {
   clause, don't you want to ++i to account for the LF?

6. +        body[i] = '\0';
   +        newBody.Append(body+lastPos);
   +        newBody.Append(temp);

   Rather than nuking chars in body, can't you just append with a length?
to answer point 1. The copy is needed because we don't have control on the life
of the original data! That's why we are keeping our own internal copy. Vive mime :-)

Simon is right about point 2, you cannot presume the buffer received would be
always null terminated. Please use a strndup instead.
Comments:
> 1. This patch looks like it will always strdup the body (though I guess it was
>  being copied before your patch). It's unforunate that this copy has to take
>  place if no modifications are needed. If you have to insert CRLF, then
>  you'll allocate and copy the entire string twice.

as ducarroz pointed out, we already have to do the copy (without rewriting more 
code to do less copies.)  yes, to insert a CRLF will cause another copy.

> 2. It looks like the old copy code did not assume a nul-terminated
>   'attachment1 [details] [diff] [review]_body' string (since it passes the length), but your new 
> PL_strdup  call does.

since it used memcpy(), I worried about that.  but I checked:

the length (attachment1 [details] [diff] [review]_body_length) is achieved with PL_strlen().

line 4263, nsMsgSend.cpp:

 attachment1 [details] [diff] [review]_body_length = PL_strlen(attachment1 [details] [diff] [review]_body);

and line 630, nsMsgCompose.cpp:

bodyLength = PL_strlen(bodyString);

> +  // XXX TODO 
> +  // march backwards and determine the "best" place for the CRLF
> How important do you think it is to do this? Currently, your code
> may cause unexpected linebreaks.

when this bug bites you, it is most likely going to be when quoting large <pre> 
blocks.  since we removed newlines from the original <pre> block, there is 
going to be some unexpected line breaks.  jfrancis logged a bug on karnaze to 
get the ball rolling.  I think he depends on layout fixing something before he 
can remove his "\n" -> <br> code (to work around cursor placement.)

I don't want to spend too many cycles on a work around to a work around.

I'd rather fix the send / save problems first and worry about the bad CRLF

> 4. Can you always assume that the string has CRLF breaks at this point?

good question.  I don't know for sure.  I better go check what happens on mac 
and linux.

> 5. In the if ((body[i] == nsCRT::CR) && (body[i+1] == nsCRT::LF)) {
> clause, don't you want to ++i to account for the LF?

I could, but it would be a trade:  one compare for one add.

after finding CRLF, we know that in the next iteration of the loop, body[i] 
will be LF.   

or am I missing something?


> 6. +        body[i] = '\0';
>    +        newBody.Append(body+lastPos);
>    +        newBody.Append(temp);
>
>   Rather than nuking chars in body, can't you just append with a length?

I didn't know there was a string API for that.  looks like there is, I'll 
rewrite to use it.

thanks for the feedback, sfraser.
the second patch is the same as the first, except I've addressed sfraser's 
issue #6.

I'm still investigating the CRLF (on all platforms) issue.

Comment 13

17 years ago
small nit: can we get rid of the nsCAutoString and make it a nsCString. Why
waste the extra copying. Messages bodies are most frequently going to be larger
than 29 bytes (or whatever the initial array allocation is these days for the
auto string).

Also, can't we pre-allocate the new body string based on the size of the
original body length? This will reduce the amount of re-allocations that have to
go on as we append to the body strings. nsString isn't very efficient in this
regards. It's just going to be adding 64 bytes or so at a time when it needs
more space. Then it reallocates again. So let's avoid all that.....



mscott:  good points, not small nits at all.

if the default size for nsCAutoString is that small, it doesn't pay to put it 
on the stack, just to copy it off to the heap.

your pre-allocate idea is slick.  I can do use something like:

original length + ((original length / limit between CRLFs) * 2)

which should give me how many bytes I'll need in the worst case.  (2 bytes for 
each CRLF)

I'll go look at the nsCString apis to see how to specify a size.

I'll also finish investigating if it is always CRLF (on all platforms)

new patch coming.
Created attachment 38372 [details] [diff] [review]
patch, v3.  (addresses mscott's points)
patch (version 3) addresses mscott's points:  use nsCString instead of
nsCAutoString, and use SetCapacity() to avoid needless copies and
re-allocations.

sfraser is right, I messed up on the CRLF issue.

at first, I thought I was ok looking at the comments in nsHTMLContentSerializer
and nsPlainTextSerializer:

// Set the line break character:
if ((mFlags & nsIDocumentEncoder::OutputCRLineBreak)
  && (mFlags & nsIDocumentEncoder::OutputLFLineBreak)) // Windows/mail
    mLineBreak.AssignWithConversion("\r\n");

but, it turns out we aren't doing that for mail.  (I'll fix the comments)

instead, we end up using the platform's line break.  this is correct because
later in the mailnews code, we'll turn the platform line break into CRLF when we
need to.

new fix coming up.  thanks for the catch, smfr.
I don't really see the point of doing:

+  // body will not have any null bytes, so we can use PL_strdup
+  char *body = PL_strdup(bodyStr);
+  if (!body) return NS_ERROR_OUT_OF_MEMORY;

at the entry of the function, why not doing the copy only at the end when we
detect that we can use the body as it (in the else part of the if statment)?
originally I had to do it at the beginning because I was inserting null bytes so
that I could use Append(str).  but now that I'm using Append(str,length) [as
smfr suggested] I can delay that allocation.  good catch, ducarroz.

I can also delay the setting of the nsCString's capacity until I know I'll need
it.

new patch coming.
there might be a way to save another string allocation / copy when I have to
insert linebreaks.  (currently I allocate once to get a nsCString and then again
when I strdup that string at the end.)  I'll work on removing that extra copy.

ok, I'm testing a patch that reduces the allocations to one (in all cases) and
copying down to a minimum.  (at worst, we only copy the string once.)

I'll attach it.

I've tested on linux, next up win32.
Created attachment 38377 [details] [diff] [review]
patch, v5.  (I forgot to update some comments that were specific to nsCString usage)
working good on win32, also.

I'll send mail asking for re-review from sfmr, ducarroz and mscott.

thanks for the excellent feedback.
R=ducarroz

Comment 25

17 years ago
sr=sfraser
updating summary.  this doesn't have to do with the html content serializer 
anymore.
Summary: content serializer should have a flag that enforces a line length → insert linefeeds into messages so that saving as draft and sending will always succeed

Comment 27

17 years ago
a= asa@mozilla.org for checkin to the trunk.
(on behalf of drivers)
Blocks: 83989
fixed.

I'll attach a test case for QA to use when verifying.
Created attachment 38649 [details]
message that exposes the problem.  create a local folder with this message, do html reply and hit save (or send) verify.
fixed.

stephend, let me know if you have questions about how to verify with that 
sample email.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED
Verified FIXED using builds:

Windows 2K - 2001-06-18-04
Mac OS 9.1 - 2001-06-18-08
RedHat 7.1 - 2001-06-18-08

When using the message that Seth attached (saved in the Local Folders as a .eml
that contains the HTML message), I can use Reply, Reply All, and Save (which I
verified by looking in my Drafts folder).
Status: RESOLVED → VERIFIED

Comment 32

17 years ago
*** Bug 77062 has been marked as a duplicate of this bug. ***
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.