Closed Bug 19954 Opened 20 years ago Closed 20 years ago

[4.xP][PP][dogfood]{html40}submission of newlines in textareas

Categories

(Core :: DOM: Core & HTML, defect, P3)

x86
Linux
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: dbaron, Assigned: sfraser_bugs)

References

Details

(Whiteboard: [PDT+][12/10])

DESCRIPTION:  Newlines in textareas should be submitted as CR-LF (i.e.,
%0D%0A).  This is specified in section 17.13.4 of HTML 4.0 and is the behavior
of NN 4.x (on Linux, anyway).

Currently they are being sent as LF (%0A) on Linux, and I believe (based on
sfraser's comments in bug 16813) they are being sent as CR (%0D) on Mac.  (I'd
guess they're correct on Windows, but that should be checked.)

STEPS TO REPRODUCE:
 * load http://bugzilla.mozilla.org/enter_bug.cgi
 * Type the following into the textarea:
ZZZZZZZ
MMMMMMM
 * click "Remember values as bookmarkable template"
 * examine the URL that is presented to be bookmarked

ACTUAL RESULTS:
 * the text is encoded as ZZZZZZZ%0AMMMMMMM (on Linux), and something else on
other platforms...

EXPECTED RESULTS:
 * it should be encoded as ZZZZZZZ%0D%0AMMMMMMM

DOES NOT WORK CORRECTLY ON:
 * Linux, mozilla, 1999-11-22-08-M12

WORKS CORRECTLY ON:
 * Linux NN 4.61

ADDITIONAL INFORMATION:
This should be tested with both GET and POST form submission, and for both
content types described in HTML 4.0.  See HTML 4.0:
http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13.4
Assignee: karnaze → pollmann
Eric, reassign to Steve if appropriate.
Actually, I think this should apply anytime there are newlines in anything to be
submitted.  This could happen in lots of ways.  It could be caused by entities
in HTML (CR, LF, or CR-LF) or through JS.  It could occur in textareas, hidden
inputs, control values for radio/checkbox inputs, etc.

So it should probably be handled at the level of form submission...
cc self.
I think the solution here is to add a flag to nsIDocumentEncoder that says 'I
want HTML spec line breaks', and use that in
nsGfxTextControlFrame::GetTextControlFrameState() when we get the value.
Assignee: pollmann → sfraser
I'm fixing this.
Status: NEW → ASSIGNED
Target Milestone: M12
setting to M12 and setting assigned
Target Milestone: M12
I believe that we should be doing linefeed conversions at the same time we handle

conversion to Unicode, which happens in CNavDTD::CollectSkippedContent(). That

way, we are converting line breaks before stuff gets inserted into teh content

model, which sounds right.



rickg, is this good? Here's the diff:



Index: CNavDTD.cpp

===================================================================

RCS file: /cvsroot/mozilla/htmlparser/src/CNavDTD.cpp,v

retrieving revision 3.241

diff -u -2 -r3.241 CNavDTD.cpp

--- CNavDTD.cpp	1999/11/23 03:07:07	3.241

+++ CNavDTD.cpp	1999/11/24 00:06:45

@@ -1919,4 +1919,5 @@

   int aIndex=0;

   int aMax=mSkippedContent.GetSize();

+  PRBool aMustConvertLinebreaks = PR_FALSE;



   for(aIndex=0;aIndex<aMax;aIndex++){

@@ -1932,4 +1933,8 @@

       if((eHTMLTag_textarea==theNodeTag) && (eToken_entity==theTokenType)) {

         ((CEntityToken*)theNextToken)->TranslateToUnicodeStr(mScratch);

+        // since this is an entity, we know that it's only one character.

+        // check to see if it's a CR, in which case we'll need to do line

+        // termination conversion at the end.

+        aMustConvertLinebreaks |= (mScratch[0] == kCR);

       }

       else theNextToken->GetSource(mScratch);

@@ -1939,4 +1944,17 @@

     mTokenRecycler->RecycleToken(theNextToken);

   }

+

+  // if the string contained CRs (hence is either CR, or CRLF terminated)

+  // we need to convert line breaks

+  if (aMustConvertLinebreaks)

+  {

+    PRInt32  offset;

+    while ((offset = aNode.mSkippedContent.Find("\r\n")) != kNotFound)

+      aNode.mSkippedContent.Cut(offset, 1);		// remove the CR

+

+    // now replace remaining CRs with LFs

+    aNode.mSkippedContent.ReplaceChar("\r", kNewLine);

+  }

+

   // Let's hope that this does not hamper the  PERFORMANCE!!

   mLineNumber += aNode.mSkippedContent.CountChar(kNewLine);
Ack! Here's a non-doublespaced version of the patch.

Index: CNavDTD.cpp
===================================================================
RCS file: /cvsroot/mozilla/htmlparser/src/CNavDTD.cpp,v
retrieving revision 3.241
diff -u -2 -r3.241 CNavDTD.cpp
--- CNavDTD.cpp	1999/11/23 03:07:07	3.241
+++ CNavDTD.cpp	1999/11/24 00:09:04
@@ -1919,4 +1919,5 @@
   int aIndex=0;
   int aMax=mSkippedContent.GetSize();
+  PRBool aMustConvertLinebreaks = PR_FALSE;

   for(aIndex=0;aIndex<aMax;aIndex++){
@@ -1932,4 +1933,8 @@
       if((eHTMLTag_textarea==theNodeTag) && (eToken_entity==theTokenType)) {
         ((CEntityToken*)theNextToken)->TranslateToUnicodeStr(mScratch);
+        // since this is an entity, we know that it's only one character.
+        // check to see if it's a CR, in which case we'll need to do line
+        // termination conversion at the end.
+        aMustConvertLinebreaks |= (mScratch[0] == kCR);
       }
       else theNextToken->GetSource(mScratch);
@@ -1939,4 +1944,17 @@
     mTokenRecycler->RecycleToken(theNextToken);
   }
+
+  // if the string contained CRs (hence is either CR, or CRLF terminated)
+  // we need to convert line breaks
+  if (aMustConvertLinebreaks)
+  {
+    PRInt32  offset;
+    while ((offset = aNode.mSkippedContent.Find("\r\n")) != kNotFound)
+      aNode.mSkippedContent.Cut(offset, 1);		// remove the CR
+
+    // now replace remaining CRs with LFs
+    aNode.mSkippedContent.ReplaceChar("\r", kNewLine);
+  }
+
   // Let's hope that this does not hamper the  PERFORMANCE!!
   mLineNumber += aNode.mSkippedContent.CountChar(kNewLine);
Looking for a review on that patch. Parser people?
Adding Harish, as he's the parser people.  Unfortunately, he's out on vacation
for a while.  I don't know dates.
Waiting for resolution on this and 16813 which are related; waiting for parser
people to return from vacation. Will address on 11/29
Target Milestone: M12
Summary: [4.xP][PP]{html40}submission of newlines in textareas → [4.xP][PP][dogfood]{html40}submission of newlines in textareas
M12
Whiteboard: [PDT+]
Putting on PDT+ radar.
OK, ignore that patch above: it's a fix for 16813, and not for this bug.
I'm not sure how to fix this one. The alternatives are:
1. In nsGfxTextControlFrame::GetTextControlFrameState(), editor->OutputToString()
   with a flag that says to output text with CRLF breaks. This means that any
   time someone gets the value of a textarea (e.g. from JS) they'll get CRLF
   termination. That might be bad for JS.

2. Do the conversion in nsFormFrame::URLEncode(). We'd have to convert from
   platform line breaks back to CRLF, so this would be less efficient.

Thoughts?
Because of my comment on this bug at 11/23/99 13:23, I think I'd prefer (2), if
I understand them correctly.
If we do add the new output flag, it sounds like we should also change the
nsGfxTextControlFrame's API so that nsFormFrame can pass a flag down into it
telling it whether to use CRLF mode or not.

This is a little ugly, but no more so than having to add the flag in the first
place.
I agree with David.  Solution 2) sounds better than solution 1).  I think that
the logic that will have to be added to URLEncode() will be just as efficient as
what would have to added to GetTextControlFrameState().  However, if you do
decide to make the change to URLEncode(), please check to see if you need to
make it for multipart forms as well (nsFormFrame::ProcessAsMultipart()).

A testcase for Multipart might look like:
<HTML>
 <BODY>
  <FORM ENCTYPE="multipart/form-data" ACTION="http://pollmann.net/echo.cgi">
   <TEXTAREA NAME="foo">Test
Case</TEXTAREA>
  </FORM>
 </BODY>
</HTML>

The reason you'll need to test this separately is that ProcessAsMultipart
doesn't call URLEncode on the form values.  I'm not sure what Nav and IE do with
multipart forms regarding CR/LF.
I have fixed for line termination conversion in nsFormFrame, but there is still a
problem when you go Back to a form page; the FrameManger's state for that frame
holds a string that also needs to go through line termination conversion. My
first solution above would fix that case too; the second solution requires more
line termination conversion code in nsTextControlFrame::SaveState() and maybe
RestoreState(). So I'm tending towards that solution.
Sorry, "that solution" == the first one (always get CRLF data from textareas).
Whiteboard: [PDT+] → [PDT+][12/3]
Still awaiting input on some issues, from pollman and a DOM person, in
particular, what line termination would be expected when you get the value of a
textarea in JS.
Blocks: 12658
linking to 12658, composer pdt+ bug tracking
I tested the linebreaks gotten from textareas in Nav 4.x, and IE 5.0 on various
platforms. In all cases, platform-native line breaks are being used.
Tested in what?  JS?  Form submission?  (It's certainly not true in NN 4.61
Linux for form submission.)

If the answer is JS, then I think that's further reason to fix this bug at the
level of form submission encoding.
Sorry, yes, at the level of getting textarea.value from JS. I am working on a
solution to fix this bug at the form submission level.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Checked in fixes, in nsFormFrame.cpp.

This will need to be verified on all 3 platforms. QA should verify that CRLF data
is being sent to the server in each case.
QA Contact update.
Status: RESOLVED → REOPENED
Okay, the good news is it works on:
	- WinNT 1999120516 mozilla &
	- MacOS86 1999120508 netscape.

The bad news is it still doesn't work on Linux6 1999120608 mozilla.

Reopening bug.
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Whiteboard: [PDT+][12/3] → [PDT+][12/10]
ckritzer: When you say 'it still doesn't work', what problem do you observe?
Oh yeah, details...those would be good...

Specificly, there is an extra line feed after every carriage return in a
bugzilla form submission on Linux.

If you type:

AAAAAAA
BBBBBBB
CCCCCCC

and submit the bugzilla form in 1999120708 Linux6 mozilla, here's what it will
look like:

AAAAAAA

BBBBBBB

CCCCCCC

Does that make sense?
This is
a test.
Chris: I just tried submitting a few lines with a return
(like this) between them to this bug using today's Linux build, and they seem to
have submitted correctly.
(another return) Could you please explain in painstaking detail exactly what
you're doing, and what the result is?
AFAICT, this bug is fixed in Linux mozilla 1999-12-06-13-M12.  Bug 16813 may not
be.
Bug 16813 is also mostly fixed.  I just filed bug 21121 for the remaining
issues.  I filed it using a template with proper CR-LF encoding (bug 16813), and
it worked fine.  I also created a bug template, and that gave the proper
encoding in the URL (this bug).

However, there are two seemingly random newlines in bug 21121 that I did not
type.  They need to be investigated.  Perhaps I should open another bug...
Based on dbaron's commens, marking fixed. The 'extra newline' problem is known
(the output system is doing extra wrapping when it should not). It is covered by
but 20603.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago20 years ago
Resolution: --- → FIXED
Marking fixed, because sfraser said he would, but didn't.
Dang! I'm *still* seeing this (with the 1999120715 build)... Akkana, Dave,
Simon, what version of linux are you running?
Do you install over previous builds, or use a new directory?
Do you toss out mozilla registry & profiles, or keep them build to build?
My answers to the above are Linux6 (gnome) on an HP Vectra XA, new directory,
and keep them build to build...
**What** are you still seeing?  I think that's the key question here.

(I install builds in new directories, but keep my ~/.mozilla/ directory.  I use
Linux 2.2.5 (which comes with RedHat 6.0) and gnome 1.0.53.)
Status: RESOLVED → VERIFIED
Marking VERIFIED FIXED.

Tested on Linux6 1999120808 mozilla.

Also verified (and behaves correctly) on:
	- Win98 1999120808 mozilla
	- MacOS86 1999120808 commercial
Blocks: 21564
No longer blocks: 21564
Component: HTML: Form Submission → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.