Closed Bug 92721 Opened 23 years ago Closed 14 years ago

[quirks]FORM does not work if ACTION parameter is mis-quoted

Categories

(Core :: DOM: HTML Parser, defect)

x86
Windows 2000
defect
Not set
minor

Tracking

()

RESOLVED WONTFIX
Future

People

(Reporter: yancey, Unassigned)

References

()

Details

(Keywords: compat, testcase)

Attachments

(2 files)

Reproducible: Always

I'll set this as PC:Win2K,  but I know it exhibits the same behavior under 
Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:0.9.2+) Gecko/20010728. 
Probably should be All:All, but I haven't tried it on other systems yet. 


Technically, this is bad HTML. However, it works under both IE and Communicator.

Here is the offending tag...

<form method=post action=http://cgi.cascss.unt.edu:8000/cgi-bin/untdirectory">

This is a large email directory, so any name like smith should return results.
After further investigation, I realized that Mozilla's view source is not
showing the full text of the original HTML.  Here is the line in the bad tag in
the real source...

<FORM method=post "action=http://cgi.cascss.unt.edu:8000/cgi-bin/untdirectory">
As you can see, Mozilla does not show the quote in front of 'action'.

When you submit the form, mozilla attempts to load something and will wait
forever for a page that will never load.  I do not know what URL it sends.
I was finally able to get a look at the web server logs.  Mozilla is sending the
closing quotation mark to the webserver as part of the script name.  The error
logs report that they cannot find the untdirectory" script.
over to parser
Assignee: rods → harishd
Status: UNCONFIRMED → NEW
Component: Form Submission → Parser
Ever confirmed: true
QA Contact: vladimire → bsharma
Discussed this in IRC with Yancey, and we've found the problem (because the
quote is at the beginning of the name-value pair, the string after the equals
sign is sent to ConsumeAttributeValueText rather than ConsumeQuotedText). 
Apparently this works for both IE and NN; do we want to add it to Quirks mode?
Seems to be a duplicate of BUG 88243
This is not a dupe of bug 88243.
Status: NEW → ASSIGNED
The issue here seems to be that CAttributeToken::Consume does not expect a
random quoted string to be inside the start tag.  As an example, if you had this...

<body "title='demo page'">

From my short review of the code, it looks like the CAttributeToken::Consume
function looks for text to put into mTextKey, but stops at the first quote
instead of discovering the actual key name. It then just skips ahead until it
finds the equal sign, completely dropping the intended key name, possibly
leaving the name of the key as a null string!

It should probably call ConsumeQuotedString in this case when it finds the first
quote, but calls ConsmeAttributeValueText instead. It might decide to ignore the
quoted string, but should it at least realize that it is indeed a quoted
string?I intended to dump out the token for debugging, but ran out of time while
attempting to figure out how to do that.  If you can show me how to dump the
token to the console, I will be able to verify my suspicions.. and possibly
provide a patch.

I tried the following example and it works fine:

<html>
 <body "bgcolor='red'" text="yellow">
   <p> should be yellow on a red background </p>
 </body>
</html>
Removing the single quotes from that testcase results in yellow text on a 
*green* background!
That sounds like a different bug ( somewhere in layout..I guess ). Replace "red"
with #ff0000; and you should see the correct background.
I do see the correct background in this testcase:
<html>
 <body "bgcolor=#FF0000" text="yellow">
  <p> should be yellow on a red background </p>
 </body>
</html>
but I think that's because of the way we parse the hex color 
attributes.  "bgcolor=#FF0000"" will also create a red background.
I'm attaching an interesting little testcase.  It seems that what the parser is
doing when it encounters a malformed attribute like this is that it interprets
the leading quote as an attribute.  It then proceeds to put the rest of the
attribute into a *second* attribute, which has a trailing quote at the end.

The testcase contains a table with an onclick attribute setting the Javascript
and a malformed width attribute, "width=50%".  The onclick pops up an alert
containing the number of attributes for the table (using
this.attributes.length).  It currently returns 3, rather than 2.
Suggested stategy:

1) Remove the double-quote from theTerminalsChars.  (In fact, I'm not entirely
sure what the rationale is for some of the stuff defined in there. 
ReadWhitespace and SkipWhitespace both consume space, \b, and \t, so those are
unnecessary.  We need to keep \r and \n for View Source, but those will not
affect our parsing in any of the normal modes.) 

2) Create a new Boolean variable, call it advancedQuote (for the purposes of
this comment, I don't know what the naming conventions are here...)  Initialize
to false.

3) [The code described in this step should be enclosed in a quirks-mode-only
conditional.] At line 1641, call nsScanner::Peek to look at the next character.
 (The whitespace has been consumed, so this is the first non-whitespace
character in the attribute.)  If the character is a double quote or a single
quote, advance the scanner by one, so the quote gets dropped, and change
advancedQuote to true.

4) Add "|| advancedQuote" to the conditional on line 1671, so that such
attributes will have their value sent to ConsumeQuotedText rather than
ConsumeAttributeValue for stripping.

Thoughts?
I figured out how to turn on the debugging code and here's the result I get with
the following tag...

<FORM method=post "action=http://cgi.cascss.unt.edu:8000/cgi-bin/untdirectory">

[attr] method=post: 0
[attr] =: 0
[attr] action=http://cgi.cascss.unt.edu:8000/cgi-bin/untdirectory": 0

Should it create an attribute token with a null mTextKey and mTextValue?

Should there ever be a quote in mTextValue?
>Should it create an attribute token with a null mTextKey and mTextValue?
>Should there ever be a quote in mTextValue?

We're dealing with a malformed testcase here...and hence the unexpected behavior.

Chistopher: Do you think it's worth taking a risk to fix such malformedness?
Tell me how many sites are affected by this problem and if there are a bunch
then we should consider fixing it. IMO, just fix the content not the user agent.
harishd: beats me, I have no idea how common this malformation is.  I'm not 
terribly enthusiastic about seeing Quirks Mode expanded either, but part of the 
fix (removing the double-quote from theTerminalsChars) would also help us in 
standards mode (in standards mode, there shouldn't be an action attribute at 
all).  It's your component, you make the call.  As for risk, the main effect I 
can see of removing the double-quote is that it would break malformed key-value 
pairs of form key"value", but I don't know that anyone ever generates them.  
Having only examined the parser very recently, I'll have to defer to you on 
assessing the overall risk.

(Also, ignore my above comments on whitespace in theTerminalsChars; I plead 
temporary stupidity.)
Harish: I do not wish to waste more of your time.  I believe that I can create a
patch that will be of minimal risk and solve the issue, so I will continue to
work on this as I have time and will post my patch when I am satisfied that it
works correctly. Please leave this bug open until I can get my patch ready
(probably next week).

If everyone tests the patch and likes it, then we can decide whether it should
be submitted to the tree.  Please spend your time on more important bugs now.
Attached patch Proposed PatchSplinter Review
Christopher: Please try the patch to see if it solves your issue(s).
A patch from BUG 91437 has modified the way that nsHTMLTokens.cpp handles
whitespace. I will be reviewing this again to see how the updates affect
misquoted attributes.
I think the better approach to achieve this behavior is to include kQuote and
kApostrophe in the nsReadEndCondition of an unquoted attribute value. My patch
to bug 57724 does that. Then the quote will be ignored like the quote in front
of the key.
Ahh, but then do those quotes appear in view-source?  I feel that view-source
must be a 100% accurate representation of the original source sent from the web
server or else it is useless. Otherwise, how do we find our own mistakes in our
HTML or scripts?

I have found the w3c's docs to say that the value may be quoted and must be if
it contains spaces. I have not found any statement saying that quoting the key
or the entire key=value construct is illegal. Even if it is not valid XML,
quoting the entire construct appears to be allowable HTML according to the w3c
specs! If there is a document I missed, please post its location here.

By the way, I greatly appreciate all your work on optimizing the tokenizer.
> Ahh, but then do those quotes appear in view-source?

Not now (like the first quote), but with my changes in bug 57724 they will.

> I have found the w3c's docs to say that the value may be quoted and must be if
> it contains spaces. I have not found any statement saying that quoting the key
> or the entire key=value construct is illegal.

HTML inherits its syntax from SGML. AFAIK the SGML spec isn't available online,
but a summary of its syntax is at http://www.tiac.net/users/bingham/sgmlsyn/
(won't be very useful without further knowledge of SGML, though).

QA Contact: bsharma → moied
This bug has been marked "future" because the original netscape engineer working 
on this is over-burdened. If you feel this is an error, that you or another
known resource will be working on this bug,or if it blocks your work in some way 
-- please attach your concern to the bug for reconsideration -----
Target Milestone: --- → Future
Keywords: patch, testcase
Summary: FORM does not work if ACTION parameter is mis-quoted → [quirks]FORM does not work if ACTION parameter is mis-quoted
Keywords: compat
Assignee: harishd → nobody
Status: ASSIGNED → NEW
QA Contact: moied → parser
Tokenization is now HTML5-compliant and, on this point, matches Opera and almost matches IE8. Marking WONTFIX. If you disagree, please file a bug against the HTML5 spec in the W3C Bugzilla.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: