Closed
Bug 241788
Opened 19 years ago
Closed 7 years ago
URL: link with href attribute javascript - new line - : is wrongly parsed
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla47
Tracking | Status | |
---|---|---|
firefox47 | --- | fixed |
People
(Reporter: martijn.martijn, Assigned: valentin)
References
()
Details
(Keywords: testcase, Whiteboard: [necko-backlog])
Attachments
(3 files)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a) Gecko/20040418 Firefox/0.8.0+ Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a) Gecko/20040418 Firefox/0.8.0+ Originally mentioned here: http://forums.mozillazine.org/viewtopic.php?t=71959&sid=0da1a41302e6fd6b4521f71e4eb83893 Probably related to bug 47078 (or depending on it). This piece of code: <a href="javascript :alert('test')" style="white-space:pre;">a href=" javascript:alert('test')"></a> (watch the new-line) When hovering over this link, you get to see: "javascript:///alert('test')" This is incorrect. When clicking on the link I get a javascript error: Error: unterminated regular expression literal Source File: javascript:///alert('test') Line: 1 Source Code: /alert('test') I will attach a testcase. Notice the difference in behavior between the different versions of Mozilla: Mozilla1.01: Gives 'file not found' Mozilla1.4.1 Displays alert box 'test' Mozilla1.5a and further gives the behavior I described above. Internet Explorer does the same as Mozilla1.4.1, displays alert box 'test'. I think the behavior from Mozilla1.01 is the most correct one. Well, whatever you decide what the correct behavior is, current behavior is certainly not correct :) Reproducible: Always Steps to Reproduce: 1. See testcase 2. 3. Actual Results: Gives js error Expected Results: file not found error
Reporter | ||
Comment 1•19 years ago
|
||
![]() |
||
Comment 2•19 years ago
|
||
This has nothing to do with the HTML parser -- this is the URI parser all the way.
Assignee: parser → darin
Component: HTML: Parser → Networking
QA Contact: benc
Comment 3•19 years ago
|
||
bz: well, it's certainly not the IO service that causes this: js> ios=Components.classes["@mozilla.org/network/io-service;1"].getService(); [xpconnect wrapped nsISupports] js> ios.nsIIOService.newURI("javascript\n:alert('foo');", null, null); uncaught exception: [Exception... "Component returned failure code: 0x804b000a [nsIIOService.newURI]" nsresult: "0x804b000a (<unknown>)" location: "JS frame :: typein :: <TOP_LEVEL> :: line 4" data: no] it refuses to create such a uri.
![]() |
||
Comment 4•19 years ago
|
||
> bz: well, it's certainly not the IO service that causes this: Wrong test, biesi. The non-URI char before the '\n' means that this URI has nothing that the IO service considers a scheme, and of course a relative URI with no base will fail. >js> ios=Components.classes["@mozilla.org/network/io-service;1"].getService(); >[xpconnect wrapped nsISupports] >js> ios.nsIIOService.newURI("javascript\n:alert('foo');", null, null); Try: ios = Components.classes["@mozilla.org/network/io-service;1"].getService(); base = ios.nsIIOService.newURI("http://mozilla.org/", null, null); uri = ios.nsIIOService.newURI("javascript\n:alert('foo');", null, base); which is what has to happen in a real document, of course. The real problem, as I see it, is that the ExtractScheme() thing that IOService does in NewURI doesn't find a scheme in that string, but nsStandardURL::Resolve() does (see code at http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsStandardURL.cpp#1512). Perhaps ExtractScheme() should do something similar?
![]() |
||
Comment 5•19 years ago
|
||
> The non-URI char before the '\n'
I meant "before the ':'", of course.
Comment 6•19 years ago
|
||
(In reply to comment #4) > Perhaps ExtractScheme() should do something similar? note that it would not match the expected results from comment 0 :) why is \r and \n stripped out?
![]() |
||
Comment 7•19 years ago
|
||
> note that it would not match the expected results from comment 0 :) I don't know where those expected results came from, exactly, but it'd make more sense to do what IE does, no? > why is \r and \n stripped out? See bug 102312. This is basically the same bug; the fix for that one doesn't work if the base and target URI have different protocols. The URI RFC has nothing to say on the matter past what it says at <http://www.gbiv.com/protocols/uri/rfc/rfc2396.html#rfc.section.E> and one could argue that that doesn't apply here, but realistically we want to be stripping out newlines in URI processing, imo.
Status: UNCONFIRMED → NEW
Ever confirmed: true
From the RFC: "In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may need to be added to break long URI across lines. The whitespace should be ignored when extracting the URI." Shouldn't ioService be removing the whitespace (spaces, new lines, tabs)?
Summary: link with href attribute javascript - new line - : is wrongly parsed → URL: link with href attribute javascript - new line - : is wrongly parsed
![]() |
||
Comment 9•19 years ago
|
||
Ben, that's talking about cases when we're extracting a URI from text or something. This is not quite the case here. Here, the contents of the src attribute are unambiguously a URI; this attribute value just happens to contain chars that are illegal in a URI, so we have to do some sort of error-handling. But I do agree that the sane error-handling is to strip out whitespace consistently.
Comment 10•19 years ago
|
||
(In reply to comment #7) > I don't know where those expected results came from, exactly, but it'd make more > sense to do what IE does, no? hm, yeah, I suppose. either way, we should be consistent, wrt how ExtractScheme and StandardURL handle schemes.
Updated•17 years ago
|
Assignee: darin → nobody
QA Contact: benc → networking
Comment 11•7 years ago
|
||
valentin, what should we do with this bug?
Flags: needinfo?(valentin.gosu)
Whiteboard: [necko-backlog]
Assignee | ||
Comment 12•7 years ago
|
||
This is still an issue - Chrome and Edge seem to be dealing with the last use case (stripping \n but not spaces) Keeping the need-info flag so I can take a shot at a patch.
Assignee | ||
Comment 13•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c12d35e57a8b
Assignee | ||
Comment 14•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a4f4b68895d
Assignee | ||
Comment 15•7 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/32147/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/32147/
Assignee | ||
Comment 16•7 years ago
|
||
* net_ExtractURLScheme now uses mozilla::Tokenizer * net_FilterURIString also filters characters in the scheme now * removed startPos and endPos parameters for net_FilterURIString and introduced net_IsAbsoluteURL Review commit: https://reviewboard.mozilla.org/r/32149/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/32149/
Assignee | ||
Updated•7 years ago
|
Attachment #8711391 -
Flags: review?(honzab.moz)
Assignee | ||
Updated•7 years ago
|
Attachment #8711392 -
Flags: review?(honzab.moz)
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu)
![]() |
||
Updated•7 years ago
|
Attachment #8711391 -
Flags: review?(honzab.moz) → review+
![]() |
||
Comment 17•7 years ago
|
||
Comment on attachment 8711391 [details] MozReview Request: Bug 241788 - mozilla::Tokenizer - token type for \n should whitespace if given in constructor r=honzab https://reviewboard.mozilla.org/r/32147/#review29395
![]() |
||
Updated•7 years ago
|
Attachment #8711392 -
Flags: review?(honzab.moz)
![]() |
||
Comment 18•7 years ago
|
||
Comment on attachment 8711392 [details] MozReview Request: Bug 241788 - net_FilterURIString should filter \r\n\t from the entire URL r=honzab https://reviewboard.mozilla.org/r/32149/#review29421 ::: netwerk/base/nsURLHelper.cpp:493 (Diff revision 1) > +bool static? ::: netwerk/base/nsURLHelper.cpp:521 (Diff revision 1) > - if (scheme) > + while (p.CheckChar(net_IsValidSchemeChar) || p.CheckWhite()) { so you now accept whitespaces AFTER a schema has started? means you accept URLs like: " \r\nht\r\n tp://example.com" I thought the bug was just about accepting leading white chars, OTOH, I'm not that deeply familiar with the latest URI specs. ::: netwerk/base/nsURLHelper.cpp:524 (Diff revision 1) > - else > + if (p.CheckChar(':')) { nit: to stick with "early return failure" pattern, rather do if (!p.CheckChar(':')) { return MARLF_URI; } ::: netwerk/base/nsURLHelper.cpp:526 (Diff revision 1) > + scheme.StripChars("\r\n\t"); tab is removed but space is not? odd. ::: netwerk/base/nsURLHelper.cpp:547 (Diff revision 1) > return false; seems like this method will return false while your new code will success. ::: netwerk/base/nsURLHelper.cpp:574 (Diff revision 1) > - break; > + p.SkipWhites(); we accept ws here? ::: netwerk/test/unit/test_standardurl.js:278 (Diff revision 1) > + var url = stringToURL(" \r\n\th\nt\rt\tp://ex\r\n\tample.com/path\r\n\t/\r\n\tto/fil\r\n\te.e\r\n\txt?que\r\n\try#ha\r\n\tsh \r\n\t "); we really want to support this? OK, so when I type something like this to a URL bar, the path portion gets escaped BEFORE we create the URL object of it? Because any spaces in the path/query will otherwise be removed with this.
Assignee | ||
Comment 19•7 years ago
|
||
https://reviewboard.mozilla.org/r/32149/#review29421 > so you now accept whitespaces AFTER a schema has started? means you accept URLs like: > > " \r\nht\r\n tp://example.com" > > I thought the bug was just about accepting leading white chars, OTOH, I'm not that deeply familiar with the latest URI specs. Actually no. A white space ' ' is only allowed at the begining or end. In between we only allow \n\r\t - but not ' '. This is what Blink does as well. In this context p.CheckWhite() only matches the characters I gave in the Tokenizer's constructor \r\n\t > tab is removed but space is not? odd. Yes! Space is not allowed inside the URL, only in leading or trailing characters. > seems like this method will return false while your new code will success. Right. The fact is that net_IsValidScheme("ht\ntp") should still return false. That's not a valid scheme. It's only that net_ExtractURLScheme will extract the valid scheme even though it has \r\t\n embedded in it, and net_FilterURIString will strip these characters from all of the URI. > we accept ws here? Yes. WS as in \r\n\t, but not ' ' > we really want to support this? OK, so when I type something like this to a URL bar, the path portion gets escaped BEFORE we create the URL object of it? Because any spaces in the path/query will otherwise be removed with this. That is still valid. Spaces inside the URL are not stripped so they get escaped eventually. I'll add a unit test for that case too, to make sure we don't mess it up.
Assignee | ||
Comment 20•7 years ago
|
||
Comment on attachment 8711391 [details] MozReview Request: Bug 241788 - mozilla::Tokenizer - token type for \n should whitespace if given in constructor r=honzab Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32147/diff/1-2/
Assignee | ||
Comment 21•7 years ago
|
||
Comment on attachment 8711392 [details] MozReview Request: Bug 241788 - net_FilterURIString should filter \r\n\t from the entire URL r=honzab Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32149/diff/1-2/
Attachment #8711392 -
Flags: review?(honzab.moz)
![]() |
||
Comment 22•7 years ago
|
||
Comment on attachment 8711392 [details] MozReview Request: Bug 241788 - net_FilterURIString should filter \r\n\t from the entire URL r=honzab https://reviewboard.mozilla.org/r/32149/#review30307 ::: netwerk/base/nsURLHelper.cpp:532 (Diff revisions 1 - 2) > - return NS_ERROR_MALFORMED_URI; > + return NS_OK; when MOZILLA_XPCOMRT_API is defined, this should return a failure?
Attachment #8711392 -
Flags: review?(honzab.moz) → review+
Assignee | ||
Comment 23•7 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #22) > > - return NS_ERROR_MALFORMED_URI; > > + return NS_OK; > when MOZILLA_XPCOMRT_API is defined, this should return a failure? Yeah, I guess that would be best.
Assignee | ||
Comment 24•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/5342641913be9b90f5309a42ae226a1d340f30c6 Bug 241788 - mozilla::Tokenizer - token type for \n should whitespace if given in constructor r=honzab https://hg.mozilla.org/integration/mozilla-inbound/rev/77f8588dad8c0349f88df7798a2e612ada487ffe Bug 241788 - net_FilterURIString should filter \r\n\t from the entire URL r=honzab
Assignee | ||
Comment 25•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/9072e402ee233d2f37835e2a5e9f5c4d1b9728bf Bug 241788 - Add missing run_next_test() to xpcshell-test a=testonly
Comment 26•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5342641913be https://hg.mozilla.org/mozilla-central/rev/77f8588dad8c https://hg.mozilla.org/mozilla-central/rev/9072e402ee23
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in
before you can comment on or make changes to this bug.
Description
•