Last Comment Bug 665706 - semicolon in URI path handled incorrectly
: semicolon in URI path handled incorrectly
Status: RESOLVED FIXED
: addon-compat, dev-doc-complete
Product: Core
Classification: Components
Component: Networking (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla9
Assigned To: Julian Reschke
:
:
Mentors:
http://lists.w3.org/Archives/Public/p...
Depends on: 732567 682762 682845
Blocks: 713472
  Show dependency treegraph
 
Reported: 2011-06-20 14:20 PDT by Julian Reschke
Modified: 2014-01-17 06:50 PST (History)
7 users (show)
mounir: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
rough patch, work in progress (27.84 KB, patch)
2011-08-07 14:01 PDT, Julian Reschke
no flags Details | Diff | Splinter Review
remove special (historic) treatment of ";" in URIs (18.97 KB, patch)
2011-08-08 11:12 PDT, Julian Reschke
rjesup: review+
Details | Diff | Splinter Review

Description Julian Reschke 2011-06-20 14:20:47 PDT
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: 

Given a URI such as

  http://example.org/foo/bar;pathparam?query#frag

the URL decomposition DOM attributes return

  /foo/bar

as pathname. Should be

  /foo/bar;pathparam

See 

  http://lists.w3.org/Archives/Public/public-iri/2011Apr/0035.html

for Boris' explanation why this happens.

Reproducible: Always
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2011-06-21 08:40:15 PDT
This has nothing to do with the DOM.
Comment 2 Julian Reschke 2011-06-21 09:08:05 PDT
(In reply to comment #1)
> This has nothing to do with the DOM.

...but it does behave correctly network-wise...

That being said; any idea where the code is located causing this?
Comment 3 Julian Reschke 2011-06-21 11:05:48 PDT
OK, the relevant code is in nsURLParsers.cpp, it seems.

The clean approach would be to wipe out any special treatment of ";" even in the interfaces, but I'm not sure how disruptive this would be...
Comment 4 Mounir Lamouri (:mounir) 2011-06-29 06:27:06 PDT
(In reply to comment #3)
> OK, the relevant code is in nsURLParsers.cpp, it seems.
> 
> The clean approach would be to wipe out any special treatment of ";" even in
> the interfaces, but I'm not sure how disruptive this would be...

Do not hesitate to give it a try if your are able to write a patch that seems correct to you and attach it here for a review. Don't be scared of doing something wrong: the reviewer will catch it.
Comment 5 Julian Reschke 2011-08-07 14:01:39 PDT
Created attachment 551346 [details] [diff] [review]
rough patch, work in progress

This compiles, and seems to work.

TODO: escapes ";" in paths while it doesn't have to

TODO: update test cases
Comment 6 Julian Reschke 2011-08-08 11:12:41 PDT
Created attachment 551508 [details] [diff] [review]
remove  special (historic) treatment of ";" in URIs

- removes "param" component from URL interfaces

- updates nsEscape not to escape ";"

- fixes test case that assumed that http://example.com/; and http://example.com; should be the same

- fixes two unmaintained CPP based test cases to compile, but did not fix them otherwise (see related bug 677248)
Comment 7 Randell Jesup [:jesup] 2011-08-09 09:12:21 PDT
Comment on attachment 551508 [details] [diff] [review]
remove  special (historic) treatment of ";" in URIs

r=me on all but the http://example.com; issue

Reading RFC 3986, it doesn't appear valid to end an authority section with ';' - does the uri1.spec = "http://example.com;" fail?  (Of course, we may not force compliance with the BNF.)

Do we fail if you put other not-allowed characters in that position?  (Or rather, does it now do the same with ';' as it does with other incorrect authority characters?) If it is the same handling as other incorrect ones, then r=me on the entire thing.  If we're not handling it the same as other incorrect characters, then r-.  If we're handling incorrect characters in authority incorrectly but consistently, then please file a new bug on that and r=me on this.

I'll mark it r+ for now
Comment 8 Masatoshi Kimura [:emk] 2011-08-09 09:25:08 PDT
> authority   = [ userinfo "@" ] host [ ":" port ]
> host        = IP-literal / IPv4address / reg-name
> reg-name    = *( unreserved / pct-encoded / sub-delims )
> sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
>             / "*" / "+" / "," / ";" / "="
";" looks to be allowed for the authority part.
Comment 9 Julian Reschke 2011-08-09 09:46:18 PDT
(In reply to Randell Jesup [:jesup] from comment #7)
> Reading RFC 3986, it doesn't appear valid to end an authority section with
> ';' - does the uri1.spec = "http://example.com;" fail?  (Of course, we may
> not force compliance with the BNF.)

Yes, ";" is allowed in the authority component.

But the primary reason why this change should be ok is that ";" has no special meaning in the URI; it doesn't affect the way it's split into components. Thus, it's handled just like "a", for instance.

This makes it different from the other things that were tested in that test class; those were about characters that are indeed special, such as "?" and "#".
 
> Do we fail if you put other not-allowed characters in that position?  (Or
> rather, does it now do the same with ';' as it does with other incorrect
> authority characters?) If it is the same handling as other incorrect ones,
> then r=me on the entire thing.  If we're not handling it the same as other
> incorrect characters, then r-.  If we're handling incorrect characters in
> authority incorrectly but consistently, then please file a new bug on that
> and r=me on this.
> 
> I'll mark it r+ for now

As stated above, it *is* allowed. That being said, for the purpose of URI parsing it is handled exactly like something that is not allowed, such as "{"; it's just not special, and thus ends up in the component it appears in. Does this clarify?
Comment 10 Mounir Lamouri (:mounir) 2011-08-23 01:26:02 PDT
Pushed:
http://hg.mozilla.org/mozilla-central/rev/1a09781a5480
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2011-08-29 08:32:37 PDT
Why was this checked in without super-review?  It's got API changes!

I filed bug 682845 on the obvious sr-level issues a cursory glance shows.
Comment 12 Boris Zbarsky [:bz] (still a bit busy) 2011-08-29 08:37:43 PDT
We probably need to document the API changes here, at least.
Comment 13 Eric Shepherd [:sheppy] 2011-11-08 08:20:52 PST
Documentation updated:

https://developer.mozilla.org/en/nsIURL
https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsIURLParser

Also mentioned on Firefox 9 for developers and Updating add-ons for Firefox 9.
Comment 14 Daniel Stenberg [:bagder] 2011-12-25 11:22:08 PST
Ok, isn't this so that with this change now Firefox accepts this URI?

"www.example.com;foo=bar"

Is that really intended? It seems to me that Firefox is now rather lonely in doing this interpretation and I think it hurts us all to have this different opinion on how to parse URIs...
Comment 15 Julian Reschke 2011-12-25 13:09:43 PST
(In reply to Daniel Stenberg from comment #14)
> Ok, isn't this so that with this change now Firefox accepts this URI?
> 
> "www.example.com;foo=bar"

That's not the intent, and I believe it accepted those before; it just treated them differently.
 
> Is that really intended? It seems to me that Firefox is now rather lonely in
> doing this interpretation and I think it hurts us all to have this different
> opinion on how to parse URIs...

If you think that ";" in authorities should get special treatment, you may be right; maybe open a separate bug?
Comment 16 Daniel Stenberg [:bagder] 2011-12-25 14:44:00 PST
Thanks, I've submitted #713472 as a new bug for this.

Note You need to log in before you can comment on or make changes to this bug.