Note: There are a few cases of duplicates in user autocompletion which are being worked on.

semicolon in URI path handled incorrectly

RESOLVED FIXED in mozilla9

Status

()

Core
Networking
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: Julian Reschke, Assigned: Julian Reschke)

Tracking

(Depends on: 1 bug, {addon-compat, dev-doc-complete})

Trunk
mozilla9
addon-compat, dev-doc-complete
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

6 years ago
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: 

Given a URI such as

  http://example.org/foo/bar;pathparam?query#frag

the URL decomposition DOM attributes return

  /foo/bar

as pathname. Should be

  /foo/bar;pathparam

See 

  http://lists.w3.org/Archives/Public/public-iri/2011Apr/0035.html

for Boris' explanation why this happens.

Reproducible: Always

Comment 1

6 years ago
This has nothing to do with the DOM.
Status: UNCONFIRMED → NEW
Component: DOM → Networking
Ever confirmed: true
QA Contact: general → networking
(Assignee)

Comment 2

6 years ago
(In reply to comment #1)
> This has nothing to do with the DOM.

...but it does behave correctly network-wise...

That being said; any idea where the code is located causing this?
(Assignee)

Comment 3

6 years ago
OK, the relevant code is in nsURLParsers.cpp, it seems.

The clean approach would be to wipe out any special treatment of ";" even in the interfaces, but I'm not sure how disruptive this would be...
(In reply to comment #3)
> OK, the relevant code is in nsURLParsers.cpp, it seems.
> 
> The clean approach would be to wipe out any special treatment of ";" even in
> the interfaces, but I'm not sure how disruptive this would be...

Do not hesitate to give it a try if your are able to write a patch that seems correct to you and attach it here for a review. Don't be scared of doing something wrong: the reviewer will catch it.
OS: Windows 7 → All
Hardware: x86 → All
Version: unspecified → Trunk
(Assignee)

Comment 5

6 years ago
Created attachment 551346 [details] [diff] [review]
rough patch, work in progress

This compiles, and seems to work.

TODO: escapes ";" in paths while it doesn't have to

TODO: update test cases
Assignee: nobody → julian.reschke
(Assignee)

Comment 6

6 years ago
Created attachment 551508 [details] [diff] [review]
remove  special (historic) treatment of ";" in URIs

- removes "param" component from URL interfaces

- updates nsEscape not to escape ";"

- fixes test case that assumed that http://example.com/; and http://example.com; should be the same

- fixes two unmaintained CPP based test cases to compile, but did not fix them otherwise (see related bug 677248)
Attachment #551346 - Attachment is obsolete: true
Attachment #551508 - Flags: review?(rjesup)
(Assignee)

Updated

6 years ago
Target Milestone: --- → mozilla9
Comment on attachment 551508 [details] [diff] [review]
remove  special (historic) treatment of ";" in URIs

r=me on all but the http://example.com; issue

Reading RFC 3986, it doesn't appear valid to end an authority section with ';' - does the uri1.spec = "http://example.com;" fail?  (Of course, we may not force compliance with the BNF.)

Do we fail if you put other not-allowed characters in that position?  (Or rather, does it now do the same with ';' as it does with other incorrect authority characters?) If it is the same handling as other incorrect ones, then r=me on the entire thing.  If we're not handling it the same as other incorrect characters, then r-.  If we're handling incorrect characters in authority incorrectly but consistently, then please file a new bug on that and r=me on this.

I'll mark it r+ for now
Attachment #551508 - Flags: review?(rjesup) → review+
> authority   = [ userinfo "@" ] host [ ":" port ]
> host        = IP-literal / IPv4address / reg-name
> reg-name    = *( unreserved / pct-encoded / sub-delims )
> sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
>             / "*" / "+" / "," / ";" / "="
";" looks to be allowed for the authority part.
(Assignee)

Comment 9

6 years ago
(In reply to Randell Jesup [:jesup] from comment #7)
> Reading RFC 3986, it doesn't appear valid to end an authority section with
> ';' - does the uri1.spec = "http://example.com;" fail?  (Of course, we may
> not force compliance with the BNF.)

Yes, ";" is allowed in the authority component.

But the primary reason why this change should be ok is that ";" has no special meaning in the URI; it doesn't affect the way it's split into components. Thus, it's handled just like "a", for instance.

This makes it different from the other things that were tested in that test class; those were about characters that are indeed special, such as "?" and "#".
 
> Do we fail if you put other not-allowed characters in that position?  (Or
> rather, does it now do the same with ';' as it does with other incorrect
> authority characters?) If it is the same handling as other incorrect ones,
> then r=me on the entire thing.  If we're not handling it the same as other
> incorrect characters, then r-.  If we're handling incorrect characters in
> authority incorrectly but consistently, then please file a new bug on that
> and r=me on this.
> 
> I'll mark it r+ for now

As stated above, it *is* allowed. That being said, for the purpose of URI parsing it is handled exactly like something that is not allowed, such as "{"; it's just not special, and thus ends up in the component it appears in. Does this clarify?
(Assignee)

Updated

6 years ago
Status: NEW → ASSIGNED
Keywords: checkin-needed
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [inbound]
Pushed:
http://hg.mozilla.org/mozilla-central/rev/1a09781a5480
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Whiteboard: [inbound]
Blocks: 682762

Updated

6 years ago
No longer blocks: 682762
Depends on: 682762

Comment 11

6 years ago
Why was this checked in without super-review?  It's got API changes!

I filed bug 682845 on the obvious sr-level issues a cursory glance shows.

Updated

6 years ago
Depends on: 682845

Comment 12

6 years ago
We probably need to document the API changes here, at least.
Keywords: dev-doc-needed
Keywords: addon-compat
Documentation updated:

https://developer.mozilla.org/en/nsIURL
https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsIURLParser

Also mentioned on Firefox 9 for developers and Updating add-ons for Firefox 9.
Keywords: dev-doc-needed → dev-doc-complete
Ok, isn't this so that with this change now Firefox accepts this URI?

"www.example.com;foo=bar"

Is that really intended? It seems to me that Firefox is now rather lonely in doing this interpretation and I think it hurts us all to have this different opinion on how to parse URIs...
(Assignee)

Comment 15

6 years ago
(In reply to Daniel Stenberg from comment #14)
> Ok, isn't this so that with this change now Firefox accepts this URI?
> 
> "www.example.com;foo=bar"

That's not the intent, and I believe it accepted those before; it just treated them differently.
 
> Is that really intended? It seems to me that Firefox is now rather lonely in
> doing this interpretation and I think it hurts us all to have this different
> opinion on how to parse URIs...

If you think that ";" in authorities should get special treatment, you may be right; maybe open a separate bug?
Thanks, I've submitted #713472 as a new bug for this.
Depends on: 732567
Blocks: 713472
You need to log in before you can comment on or make changes to this bug.