Percent decoding in URL bar transforms valid URLs into different valid URLs

NEW
Unassigned

Status

()

P2
normal
3 years ago
3 months ago

People

(Reporter: valentin, Unassigned)

Tracking

(Blocks: 5 bugs)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [fxsearch])

Attachments

(2 attachments, 1 obsolete attachment)

Load URL: http://example.com/?x=%33&a=hello
I click on the URL bar and change hello to bye, and hit enter.
The loaded URL becomes: http://example.com/?x=3&a=bye
Now x is 3 instead of %33 - even though the user never touched it.
Can you explain why exactly this is a problem? Here's my understanding: %33 is the encoding sequence for 3 and 3 is an allowed character in URLs; servers should treat them the same.

(Unassigning until I actually decide to work on this, assuming there's even a good way to address this.)
Assignee: dao → nobody
JavaScript code in the page may not handle the 2 URLs the same:
For the first, window.location.search is ?x=%33&a=hello
For the second, window.location.search is ?x=3&a=bye
That would be a bug in that JS code, wouldn't it, because what I said is also true for location.search, i.e. %33 and 3 should be treated the same (e.g. by calling decodeURIComponent). In order to transfer %33 literally, the % in there needs to be encoded as %25.
(In reply to Dão Gottwald [:dao] from comment #3)
> That would be a bug in that JS code, wouldn't it, because what I said is
> also true for location.search, i.e. %33 and 3 should be treated the same
> (e.g. by calling decodeURIComponent). In order to transfer %33 literally,
> the % in there needs to be encoded as %25.

Could you explain why it would do that?
Can you point to the spec when it mandates we should do that?
What is the reason Firefox would decode URIs in JS APIs, while other browsers don't do it?
Flags: needinfo?(dao)
(In reply to Valentin Gosu [:valentin] from comment #4)
> (In reply to Dão Gottwald [:dao] from comment #3)
> > That would be a bug in that JS code, wouldn't it, because what I said is
> > also true for location.search, i.e. %33 and 3 should be treated the same
> > (e.g. by calling decodeURIComponent). In order to transfer %33 literally,
> > the % in there needs to be encoded as %25.
> 
> Could you explain why it would do that?

Do what? What's "it"?

> Can you point to the spec when it mandates we should do that?

Again, do what? I don't understand what you're referring to.

> What is the reason Firefox would decode URIs in JS APIs, while other
> browsers don't do it?

I didn't say Firefox should decode location.search. I expect JS code that wants to act on these characters to use decodeURIComponent such that %33 and 3 become the same, because they're really meant to express the same thing.
Flags: needinfo?(dao)
(In reply to Dão Gottwald [:dao] from comment #5)
> 
> I didn't say Firefox should decode location.search. I expect JS code that
> wants to act on these characters to use decodeURIComponent such that %33 and
> 3 become the same, because they're really meant to express the same thing.

I understand now what you mean, but I'm still inclined to disagree.
The developer is the one who decides what the URL is meant to express. And while http://example.com/?x=%33 and http://example.com?x=3 do seem to encode the same information, they are not the same, by any standard. We should respect that.
(In reply to Valentin Gosu [:valentin] from comment #6)
> The developer is the one who decides what the URL is meant to express. And
> while http://example.com/?x=%33 and http://example.com?x=3 do seem to encode
> the same information, they are not the same, by any standard. We should
> respect that.

According to the URL standard, they do encode the same information, not just "seem to". It makes no sense for developers to arbitrarily claim that a given encoding sequence means something different to them.

Comment 8

3 years ago
Actually it does. Percent-encoded sequences are just representing bytes. We sometimes treat them as utf-8, but there's no requirement they have to be utf-8 encoded sequences.
Hi Dao,

Now that bug 824887 has been resolved, can we move to make sure we are consistent in how we handle percent encoded URLs?

Simple use case:
Load http://example.org/?test%5C%20test
Copying the URL yields http://example.org/?test%5C%20test
Clicking the URL bar and hitting enter loads http://example.org/?test\%20test
Copying the URL now yields http://example.org/?test\%20test

It would be great if we could apply the same strategy from bug 824887 to this issue - if the text in the URL bar hasn't been changed, load gBrowser.currentURI instead.

Do you think you could take this on? Thanks!
Flags: needinfo?(dao)
I don't have time for this right now, but it wouldn't quite solve bug 1026938 anyway. E.g. when the user modifies the location bar value, we can't use gBrowser.currentURI and still need to deal with the backslash somehow. And since http://example.org/?test\%20test is in fact not a valid URI, it's also not a good example for this bug.
No longer blocks: 1026938
Flags: needinfo?(dao)
Created attachment 8725615 [details] [diff] [review]
bug1163959.patch

Dao, I poked around through the code, and this seems to work. I assume it's not the best approach, but maybe you could suggest a better one
Attachment #8725615 - Flags: feedback?(dao)
Comment on attachment 8725615 [details] [diff] [review]
bug1163959.patch

AFAIK the intent of the _value property is different, setting it like this might be a problem. Also, userTypedValue isn't set for autocompleted values, you should check the pageproxystate attribute instead.
Attachment #8725615 - Flags: feedback?(dao) → feedback-
This does seem to work.
>          if (this.getAttribute("pageproxystate") == "valid") {
>            this._value = gBrowser.currentURI.spec;
>          }
I got the idea of using this._value from <method name="onInput">
Let me know if this is OK, or if not what would be a better alternative.
It seems pretty hacky. A better way might be to change onBeforeValueGet. I haven't fully thought this through, though. This is hairy code.
(In reply to Dão Gottwald [:dao] from comment #14)
> It seems pretty hacky. A better way might be to change onBeforeValueGet. I
> haven't fully thought this through, though. This is hairy code.
Changing onBeforeValueGet makes the urlbar not work at all :(
Created attachment 8743263 [details] [diff] [review]
Don't load percent decoded URL when user hits enter in the URL bar

Dao, I'm resubmitting the patch with the changes you suggested. If you feel we should take another approach, that's fine, but we should really fix this soon. People have been bugging me about it :)
Attachment #8743263 - Flags: review?(dao+bmo)
Attachment #8725615 - Attachment is obsolete: true
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED
Comment on attachment 8743263 [details] [diff] [review]
Don't load percent decoded URL when user hits enter in the URL bar

Still seems like a misuse of _value to me. Can you show me that onBeforeValueGet patch that didn't work?
Attachment #8743263 - Flags: review?(dao+bmo)
Created attachment 8743862 [details] [diff] [review]
onBeforeValueGet.diff

(In reply to Dão Gottwald [:dao] from comment #17)
> Still seems like a misuse of _value to me. Can you show me that
> onBeforeValueGet patch that didn't work?

This pretty much breaks the URL bar.
(In reply to Valentin Gosu [:valentin] from comment #18)
> Created attachment 8743862 [details] [diff] [review]
> onBeforeValueGet.diff
> 
> (In reply to Dão Gottwald [:dao] from comment #17)
> > Still seems like a misuse of _value to me. Can you show me that
> > onBeforeValueGet patch that didn't work?
> 
> This pretty much breaks the URL bar.

Interesting. Blair, you implemented onBeforeValueGet in bug 1067903. Any idea what's going on here?
Flags: needinfo?(bmcbride)
did you try to print _value when you replace it? is it maybe a moz-action: url and you are replacing it with a plain url? Maybe you could _parseActionUrl and makeActionURL the action url after replacing the underlying url. But touching this code may open a can of worms, and I honestly don't know what may break.
(In reply to Marco Bonardo [::mak] from comment #20)
> did you try to print _value when you replace it? is it maybe a moz-action:
> url and you are replacing it with a plain url?

That's right... I was replacing an about: URI

> Maybe you could _parseActionUrl and makeActionURL the action url after replacing the
> underlying url. But touching this code may open a can of worms, and I
> honestly don't know what may break.

However, even if I only do this for http/s, I don't think onBeforeValueGet is the right place for this code, as it still breaks stuff. The method gets called for every character typed in the URL bar, and since it returns the current URI, it doesn't allow me to navigate to another page.

From what I can tell handleEnter is still the best place to do this. We just need to make sure nothing breaks. So far I haven't found any issues.
(In reply to Dão Gottwald [:dao] from comment #19)
> Interesting. Blair, you implemented onBeforeValueGet in bug 1067903. Any
> idea what's going on here?

Looks like it's breaking the detection of when pageproxystate should be set to invalid. ie, it ends up always comparing gBrowser.currentURI.spec to itself, and so pageproxystate is always valid.

This kinda feels like a similar problem to bug 1104165. It's possible the patch from that bug shouldn't have used losslessDecodeURI to set .value (ie, currently ._value stores the decoded value - maybe it should be storing the non-decoded value).
Flags: needinfo?(bmcbride)
Marco, do you have time to take this, or do you think you can find someone to work on it?
I'm seeing more and more bugs such as bug 1348876 comment 6.
Thanks!
Assignee: valentin.gosu → mak77
ni for previous comment.
Flags: needinfo?(mak77)
Nope, I'm unlikely to have time to handle this.
I'll need to be triaged by the team and put into perspective with the other projects. (setting [fxsearch] but no priority to put it into triage]

Why is this suddenly critical after 2 years?
Assignee: mak77 → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(mak77)
Whiteboard: [fxsearch]
Not exactly critical, but we keep getting reports about this being broken.
Priority: -- → P2

Updated

8 months ago
Blocks: 1425024

Updated

3 months ago
Blocks: 1421431
You need to log in before you can comment on or make changes to this bug.