Open Bug 1163959 Opened 10 years ago Updated 2 years ago

Percent decoding in URL bar can make certain sites no longer visible

Categories

(Firefox :: Address Bar, defect, P3)

defect
Points:
5

Tracking

()

People

(Reporter: valentin, Unassigned)

References

(Blocks 3 open bugs)

Details

(Keywords: site-compat)

Attachments

(2 files, 1 obsolete file)

Load URL: http://example.com/?x=%33&a=hello I click on the URL bar and change hello to bye, and hit enter. The loaded URL becomes: http://example.com/?x=3&a=bye Now x is 3 instead of %33 - even though the user never touched it.
Can you explain why exactly this is a problem? Here's my understanding: %33 is the encoding sequence for 3 and 3 is an allowed character in URLs; servers should treat them the same. (Unassigning until I actually decide to work on this, assuming there's even a good way to address this.)
Assignee: dao → nobody
JavaScript code in the page may not handle the 2 URLs the same: For the first, window.location.search is ?x=%33&a=hello For the second, window.location.search is ?x=3&a=bye
That would be a bug in that JS code, wouldn't it, because what I said is also true for location.search, i.e. %33 and 3 should be treated the same (e.g. by calling decodeURIComponent). In order to transfer %33 literally, the % in there needs to be encoded as %25.
(In reply to Dão Gottwald [:dao] from comment #3) > That would be a bug in that JS code, wouldn't it, because what I said is > also true for location.search, i.e. %33 and 3 should be treated the same > (e.g. by calling decodeURIComponent). In order to transfer %33 literally, > the % in there needs to be encoded as %25. Could you explain why it would do that? Can you point to the spec when it mandates we should do that? What is the reason Firefox would decode URIs in JS APIs, while other browsers don't do it?
Flags: needinfo?(dao)
(In reply to Valentin Gosu [:valentin] from comment #4) > (In reply to Dão Gottwald [:dao] from comment #3) > > That would be a bug in that JS code, wouldn't it, because what I said is > > also true for location.search, i.e. %33 and 3 should be treated the same > > (e.g. by calling decodeURIComponent). In order to transfer %33 literally, > > the % in there needs to be encoded as %25. > > Could you explain why it would do that? Do what? What's "it"? > Can you point to the spec when it mandates we should do that? Again, do what? I don't understand what you're referring to. > What is the reason Firefox would decode URIs in JS APIs, while other > browsers don't do it? I didn't say Firefox should decode location.search. I expect JS code that wants to act on these characters to use decodeURIComponent such that %33 and 3 become the same, because they're really meant to express the same thing.
Flags: needinfo?(dao)
(In reply to Dão Gottwald [:dao] from comment #5) > > I didn't say Firefox should decode location.search. I expect JS code that > wants to act on these characters to use decodeURIComponent such that %33 and > 3 become the same, because they're really meant to express the same thing. I understand now what you mean, but I'm still inclined to disagree. The developer is the one who decides what the URL is meant to express. And while http://example.com/?x=%33 and http://example.com?x=3 do seem to encode the same information, they are not the same, by any standard. We should respect that.
(In reply to Valentin Gosu [:valentin] from comment #6) > The developer is the one who decides what the URL is meant to express. And > while http://example.com/?x=%33 and http://example.com?x=3 do seem to encode > the same information, they are not the same, by any standard. We should > respect that. According to the URL standard, they do encode the same information, not just "seem to". It makes no sense for developers to arbitrarily claim that a given encoding sequence means something different to them.
Actually it does. Percent-encoded sequences are just representing bytes. We sometimes treat them as utf-8, but there's no requirement they have to be utf-8 encoded sequences.
Hi Dao, Now that bug 824887 has been resolved, can we move to make sure we are consistent in how we handle percent encoded URLs? Simple use case: Load http://example.org/?test%5C%20test Copying the URL yields http://example.org/?test%5C%20test Clicking the URL bar and hitting enter loads http://example.org/?test\%20test Copying the URL now yields http://example.org/?test\%20test It would be great if we could apply the same strategy from bug 824887 to this issue - if the text in the URL bar hasn't been changed, load gBrowser.currentURI instead. Do you think you could take this on? Thanks!
Flags: needinfo?(dao)
I don't have time for this right now, but it wouldn't quite solve bug 1026938 anyway. E.g. when the user modifies the location bar value, we can't use gBrowser.currentURI and still need to deal with the backslash somehow. And since http://example.org/?test\%20test is in fact not a valid URI, it's also not a good example for this bug.
No longer blocks: 1026938
Flags: needinfo?(dao)
Attached patch bug1163959.patch (obsolete) — Splinter Review
Dao, I poked around through the code, and this seems to work. I assume it's not the best approach, but maybe you could suggest a better one
Attachment #8725615 - Flags: feedback?(dao)
Comment on attachment 8725615 [details] [diff] [review] bug1163959.patch AFAIK the intent of the _value property is different, setting it like this might be a problem. Also, userTypedValue isn't set for autocompleted values, you should check the pageproxystate attribute instead.
Attachment #8725615 - Flags: feedback?(dao) → feedback-
This does seem to work. > if (this.getAttribute("pageproxystate") == "valid") { > this._value = gBrowser.currentURI.spec; > } I got the idea of using this._value from <method name="onInput"> Let me know if this is OK, or if not what would be a better alternative.
It seems pretty hacky. A better way might be to change onBeforeValueGet. I haven't fully thought this through, though. This is hairy code.
(In reply to Dão Gottwald [:dao] from comment #14) > It seems pretty hacky. A better way might be to change onBeforeValueGet. I > haven't fully thought this through, though. This is hairy code. Changing onBeforeValueGet makes the urlbar not work at all :(
Dao, I'm resubmitting the patch with the changes you suggested. If you feel we should take another approach, that's fine, but we should really fix this soon. People have been bugging me about it :)
Attachment #8743263 - Flags: review?(dao+bmo)
Attachment #8725615 - Attachment is obsolete: true
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED
Comment on attachment 8743263 [details] [diff] [review] Don't load percent decoded URL when user hits enter in the URL bar Still seems like a misuse of _value to me. Can you show me that onBeforeValueGet patch that didn't work?
Attachment #8743263 - Flags: review?(dao+bmo)
(In reply to Dão Gottwald [:dao] from comment #17) > Still seems like a misuse of _value to me. Can you show me that > onBeforeValueGet patch that didn't work? This pretty much breaks the URL bar.
(In reply to Valentin Gosu [:valentin] from comment #18) > Created attachment 8743862 [details] [diff] [review] > onBeforeValueGet.diff > > (In reply to Dão Gottwald [:dao] from comment #17) > > Still seems like a misuse of _value to me. Can you show me that > > onBeforeValueGet patch that didn't work? > > This pretty much breaks the URL bar. Interesting. Blair, you implemented onBeforeValueGet in bug 1067903. Any idea what's going on here?
Flags: needinfo?(bmcbride)
did you try to print _value when you replace it? is it maybe a moz-action: url and you are replacing it with a plain url? Maybe you could _parseActionUrl and makeActionURL the action url after replacing the underlying url. But touching this code may open a can of worms, and I honestly don't know what may break.
(In reply to Marco Bonardo [::mak] from comment #20) > did you try to print _value when you replace it? is it maybe a moz-action: > url and you are replacing it with a plain url? That's right... I was replacing an about: URI > Maybe you could _parseActionUrl and makeActionURL the action url after replacing the > underlying url. But touching this code may open a can of worms, and I > honestly don't know what may break. However, even if I only do this for http/s, I don't think onBeforeValueGet is the right place for this code, as it still breaks stuff. The method gets called for every character typed in the URL bar, and since it returns the current URI, it doesn't allow me to navigate to another page. From what I can tell handleEnter is still the best place to do this. We just need to make sure nothing breaks. So far I haven't found any issues.
(In reply to Dão Gottwald [:dao] from comment #19) > Interesting. Blair, you implemented onBeforeValueGet in bug 1067903. Any > idea what's going on here? Looks like it's breaking the detection of when pageproxystate should be set to invalid. ie, it ends up always comparing gBrowser.currentURI.spec to itself, and so pageproxystate is always valid. This kinda feels like a similar problem to bug 1104165. It's possible the patch from that bug shouldn't have used losslessDecodeURI to set .value (ie, currently ._value stores the decoded value - maybe it should be storing the non-decoded value).
Flags: needinfo?(bmcbride)
Blocks: 1310987
Marco, do you have time to take this, or do you think you can find someone to work on it? I'm seeing more and more bugs such as bug 1348876 comment 6. Thanks!
Assignee: valentin.gosu → mak77
ni for previous comment.
Flags: needinfo?(mak77)
Blocks: 1348876
Nope, I'm unlikely to have time to handle this. I'll need to be triaged by the team and put into perspective with the other projects. (setting [fxsearch] but no priority to put it into triage] Why is this suddenly critical after 2 years?
Assignee: mak77 → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(mak77)
Whiteboard: [fxsearch]
Not exactly critical, but we keep getting reports about this being broken.
Blocks: 1026938
Priority: -- → P2
Blocks: 1425024
Blocks: 1421431
See Also: → 1509452

Updating the summary as this is turning into a compatibility issue of sorts. Perhaps the safe thing to do here would be to only attempt to decode non-ASCII code points, as those always roundtrip in the same way.

Keywords: site-compat
Summary: Percent decoding in URL bar transforms valid URLs into different valid URLs → Percent decoding in URL bar can make certain sites no longer visible

I feel this is not resolved as the behaviour of Firefox is different to MSEdge and Chrome. Consequently, it is not possible to send certian URL's to the server and have them interpreted correctly by firefox.

In this example, I'm trying to perform a database LIKE operation with percent appended to the string, which the server expectes to see encoded as "%25" in the URL. Firefox will decode the string and send it to the server. However this is not the expected behaviour and the server ignores the percent and hence returns the wrong results.

This URL, clicked from a link on a page, in an email or whereever
http://apex.somewhere.com/pls/apex/f?p=AppNo:PageNo:::::IRLIKE_COLUMNNAME:VALUE%25:

Is being decode to this:
http://apex.somewhere.com/pls/apex/f?p=AppNo:PageNo:::::IRLIKE_COLUMNNAME:VALUE%:

to send to the server.

This results in the server not recognising the percent after the VALUE, hence it's not possible to perform a database LIKE via Firefox.
Percent symbol after the string is the wildcard in the oracle database in this example.

Severity: normal → S3
Priority: P2 → P3
Keywords: papercut
See Also: → 1429786
See Also: → 1486664
Points: --- → 5
Keywords: papercut
Whiteboard: [fxsearch]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: