Open Bug 1348876 Opened 8 years ago Updated 9 months ago

Escaped backslashes replaced with backslashes (leading to forward slashes when hitting enter again)

Categories

(Firefox :: Address Bar, defect, P5)

47 Branch
defect

Tracking

()

Tracking Status
firefox-esr45 --- unaffected
firefox52 --- wontfix
firefox-esr52 --- fix-optional
firefox53 --- fix-optional
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- fix-optional

People

(Reporter: jaraco, Unassigned)

References

(Depends on 1 open bug, Regression)

Details

(Keywords: parity-chrome, regression)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20170302120751

Steps to reproduce:
On macOS with FF 52, enter a URL with a backslash into the location bar. I'm entering a JSON query in the location bar and need to escape some curly braces (escaped with a backslash, and that backslash needs a backslash escape as well), so I enter a URL like this into my browser:
https://example.com/query/{"field": "\\{value\\}"}/

Actual results:
Firefox replaces the backslashes with slashes, destroying the query, and requesting the following path:
/query/%7B%22field%22:%20%22//%7Bvalue//%7D%22%7D/
aka
/query/{"field": "//{value//}"}/
This path contains six parts rather than the intended two.

Expected results:
Firefox should instead request a path as entered by a user like so:
/query/%7B%22field%22:%20%22%5C%5C%7Bvalue%5C%5C%7D%22%7D/

If one instead enters into the location bar the quoted %5C character for the backslashes, the query is transmitted as expected and the browser renders a literal backslash in the location bar. However, because these backslashes are now literals, if one re-submits that location, the literal backslashes are converted to forward slashes. It appears to be impossible to reliably enter backslashes into the location bar. Presumably this behavior was added as a convenience for users that can't discriminate between the two characters. Firefox should not munge the URL, should delegate this "convenience" behavior to a plugin, or should at the very least provide a configuration option to disable this rewrite.
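The two paths in the report can be reproduced with a minimal Python sketch. It is not Firefox code. The helper names `encode_path` and `segments` are hypothetical, and `quote(..., safe="/:")` is only an assumption chosen so that the output matches the percent-encoded form quoted above.

```python
from urllib.parse import quote

# Path as typed in the report; a raw string keeps both backslashes literal.
typed_path = r'/query/{"field": "\\{value\\}"}/'


def encode_path(path):
    """Percent-encode a path, leaving "/" and ":" untouched (assumed choice)."""
    return quote(path, safe="/:")


def segments(path):
    """Split a path into its "/"-separated parts, ignoring the outer slashes."""
    return path.strip("/").split("/")


# What the reporter expected to be requested: backslashes sent as %5C.
expected = encode_path(typed_path)

# What was observed: every backslash turned into "/" before encoding.
rewritten_path = typed_path.replace("\\", "/")
actual = encode_path(rewritten_path)

print(expected)
print(actual)
print(len(segments(typed_path)))      # 2 parts, as intended
print(len(segments(rewritten_path)))  # 6 parts after the rewrite
```

The segment counts illustrate the "six parts rather than the intended two" remark: the rewrite introduces four extra separators, two of which produce empty segments.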
Blocks: 652186
Status: UNCONFIRMED → NEW
Component: Untriaged → Networking
Ever confirmed: true
Flags: needinfo?(valentin.gosu)
Keywords: regression
Product: Firefox → Core
Version: 52 Branch → 48 Branch
Version: 48 Branch → 47 Branch
(In reply to Jason R. Coombs from comment #0) First of all, the path of a URL is a really bad place for JSON. If you really need to do that, I would recommend you try with the search/query or hash/ref part of the URL. By doing this you avoid the backslash replacing algorithm, as it stops when it encounters ? or # in the input. Also, as it happens, your input is handled exactly the same in both Chrome and Firefox. I'm inclined to close this bug as invalid, as we are following the spec [1] in this case. Anne, do you agree? [1] https://url.spec.whatwg.org/
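The behavior described in this comment (backslashes rewritten only up to the first ? or #, and only for "special" schemes) can be modeled roughly as follows. This is a simplified sketch, not the parser state machine from the URL Standard and not Firefox's implementation. The function name `replace_backslashes` is hypothetical.

```python
# Schemes the URL Standard treats as "special".
SPECIAL_SCHEMES = {"ftp", "file", "http", "https", "ws", "wss"}


def replace_backslashes(url):
    """Rewrite "\\" to "/" before the first "?" or "#" of a special-scheme URL.

    Simplified model: a real parser works code point by code point and
    handles many more cases (credentials, ports, relative URLs, ...).
    """
    scheme, colon, rest = url.partition(":")
    if not colon or scheme.lower() not in SPECIAL_SCHEMES:
        return url  # non-special schemes are left alone in this model

    # Find where the query or fragment starts, whichever comes first.
    cut = len(rest)
    for marker in "?#":
        index = rest.find(marker)
        if index != -1:
            cut = min(cut, index)

    return scheme + colon + rest[:cut].replace("\\", "/") + rest[cut:]


print(replace_backslashes("https://example.com/query/a\\b"))
print(replace_backslashes("https://example.com/query/?q=a\\b"))
```

Under this model, a backslash placed in the query or fragment survives, which is why moving the JSON there sidesteps the rewrite.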
Flags: needinfo?(valentin.gosu) → needinfo?(annevk)
Safari TP handles it like us as well. And yes, this is what https://url.spec.whatwg.org/ requires (although technically we could decide to handle address bar input differently with a different parser since it's UX, in practice we're not going to do that).
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(annevk)
Resolution: --- → INVALID
Hi all. Thanks for the response and the link to the spec. I'd not read it before, but as a maintainer of a popular web framework, I'm eager to know more. I hope you'll be willing to help me understand better. Here's what I've deduced from the spec.

1. A url is composed of several components, including "A URL-path-segment string", which is "zero or more URL units" and where [URL Units are URL code points and percent-encoded bytes](https://url.spec.whatwg.org/#url-units). Therefore, I would at least expect the three characters '%5C' to be valid.
2. In [URL Rendering](https://url.spec.whatwg.org/#url-rendering), it states "Other parts of the URL should have their sequences of percent-encoded bytes replaced with code points resulting from percent decoding those sequences converted to bytes, unless that renders those sequences invisible." That explains why a %5C would be rendered as a literal "\".
3. The "path percent-encode set" doesn't include "\", which based on the [encoding routine](https://url.spec.whatwg.org/#utf-8-percent-encode) indicates that "\" shouldn't be percent encoded.

Between (1) and (3), I can see where it does seem as if "\" does somehow fall through the cracks, not having a proper representation either as percent-encoded bytes nor as a URL code point. Nevertheless, as I read the parsing routine, the presence of such a character doesn't seem to trigger any validation errors. Furthermore, the parsing routine for the path state and the query state are nearly identical in the substeps for parsing valid characters, so I would expect a literal backslash to be parsed similarly. The only references I see to a "\" in the path segment of a URL are relevant to "special" URLs, which doesn't appear to apply to the examples above.

What I don't see is where the spec stipulates that a "\" should be converted to a "/". Is that a result of the spec being ambiguous about the "\" character? Or am I missing how this substitution is implicated in the spec?
It's striking to me that a character that appears unchorded on most keyboards would be simply invalid for a URL and conflated with another character by convention or specification. If what you say is true that the spec does in fact dictate this behavior, then my feeling is the spec is flawed in its ability to represent otherwise common, viable characters. To say that JSON cannot be readily represented in the path of a URL without triggering disruptive UI behaviors feels broken, but if you could enlighten me, I would very much appreciate it.
Only the parser is applicable to this question and special URLs very much apply since HTTP(S) URLs are special URLs per https://url.spec.whatwg.org/#url-miscellaneous. Note that you can still use \ if you want, you just need to escape it using percent-encoding (that may or may not work in the address bar though, since that does its own pre/post-processing that some of us are not a fan of and others are).
Thanks for the clarification. Now that I revisit that section, I see you're right. They are special, which means that "\" characters are illegal in the path segment and produce a validation error.

> that may or may not work in the address bar though, since that does its own pre/post-processing that some of us are not a fan of and others are

That's what I'm reporting here. Entering %5C in the address bar works on the first request, but because the URL is rendered, it gets transformed into an invalid URL with a literal "\". If a literal "\" is illegal and only %5C should be used, then the address bar UI should not be transforming it to an illegal character. Chrome doesn't have this problem because it doesn't try to render the %5C (or any of the encoded characters for that matter).
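The round trip described in this comment can be sketched as below. This is a deliberately simplified model under stated assumptions: `render_for_display` decodes every percent-escape with `urllib.parse.unquote` (the real address bar is more selective), `submit` only models the backslash rewrite before the first ? or #, and both function names are hypothetical.

```python
from urllib.parse import unquote


def render_for_display(url):
    """Assumed model of address-bar rendering: decode all percent-escapes."""
    return unquote(url)


def submit(url):
    """Assumed model of submitting: "\\" becomes "/" before any "?" or "#"."""
    cut = len(url)
    for marker in "?#":
        index = url.find(marker)
        if index != -1:
            cut = min(cut, index)
    return url[:cut].replace("\\", "/") + url[cut:]


typed = "https://example.com/query/a%5Cb"

first_request = submit(typed)              # escape survives: no literal "\"
shown = render_for_display(first_request)  # bar now shows a literal "\"
second_request = submit(shown)             # pressing Enter again rewrites it

print(first_request)
print(shown)
print(second_request)
```

In this model the first request keeps %5C intact, while re-submitting the displayed URL yields a different path, which matches the bug summary.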
Okay, if the concern is specific to the address bar I'll reopen and relocate this bug to the appropriate place.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Component: Networking → Location Bar
Product: Core → Firefox
Summary: Backslashes replaced with forward slashes in location bar → Escaped backslashes replaced with backslashes (leading to forward slashes when hitting enter again)
Version: 47 Branch → unspecified
Status: REOPENED → NEW
Version: unspecified → 47 Branch
Priority: -- → P3
Too late for 54. Marking 54 as wontfix.
No longer blocks: 652186
Regressed by: 652186
Has Regression Range: --- → yes
Severity: normal → S3
Keywords: parity-chrome
Priority: P3 → P5