Closed Bug 824887 Opened 12 years ago Closed 9 years ago

Round brackets are escaped when copy-and-pasted from the location box.

Categories

(Firefox :: Address Bar, defect)

17 Branch
x86_64
Windows 7
defect
Not set
major

Tracking

()

RESOLVED FIXED
Firefox 47
Tracking Status
firefox47 --- fixed

People

(Reporter: unorthodox.engineers, Assigned: dao)

References

(Blocks 1 open bug)

Details

(Keywords: reproducible)

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
Steps to reproduce:
Select and copy (Ctrl+C) a page link with a hash fragment containing round brackets from the location box, then paste the resulting URL/IRI into any text editor, text box, or email client.
Actual results:
All round brackets are percent-escaped, in contravention of RFC 3986 and RFC 3987, and unlike what Firefox does when the link is saved by any other method (bookmarking, dragging the link to the desktop or to another browser). Links copied this way and pasted into other editors break if they are subsequently used in non-Firefox browsers.
Expected results:
Round brackets should never be escaped, or they should be un-escaped on entry, as Firefox also does. They are reserved sub-delims in all URL/URI/IRI schemes.
See here for previous discussion: https://support.mozilla.org/en-US/questions/749465
Severity: normal → major
Component: Untriaged → Location Bar
Keywords: reproducible
Oops, wrong support discussion URL. This is the right one: https://support.mozilla.org/en-US/questions/945166#answer-392158
This is intentional, see bug 458565.
Status: UNCONFIRMED → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Blocks: 458565
Hi Dao,
As noted in 501719 comment 2, this current behaviour disagrees with rfc3986 [1]. It also goes against the URL spec [2] and it is a hack no other browser implements. Moreover, it breaks other workflows and applications that properly implement the URL spec (there are other bug reports apart from the 2 duplicates).
I think we should reconsider this bug, or at least find a way to improve the user experience. Even a pref would help the situation here, but I think the Desktop team can figure out a good solution for this.
[1] http://tools.ietf.org/html/rfc3986#section-2.2
[2] https://url.spec.whatwg.org/#url-code-points
Blocks: url
Status: RESOLVED → REOPENED
Ever confirmed: true
Flags: needinfo?(dao)
Resolution: WONTFIX → ---
This also breaks copy-and-paste of some links in the HTML standard. I really wish the address bar would not try to do its own parsing of URLs.
(In reply to Valentin Gosu [:valentin] from comment #5)
> Hi Dao,
> As noted in 501719 comment 2, this current behaviour disagrees with rfc3986 [1]
> It also goes against the URL spec [2] and it is a hack no other browser implements.
> Moreover, it breaks other workflows and applications that properly implement the URL spec (there are other bug reports apart from the 2 duplicates)
It's pretty much impossible to properly implement the URL spec while also preventing false positives when detecting URLs in plain text. Hence bug 458565.
(In reply to Anne (:annevk) from comment #6)
> This also breaks copy-and-paste of some links in the HTML standard.
Can you provide the link? How common are such cases? Bug 458565 is a pretty common case and we shouldn't regress this unless there's a net win for users and websites.
Flags: needinfo?(dao)
E.g., https://html.spec.whatwg.org/multipage/forms.html#color-state-(type=color). If you follow https://html.spec.whatwg.org/multipage/forms.html#color-state-%28type=color%29 (which you get from copy-and-pasting) you end up at https://html.spec.whatwg.org/multipage/forms.html#forms. That no other browser has this weird behavior should be enough of a demonstration that bug 458565 is not representative.
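For illustration, a minimal Python sketch (not part of the original comment) that compares the two URLs above as a strict consumer would. It assumes only the standard library's `urllib.parse`; the variable names are made up for this example.

```python
from urllib.parse import urlsplit, unquote

# The link as published, and the link as it comes out of the copy operation.
PLAIN = "https://html.spec.whatwg.org/multipage/forms.html#color-state-(type=color)"
COPIED = "https://html.spec.whatwg.org/multipage/forms.html#color-state-%28type=color%29"

plain_fragment = urlsplit(PLAIN).fragment    # "color-state-(type=color)"
copied_fragment = urlsplit(COPIED).fragment  # "color-state-%28type=color%29"

# A consumer that compares the fragment literally sees two different targets.
print(plain_fragment == copied_fragment)           # False

# Only a consumer that percent-decodes first treats them as the same target.
print(unquote(copied_fragment) == plain_fragment)  # True
```

Whether a given page or tool ends up at the right place therefore depends on whether it decodes the fragment before matching; the sketch only shows that the two strings are not interchangeable by default.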
Ironically, you're saying bug 458565 isn't representative and at the same time you provide another example for it: Clicking the first link leads to https://html.spec.whatwg.org/multipage/forms.html#forms too since the closing parenthesis doesn't get linkified. I have to manually select and copy beyond the link -- something many average users won't be able to figure out. So this is broken no matter what we do, whereas encoding parentheses on copy at least works for sites that aren't strict about the difference. It would be interesting to know how often encoded parentheses are accepted vs. how often they lead to failures in the wild.
Let's not use a bug in Bugzilla as an argument here.
It's not a bug ... it's a problem that manifests all over the web where linkification is a thing. Humans put closing parentheses after URLs and then software can't tell whether or not they're part of the URL.
Sure it is, there is plenty of software that deals better with such URLs, e.g., IRCCloud. Just have to pay close attention to the URL Standard.
IRCCloud seems to treat a trailing closing parenthesis as part of the URL if and only if there's an opening parenthesis in the link, or something like that. It's a trick to make a sucky situation less sucky, but it's neither a perfect solution nor does it have much to do with the URL standard.
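A rough Python sketch of the kind of trick described here: keep a trailing closing parenthesis only when it balances an opening one inside the link. This is an assumption about the general approach, not IRCCloud's actual code; `URL_RE` and `linkify` are hypothetical names.

```python
import re

# Deliberately loose pattern: anything after the scheme up to whitespace or <>.
URL_RE = re.compile(r"https?://[^\s<>]+")

def linkify(text: str) -> list[str]:
    """Return the URLs found in text, trimming trailing prose punctuation."""
    urls = []
    for match in URL_RE.finditer(text):
        url = match.group(0)
        while url:
            last = url[-1]
            if last in ".,;:!?":
                # Sentence punctuation directly after the link.
                url = url[:-1]
            elif last == ")" and url.count(")") > url.count("("):
                # Unbalanced closer: assume it belongs to the surrounding text.
                url = url[:-1]
            else:
                break
        urls.append(url)
    return urls

print(linkify("(see http://en.wikipedia.org/wiki/Firefox_(disambiguation))"))
# ['http://en.wikipedia.org/wiki/Firefox_(disambiguation)']
```

As the comment says, this is a heuristic: it still guesses wrong for a URL that really does end in an unbalanced parenthesis.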
(In reply to unorthodox.engineers from comment #14)
Your comment seems detached from the rest of the discussion. You care about spec compliance, I get it, and I care too, but it's not the only factor here nor does it automatically trump every other aspect. In particular, I'd still be interested in getting a better idea of how widespread web servers are that take the URL spec as seriously as you do, because that would put some extra weight behind your argument. I suspect they're rather uncommon. At least, search engines and Wikipedia don't seem to care, and those are probably the most popular sites where users come across parentheses in URLs. (Note that the whatwg.org example with parentheses in the URL fragment part is a separate problem, unrelated to web server behavior.)
> I would suggest that whether or not people enclose URLs in brackets and fail to copy-paste the right parts later has no bearing.
Well, I disagree. Browsers serve users and when users fail with a clear pattern, we have a problem.
> The same applies for all other quotation marks, no? Users seem to cope.
What other quotation marks?
(In reply to Saair Quaderi from comment #16)
> I really don't understand the controversy here. Firefox is doing the following:
>
> 1) copying incorrect links into the clipboard which are broken
Define broken? Yes, there's the URL standard, but then there's the real world too. See comment 15. Let's try not to go in circles.
> 2) going against what every other browser is doing
We're under no obligation to follow other browsers, especially when we differ in order to help users. Other browsers' behavior is a good data point in this case but not much more than that.
> 3) misleading users by having one thing in the url bar and another thing in the clipboard when the text is copied
The logical consequence of decoding URLs so users can read them better, yes. This isn't limited to parentheses and hence a distraction as far as this bug is concerned.
> 4) going against specifications for character encoding
>
> 5) introducing completely arbitrary complexity
>
> Why?
>
> 1) Because a few users who use a third-party syntax (markdown) don't follow the rules of the syntax properly (add an escape character before parentheses as the syntax says they should)
Nobody was referring to markdown. For URLs in plaintext the rules of the syntax are just those of the English language or whatever language the text is written in.
No more rants please.
A broken copy of a link is a link which is not functionally equivalent to the url it is meant to represent. As I have already explained, the result is that people who follow the copied link often do not end up at the correct website destination. Can we get a position on this bug from anyone else at Mozilla? The only person arguing to keep this bug is the person who introduced it. No more rants from me. I'm unfollowing and done with this.
(In reply to Saair Quaderi from comment #18)
> A broken copy of a link is a link which is not functionally equivalent to the url it is meant to represent. As I have already explained, the result is that people who follow the copied link often do not end up at the correct website destination.
That's a definition I can work with. Now, the current behavior clearly doesn't just produce broken URLs, as https://en.wikipedia.org/wiki/Firefox_%28disambiguation%29 and http://en.wikipedia.org/wiki/Firefox_(disambiguation) work the same way, just as an example. This is not to say that wikipedia trumps every other website out there, but the picture isn't as simple as you paint it.
> Can we get a position on this bug from anyone else at Mozilla? The only person arguing to keep this bug is the person who introduced it.
Valentin and Anne work for Mozilla.
The fact that Wikipedia happens to go to the trouble of redirecting does not make the two functionally equivalent for websites in general. That behavior should not be expected. I'm not inclined to go through the effort of making a server-side demo, but here's a simple client-side demo that shows how encoding the parentheses against standards is a problem, since it can yield bad experiences for users who receive the broken copy of the link:
http://quaderi.github.io/firefox_bug_demo/index.html#(demo)
http://quaderi.github.io/firefox_bug_demo/index.html#%28demo%29
Flags: needinfo?(dcamp)
It seems unlikely we can reach an agreement here, so I think we need the module owner to mediate. The fix for this bug is just removing 2 lines of code from urlbarBindings.xml, and we've seen strong arguments from both users and Anne, who is the editor of the URL spec. Dave, could you please pitch in on the subject? Thanks!
(In reply to Saair Quaderi from comment #20)
> The fact that Wikipedia happens to go through the trouble to redirect that does not make the two functionally equivalent for websites in general. That behavior should not be expected. I'm not inclined to go through the effort of making a server-side demo, but here's a simple client-side demo that demonstrates how encoding the parentheses against standards is a problem since it can yield bad experiences for users who receive the broken copy of the link:
>
> http://quaderi.github.io/firefox_bug_demo/index.html#(demo)
>
> http://quaderi.github.io/firefox_bug_demo/index.html#%28demo%29
This is the same as the whatwg.org example; I acknowledged earlier that the URL fragment is a particular case (that could be treated separately). It still doesn't answer the question of how web servers tend to handle this. E.g. are wikipedia and popular search engines the exception in treating ( and %28 interchangeably?
(In reply to Valentin Gosu [:valentin] from comment #21)
> It seems unlikely we can reach an agreement here,
I tried to move the discussion forward by asking the above question repeatedly, no response so far. Not sure if you think the answer to the question is obvious or it's wrong to ask that question. So yes, if you don't engage we're unlikely to reach an agreement.
*But* I did some research and I think I found a solution that makes the discussion moot anyway. Between bug 458565 and now, bug 666964 landed and attempted to "use the actual loaded URI" on copy, except that it created that URI from the decoded URL bar value, at which point we can't distinguish between encoded and plain parentheses anymore. If we make this code really use the loaded URI rather than creating a new one, we can simply restore the copy behavior we had before we even started decoding URLs in the URL bar.
(In reply to Dão Gottwald [:dao] from comment #22) > *But* I did some research and I think I found a solution that makes the > discussion moot anyway. Between bug 458565 and now, bug 666964 landed and > attempted to "use the actual loaded URI" on copy, except that it created > that URI from the decoded URL bar value at which point we can't distinguish > between encoded and plain parentheses anymore. Actually it was bug 668019 that changed where we get the URI from, because we can't use the loaded URI when copying an autocompleted URL.
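The idea from comment 22 can be sketched in Python under stated assumptions: the URL bar shows a decoded form of the loaded URI, so once only the decoded text is available the original encoding is lost. `display_value` and `value_to_copy` are hypothetical helpers, not Firefox code, and the fallback for edited or autocompleted text is a placeholder.

```python
from urllib.parse import unquote

def display_value(loaded_uri: str) -> str:
    """Decode the URI for readability, roughly as the URL bar does."""
    return unquote(loaded_uri)

def value_to_copy(loaded_uri: str, bar_value: str) -> str:
    """Copy the loaded URI verbatim while the bar still shows it unmodified."""
    if bar_value == display_value(loaded_uri):
        return loaded_uri
    # Edited or autocompleted text has no loaded URI behind it; this sketch
    # simply hands back what the user sees (an assumption for illustration).
    return bar_value

PLAIN = "http://quaderi.github.io/firefox_bug_demo/index.html#(demo)"
ESCAPED = "http://quaderi.github.io/firefox_bug_demo/index.html#%28demo%29"

# Both pages look identical in the bar, so the bar text alone cannot tell
# which encoding was actually loaded.
print(display_value(PLAIN) == display_value(ESCAPED))     # True

# Keeping the loaded URI around preserves the distinction on copy.
print(value_to_copy(PLAIN, display_value(PLAIN)))         # ...index.html#(demo)
print(value_to_copy(ESCAPED, display_value(ESCAPED)))     # ...index.html#%28demo%29
```

The second branch reflects the point made in this comment: an autocompleted URL has no corresponding loaded URI, so some other source for the copied value is needed there.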
(In reply to Dão Gottwald [:dao] from comment #22)
> (In reply to Saair Quaderi from comment #20)
> > The fact that Wikipedia happens to go through the trouble to redirect that does not make the two functionally equivalent for websites in general. That behavior should not be expected. I'm not inclined to go through the effort of making a server-side demo, but here's a simple client-side demo that demonstrates how encoding the parentheses against standards is a problem since it can yield bad experiences for users who receive the broken copy of the link:
> >
> > http://quaderi.github.io/firefox_bug_demo/index.html#(demo)
> >
> > http://quaderi.github.io/firefox_bug_demo/index.html#%28demo%29
>
> This is the same as the whatwg.org example, I acknowledged earlier that the URL fragment is a particular case (that could be treated separately). Still doesn't answer the question of how web servers tend to handle this. E.g. are wikipedia and popular search engines the exception in treating ( and %28 interchangeably?
It might be that the wikipedia link you provided has round brackets in the path, and the server unescapes them when loading the file from the filesystem. We do a similar thing with file:// URLs. But if the server uses the path as a REST URL, or as an alias for some other resource, changing ( to %28 will probably break things. Also, if the round bracket is in the query or hash of the URL, web servers don't unescape %28 because they shouldn't, per the URL spec.
> (In reply to Valentin Gosu [:valentin] from comment #21)
> > It seems unlikely we can reach an agreement here,
>
> I tried to move the discussion forward by asking the above question repeatedly, no response so far. Not sure if you think the answer to the question is obvious or it's wrong to ask that question. So yes, if you don't engage we're unlikely to reach an agreement.
I have engaged on a variety of similar bugs, but it wears me down quickly. Right now copying from the URL bar does something no other browser does, changes the URL in a way that is not equivalent (as Anne has mentioned), and annoys lots of developers. I don't think there's anything else I should add. Continuing this discussion for another few months isn't productive, which is why I asked Dave to share his view of the matter.
> *But* I did some research and I think I found a solution that makes the discussion moot anyway. Between bug 458565 and now, bug 666964 landed and attempted to "use the actual loaded URI" on copy, except that it created that URI from the decoded URL bar value at which point we can't distinguish between encoded and plain parentheses anymore. If we make this code really use the loaded URI rather than creating a new one, we can simply restore the copy behavior we had before we even started decoding URLs in the URL bar.
I fully support that. It would also fix bug 1026938. So if you could write a quick patch for that it would be awesome. Otherwise, removing these 3 lines would fix this issue once and for all: https://dxr.mozilla.org/mozilla-central/rev/e355cacefc881ba360d412853b57e8e060e966f4/browser/base/content/urlbarBindings.xml#619-622
(In reply to Valentin Gosu [:valentin] from comment #24)
> I have engaged on a variety of similar bugs, but it wears me down quickly.
> Right now copying from the URL bar does something no other browser does, changes the URL in a way that is not equivalent (as Anne has mentioned), and annoys lots of developers. I don't think there's anything else I should add. Continuing this discussion for another few months isn't productive, which is why I asked Dave to share his view of the matter.
The bugs were similar in that they had something to do with encoding in the URL bar, that's about it. The issues are actually different; it's not like we had the same discussion over and over again. Also, it's not all that obvious that this particular issue "annoys lots of developers" when we've had about three reports in over seven years.
> > *But* I did some research and I think I found a solution that makes the discussion moot anyway. Between bug 458565 and now, bug 666964 landed and attempted to "use the actual loaded URI" on copy, except that it created that URI from the decoded URL bar value at which point we can't distinguish between encoded and plain parentheses anymore. If we make this code really use the loaded URI rather than creating a new one, we can simply restore the copy behavior we had before we even started decoding URLs in the URL bar.
>
> I fully support that. It would also fix bug 1026938. So if you could write a quick patch for that it would be awesome.
It won't really affect bug 1026938.
> Otherwise, removing these 3 lines would fix this issue once and for all:
> https://dxr.mozilla.org/mozilla-central/rev/e355cacefc881ba360d412853b57e8e060e966f4/browser/base/content/urlbarBindings.xml#619-622
No, that would just copy the decoded URI.
Attached patch: patch
This implements what I described at the end of comment 22.
Assignee: nobody → dao
Status: REOPENED → ASSIGNED
Attachment #8719329 - Flags: review?(dolske)
This is awesome, Dao! Thanks for taking the bug!
Flags: needinfo?(dcamp)
Attachment #8719329 - Flags: review?(dolske) → review+
Status: ASSIGNED → RESOLVED
Closed: 12 years ago → 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
Depends on: 1257804
QA Whiteboard: [good first verify]
Tested on Windows 10 x64. The bug was reproducible on the build from 2012-12-26. Retested on 47.0b3 and the bug is fixed. [testday-20160506]
Depends on: 1271088
Blocks: 1273521
Depends on: 1326634
Depends on: 1325535