Closed Bug 1148861 Opened 9 years ago Closed 9 years ago

copying URL from location bar no longer URL-escapes what is copied (after the #)

Categories

(Core :: Networking, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox36 --- unaffected
firefox37 --- unaffected
firefox38 - fixed
firefox39 - fixed
firefox40 - fixed

People

(Reporter: dbaron, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: regression, Whiteboard: fixed by bug 1149913)

Attachments

(1 file, 1 obsolete file)

Copying a URL from the location bar that has spaces in it no longer correctly URL-escapes (as %20) those spaces.  This means that pasting that URL into other places that take URLs yields broken URLs.  See for example bug 1035678 comment 253, though I believe the same thing would happen when pasting a URL with a space into a tweet (Twitter would only shorten and linkify the part before the tweet).

This is a regression from January; a one-day mozilla-central regression range from nightlies is:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3d846527576f&tochange=63006936ab99

Steps to reproduce:
1. go to https://treeherder.mozilla.org/#/jobs?repo=fx-team&filter-searchStr=Ubuntu%20VM%2012.04%20fx-team%20%20debug%20test%20mochitest-devtools-chrome-1
2. press Ctrl+L to go to the location bar
3. press Ctrl+C to copy
4. go to another application or a textarea in another browser tab
5. press Ctrl+V to paste

Expected results:
5. URL shows up as
https://treeherder.mozilla.org/#/jobs?repo=fx-team&filter-searchStr=Ubuntu%20VM%2012.04%20fx-team%20%20debug%20test%20mochitest-devtools-chrome-1

Actual results:
5. URL shows up as https://treeherder.mozilla.org/#/jobs?repo=fx-team&filter-searchStr=Ubuntu VM 12.04 fx-team  debug test mochitest-devtools-chrome-1

(Note that after step 1, the URL bar shows spaces both before and after the regression; the thing that regressed is what was copied.)
Bug not present in 36.0.4 (current mozilla-release).
Bug not present in 37.0 (current mozilla-beta).
Bug is present in 38.0a2 (2015-03-28) (current mozilla-aurora).

[Tracking Requested - why for this release]:  This seems like a reasonably serious regression in that it breaks copy-and-paste of URLs into other Web pages and other applications.
(In reply to David Baron [:dbaron] ⏰UTC-7 from comment #0)
> This is a regression from January; a one-day mozilla-central regression
> range from nightlies is:
> https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3d846527576f&tochange=63006936ab99

The only obvious change in that range that popped out is bug 1093611.
Summary: copying URL from location bar no longer URL-escapes what is copied → copying URL from location bar no longer URL-escapes what is copied (after the #)
I didn't actually test that bug 1093611 is the cause, but I think it's highly probable.
Component: Location Bar → Networking
Product: Firefox → Core
Flags: needinfo?(valentin.gosu)
Blocks: 1134240
Indeed, this is due to bug 1093611, but it may be expected behaviour.
The problem here is that the original URL contains %20, which are unescaped by the URL bar, so they turn into spaces. The reason it worked before, was that the URL was escaped in memory.

In Bug 1124600 comment 3 I located the place where this is done. However, conditionally unescaping just some parts of the URL may be a bit difficult.
Flags: needinfo?(valentin.gosu)
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED
This patch does a couple of things:
1. It introduces the browser.urlbar.copyEscapeParentheses pref, which defaults to false. While the encoding of parentheses might be useful in some cases, I think by default you should be able to get from the URL bar exactly the URL that you paste in.
2. It introduces the browser.urlbar.copyReturnsOriginalURI pref, which defaults to true (we might want a better name for this pref). This means when selecting the entire URL, and no change has been made to the URL bar, it returns the value of gBrowser.currentURI.

I think this is the way to go. Having the prefs around means that users can easily revert to the old behaviour if they want, but by default we do things in a sane way.
(In reply to Valentin Gosu [:valentin] from comment #4)
> Indeed, this is due to bug 1093611, but it may be expected behaviour.
> The problem here is that the original URL contains %20, which are unescaped
> by the URL bar, so they turn into spaces. The reason it worked before, was
> that the URL was escaped in memory.

Note that there are multiple ways to arrive at that URL; you can also get there by:
 1. going to https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound
 2. click on one of the builds
 3. in the panel in the lower left, click on the link following "Job:"

Do the patches fix copying in that case as well?
(In reply to David Baron [:dbaron] ⏰UTC-7 from comment #7)
> 
> Note that there are multiple ways to arrive at that URL; you can also get
> Do the patches fix copying in that case as well?

Yes, it seems to work just fine.
Also note that until this lands, you can flip the dom.url.encode_decode_hash pref to true to get the old behaviour back.
Comment on attachment 8585234 [details] [diff] [review]
When copying the entire URL return the underlying URL, instead of the escaped version

>--- a/browser/base/content/test/general/browser_urlbarCopying.js
>+++ b/browser/base/content/test/general/browser_urlbarCopying.js
>@@ -12,17 +12,16 @@ function test() {
>     gBrowser.removeTab(tab);
>     Services.prefs.clearUserPref(trimPref);
>     Services.prefs.clearUserPref(phishyUserPassPref);
>     URLBarSetURI();
>   });
> 
>   Services.prefs.setBoolPref(trimPref, true);
>   Services.prefs.setIntPref(phishyUserPassPref, 32); // avoid prompting about phishing
>-
>   waitForExplicitFinish();
> 
>   nextTest();
> }

? :)

>   // Test escaping
>   {
>-    loadURL: "http://example.com/()%C3%A9",
>-    expectedURL: "example.com/()\xe9",
>-    copyExpected: "http://example.com/%28%29%C3%A9"
>-  },
>-  {
>-    copyVal: "<example.com/(>)\xe9",
>-    copyExpected: "http://example.com/("
>-  },
>-  {
>-    copyVal: "e<xample.com/(>)\xe9",
>-    copyExpected: "xample.com/("
>-  },
>-
>-  {

Please don't remove these tests.

>+          // If the full URL is selected, the user probably wants
>+          // to copy the exact URL of the current location
>+          if (this.copyReturnsOriginalURI &&
>+              inputVal == selectedVal) {
>+            baseVal = gBrowser.currentURI.spec;
>+          }

There's already another code block with a comment saying "If the entire URL is selected, just use the actual loaded URI", and your comment says almost the exact same thing, which is rather confusing.

Also, creating a new URI from gBrowser.currentURI.spec and then use .spec from that seems unnecessarily complicated. How is that better than using gBrowser.currentURI.spec directly?

>-            uri = uriFixup.createFixupURI(inputVal, Ci.nsIURIFixup.FIXUP_FLAG_NONE);
>+            uri = uriFixup.createFixupURI(baseVal, Ci.nsIURIFixup.FIXUP_FLAG_NONE);

So the issue leading to this bug is that uriFixup.createFixupURI(inputVal, Ci.nsIURIFixup.FIXUP_FLAG_NONE).spec doesn't give us a properly escaped URL anymore? Why is that? nsIURIFixup.idl says createFixupURI attempts to "correct any errors in the syntax or other vagaries" and returns "a wellformed URI."

>           // If the entire URL is selected, just use the actual loaded URI.
>           if (inputVal == selectedVal) {
>             // ... but only if  isn't a javascript: or data: URI, since those
>             // are hard to read when encoded
>             if (!uri.schemeIs("javascript") && !uri.schemeIs("data")) {
>-              // Parentheses are known to confuse third-party applications (bug 458565).
>-              selectedVal = uri.spec.replace(/[()]/g, function (c) escape(c));
>+              selectedVal = uri.spec;
>+              if (this.copyEscapeParentheses) {
>+                // Parentheses are known to confuse third-party applications (bug 458565).
>+                selectedVal = selectedVal.replace(/[()]/g, function (c) escape(c));
>+              }

That's not what this bug was filed for. Can we please stay focused on the regression?
Attachment #8585234 - Flags: review?(dao) → review-
(In reply to Dão Gottwald [:dao] from comment #9)
> Also, creating a new URI from gBrowser.currentURI.spec and then use .spec
> from that seems unnecessarily complicated. How is that better than using
> gBrowser.currentURI.spec directly?

Forget this question; I don't think we can use gBrowser.currentURI at all, because the location bar's value can be modified, autocompleted, etc., so it won't match the browser's current URI.
(In reply to Dão Gottwald [:dao] from comment #9)
> Comment on attachment 8585234 [details] [diff] [review]

> So the issue leading to this bug is that uriFixup.createFixupURI(inputVal,
> Ci.nsIURIFixup.FIXUP_FLAG_NONE).spec doesn't give us a properly escaped URL
> anymore? Why is that? nsIURIFixup.idl says createFixupURI attempts to
> "correct any errors in the syntax or other vagaries" and returns "a
> wellformed URI."

The problem is that even though the escaped URL is still a valid URL, it is not equivalent to the original one. You're right, we can just use gBrowser.currentURI.spec.

> >+              if (this.copyEscapeParentheses) {
> >+                // Parentheses are known to confuse third-party applications (bug 458565).
> >+                selectedVal = selectedVal.replace(/[()]/g, function (c) escape(c));
> >+              }
> 
> That's not what this bug was filed for. Can we please stay focused on the
> regression?

IMO, it escaping parentheses is a symptom of the same bug, which is we should be returning the underlying  URL with no modifications. But if you insist, I can move that to another bug.

(In reply to Dão Gottwald [:dao] from comment #10)
> (In reply to Dão Gottwald [:dao] from comment #9)
> > Also, creating a new URI from gBrowser.currentURI.spec and then use .spec
> > from that seems unnecessarily complicated. How is that better than using
> > gBrowser.currentURI.spec directly?
> 
> Forget this question; I don't think we can use gBrowser.currentURI at all,
> because the location bar's value can be modified, autocompleted, etc., so it
> won't match the browser's current URI.

Modified values are handled by the if (this.selectionStart > 0 || this.valueIsTyped) return selectedVal;
bit at the beginning of the function.
However, you are right about autocompleted URLs. Is there a flag set when that happens?
Flags: needinfo?(dao)
(In reply to Valentin Gosu [:valentin] from comment #11)
> (In reply to Dão Gottwald [:dao] from comment #9)
> > So the issue leading to this bug is that uriFixup.createFixupURI(inputVal,
> > Ci.nsIURIFixup.FIXUP_FLAG_NONE).spec doesn't give us a properly escaped URL
> > anymore? Why is that? nsIURIFixup.idl says createFixupURI attempts to
> > "correct any errors in the syntax or other vagaries" and returns "a
> > wellformed URI."
> 
> The problem is that even though the escaped URL is still a valid URL, it is
> not equivalent to the original one.

AFAIK we make sure not to lose any information when decoding. (The function responsible for that is called losslessDecodeURI.) Also, that doesn't seem to answer my questions? Why is it now possible for nsIURIFixup.createFixupURI(...).spec to contain spaces?

> IMO, it escaping parentheses is a symptom of the same bug, which is we
> should be returning the underlying  URL with no modifications. But if you
> insist, I can move that to another bug.

We decided to explicitly encode parentheses when copying because various web sites didn't handle parentheses well (e.g. they failed to properly linkify pasted URLs), so no, it's not a symptom of the same bug.

> (In reply to Dão Gottwald [:dao] from comment #10)
> > Forget this question; I don't think we can use gBrowser.currentURI at all,
> > because the location bar's value can be modified, autocompleted, etc., so it
> > won't match the browser's current URI.
> 
> Modified values are handled by the if (this.selectionStart > 0 ||
> this.valueIsTyped) return selectedVal;
> bit at the beginning of the function.
> However, you are right about autocompleted URLs. Is there a flag set when
> that happens?

I don't think so. Maybe a combination of valueIsTyped and the pageproxystate attribute would work, but this is getting rather fragile and it would be better if we didn't depend on gBrowser.currentURI some of the time.
Flags: needinfo?(dao)
(In reply to Dão Gottwald [:dao] from comment #12)
> (In reply to Valentin Gosu [:valentin] from comment #11)
> > (In reply to Dão Gottwald [:dao] from comment #9)
> > > So the issue leading to this bug is that uriFixup.createFixupURI(inputVal,
> > > Ci.nsIURIFixup.FIXUP_FLAG_NONE).spec doesn't give us a properly escaped URL
> > > anymore? Why is that? nsIURIFixup.idl says createFixupURI attempts to
> > > "correct any errors in the syntax or other vagaries" and returns "a
> > > wellformed URI."
> > 
> > The problem is that even though the escaped URL is still a valid URL, it is
> > not equivalent to the original one.
> 
> AFAIK we make sure not to lose any information when decoding. (The function
> responsible for that is called losslessDecodeURI.) Also, that doesn't seem
> to answer my questions? Why is it now possible for
> nsIURIFixup.createFixupURI(...).spec to contain spaces?

Because of Bug 1093611.

> 
> > IMO, it escaping parentheses is a symptom of the same bug, which is we
> > should be returning the underlying  URL with no modifications. But if you
> > insist, I can move that to another bug.
> 
> We decided to explicitly encode parentheses when copying because various web
> sites didn't handle parentheses well (e.g. they failed to properly linkify
> pasted URLs), so no, it's not a symptom of the same bug.

Problem is that both ( and %28 are valid inside a URL's Query and Ref part, and we might be breaking URLs by escaping them. Same thing for brackets. Anyway, I will move that part of the patch to bug 1124600.

> 
> > (In reply to Dão Gottwald [:dao] from comment #10)
> > > Forget this question; I don't think we can use gBrowser.currentURI at all,
> > > because the location bar's value can be modified, autocompleted, etc., so it
> > > won't match the browser's current URI.
> > 
> > Modified values are handled by the if (this.selectionStart > 0 ||
> > this.valueIsTyped) return selectedVal;
> > bit at the beginning of the function.
> > However, you are right about autocompleted URLs. Is there a flag set when
> > that happens?
> 
> I don't think so. Maybe a combination of valueIsTyped and the pageproxystate
> attribute would work, but this is getting rather fragile and it would be
> better if we didn't depend on gBrowser.currentURI some of the time.

Actually, that seems to work quite nicely, thanks!

> Also, creating a new URI from gBrowser.currentURI.spec and then use .spec
> from that seems unnecessarily complicated. How is that better than using
> gBrowser.currentURI.spec directly?
> 

gBrowser.currentURI may not be exposable, so even though createFixupURI returns the same thing, we still need to do this.
Attachment #8585234 - Attachment is obsolete: true
(In reply to Valentin Gosu [:valentin] from comment #13)
> (In reply to Dão Gottwald [:dao] from comment #12)
> > (In reply to Valentin Gosu [:valentin] from comment #11)
> > > (In reply to Dão Gottwald [:dao] from comment #9)
> > > > So the issue leading to this bug is that uriFixup.createFixupURI(inputVal,
> > > > Ci.nsIURIFixup.FIXUP_FLAG_NONE).spec doesn't give us a properly escaped URL
> > > > anymore? Why is that? nsIURIFixup.idl says createFixupURI attempts to
> > > > "correct any errors in the syntax or other vagaries" and returns "a
> > > > wellformed URI."
> > > 
> > > The problem is that even though the escaped URL is still a valid URL, it is
> > > not equivalent to the original one.
> > 
> > AFAIK we make sure not to lose any information when decoding. (The function
> > responsible for that is called losslessDecodeURI.) Also, that doesn't seem
> > to answer my questions? Why is it now possible for
> > nsIURIFixup.createFixupURI(...).spec to contain spaces?
> 
> Because of Bug 1093611.

That's wrong, though. While I suppose it's fine for foo.hash to contain unencoded spaces, the constructed URI must not contain those. See http://tools.ietf.org/html/rfc3986#appendix-A
(In reply to Dão Gottwald [:dao] from comment #15)
> > > nsIURIFixup.createFixupURI(...).spec to contain spaces?
> > 
> > Because of Bug 1093611.
> 
> That's wrong, though. While I suppose it's fine for foo.hash to contain
> unencoded spaces, the constructed URI must not contain those. See
> http://tools.ietf.org/html/rfc3986#appendix-A

That's what bug 1093611 was about. See https://url.spec.whatwg.org/#fragment-state
``` Note: Unfortunately not using percent-encoding is intentional as implementations with majority market share exhibit this behavior. ```
https://url.spec.whatwg.org/#fragment-state appears to be about URL /parsing/. I think we need to be a little bit stricter when /producing/ URLs for external use. We've been expecting nsIURIFixup::createFixupURI (and nsIIOService::newURI) to give us interoperable URIs and all of a sudden it doesn't do that anymore, so I think we need to fix that rather than working around it.
(In reply to Dão Gottwald [:dao] from comment #17)
> https://url.spec.whatwg.org/#fragment-state appears to be about URL
> /parsing/. I think we need to be a little bit stricter when /producing/ URLs
> for external use. We've been expecting nsIURIFixup::createFixupURI (and
> nsIIOService::newURI) to give us interoperable URIs and all of a sudden it
> doesn't do that anymore, so I think we need to fix that rather than working
> around it.

I fully agree. However, the problem here isn't that createFixupURI doesn't give us a valid URL, but that it doesn't give us the original URI we had before percent-decoding it - which is the one we want.
"example.com/#a%20b" and "example.com/#a b" are essentially equivalent. But if I paste the first one in the URL bar, I think I should be able to get the same thing out of it.
(In reply to Valentin Gosu [:valentin] from comment #16)
> That's what bug 1093611 was about. See
> https://url.spec.whatwg.org/#fragment-state
> ``` Note: Unfortunately not using percent-encoding is intentional as
> implementations with majority market share exhibit this behavior. ```

It seems like there's a good reason for doing it our way, though, so maybe we should talk the other implementations into changing instead of breaking our implementation.
(In reply to Valentin Gosu [:valentin] from comment #18)
> (In reply to Dão Gottwald [:dao] from comment #17)
> > https://url.spec.whatwg.org/#fragment-state appears to be about URL
> > /parsing/. I think we need to be a little bit stricter when /producing/ URLs
> > for external use. We've been expecting nsIURIFixup::createFixupURI (and
> > nsIIOService::newURI) to give us interoperable URIs and all of a sudden it
> > doesn't do that anymore, so I think we need to fix that rather than working
> > around it.
> 
> I fully agree. However, the problem here isn't that createFixupURI doesn't
> give us a valid URL, but that it doesn't give us the original URI we had
> before percent-decoding it - which is the one we want.
> "example.com/#a%20b" and "example.com/#a b" are essentially equivalent. But
> if I paste the first one in the URL bar, I think I should be able to get the
> same thing out of it.

Now you seem to be talking about parsing again. Sure, putting "example.com/#a b" in the URL bar should work just like "example.com/#a%20b", but when copying the URL so the user can paste it /somewhere else/, we need to not have spaces in there. (Obviously, that's what this bug is about.) nsIURIFixup::createFixupURI and nsIIOService::newURI ensured this level of validity and now they don't. Maybe they still spit out valid URIs for some definition of valid, but it surely doesn't seem very useful. I wouldn't be surprised if the URL bar copying regression isn't the only fallout from this behavior change either. There are lots of places where we make nsIURIs out of strings to get "valid", normalized, interoperable URIs.
Chrome 43:

Paste in the URL bar: http://example.org/#a%20b
location.href is: http://example.org/#a%20b
Copied value from URL bar is: http://example.org/#a%20b

Paste in the URL bar: http://example.org/#a b
location.href is: http://example.org/#a b
Copied value from URL bar is: http://example.org/#a b

--------

Firefox 39a1:

Paste in the URL bar: http://example.org/#a%20b
location.href is: http://example.org/#a%20b
Copied value is: http://example.org/#a b

Paste in the URL bar: http://example.org/#a b
location.href is: http://example.org/#a b
Copied value from URL bar is: http://example.org/#a b

IE11 is has the same behaviour as Chrome.
Firefox is the only one that doesn't return the same URL we put in.
(In reply to Valentin Gosu [:valentin] from comment #21)
> Firefox is the only one that doesn't return the same URL we put in.

... which could actually be a good thing, since there are many places one would want to paste a URL where you don't want that URL to have spaces or certain other characters in it.
Also, output from Firefox 36, before Bug 1093611 landed:

Paste in the URL bar: http://example.org/#a%20b
location.href is: http://example.org/#a%20b
Copied value is: http://example.org/#a%20b

Paste in the URL bar: http://example.com/#a b
location.href is: http://example.org/#a%20b
Copied value is: http://example.com/#a%20b

As you can see, this also has flaws, and is different from other UAs.
Also, some websites tend to put JSON or even JavaScript in the hash version.
(In reply to David Baron [:dbaron] ⏰UTC-7 from comment #22)
> (In reply to Valentin Gosu [:valentin] from comment #21)
> > Firefox is the only one that doesn't return the same URL we put in.
> 
> ... which could actually be a good thing, since there are many places one
> would want to paste a URL where you don't want that URL to have spaces or
> certain other characters in it.

URLs are determined by developers so I don't think we should necessarily worry about other applications.
Unless the other applications are browsers, in which case we'd probably like to be compatible.
Copying URLs in a way that they can be pasted into text inputs that scan for and auto-link URLs (e.g., twitter, facebook, bugzilla, reddit, etc.) is something we should care about.
Can't argue there.
However, following Bug 1093611, we are allowed to have spaces in the #hash portion of the URL (just like Chrome and IE). Whether this is a good or bad idea, applying the attached patch would fix the use case in comment 0, and wouldn't cause any problems if we reverted 1093611.
Comment on attachment 8585391 [details] [diff] [review]
When copying the entire URL return the underlying URL, instead of the escaped version

I partly misunderstood your patch, as I hadn't realized that gBrowser.currentURI too can contain spaces now (that would be copied to the clipboard even with this patch). Other than other browsers doing it, I don't see why you think that's a good idea.
Attachment #8585391 - Flags: review?(dao) → review-
(In reply to Dão Gottwald [:dao] from comment #27)
> Comment on attachment 8585391 [details] [diff] [review]
> When copying the entire URL return the underlying URL, instead of the
> escaped version
> 
> I partly misunderstood your patch, as I hadn't realized that
> gBrowser.currentURI too can contain spaces now (that would be copied to the
> clipboard even with this patch). 

I'm out of ideas then. Feel free to take the bug or assign it to someone else.

> Other than other browsers doing it, I don't see why you think that's a good idea.

It seemed like a good fix for this bug.
Assignee: valentin.gosu → nobody
Can bug 1093611 be backed out?
Status: ASSIGNED → NEW
I don't think we should. From a spec and compat point of view the URL parsing is correct.
We've actually had requests to uplift it but didn't.
Agreed, we should not revert bug 1093611. This is a UI question, not a URL parser question.
(In reply to Anne (:annevk) from comment #31)
> Agreed, we should not revert bug 1093611. This is a UI question, not a URL
> parser question.

I'm not sure what this means. Unleashing URIs with spaces and other unescaped characters to all kinds of consumers that weren't expecting this surely seems like a deeper issue to me. The UI question (which isn't really a question; I'm certain that we don't want URLs with spaces in the clipboard) is merely a symptom.
Flags: needinfo?(honzab.moz)
Flags: needinfo?(bugs)
-> valentin
Flags: needinfo?(honzab.moz) → needinfo?(valentin.gosu)
bug 1093611 is about making gecko to work like other browsers.
If FF UI needs different behavior, it should use some different API or escape the values explicitly.

If we can't fix the regressions bug 1093611 caused in a timely manner, it should indeed be backed out, or disabled until the regressions are fixed.

But looks like there is a patch for this bug. Are there other known regressions?
Valentin, have you tested copy-pasting between different applications etc.
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #34)
> bug 1093611 is about making gecko to work like other browsers.

Have we established that this behavior is indeed sane and desirable? Having most of the URL escaped and some rest unescaped seems quite weird to me and I don't see how developers are supposed to deal with this. Extract the hash from the URL, encode it and put it back in the URL...?

> If FF UI needs different behavior, it should use some different API or
> escape the values explicitly.

nsIURIFixup, nsIIOService, nsIURI etc. are internal APIs, no Web APIs. What alternative would you suggest?

Also, putting URLs in text fields or some such for the user to copy and paste elsewhere is something Web content does. How do you expect Web content to deal with this?

> But looks like there is a patch for this bug. Are there other known
> regressions?

The attached patch isn't sufficient to get the desired behavior, see comment 27. Also note that we have other places (bookmarks, browser history, devtools...) where we let user copy URLs and don't want spaces and potentially other unescaped characters in there.

And I still have no idea if there's more fallout for some of the other countless nsIURI consumers. Has this been thoroughly thought through? Only a fraction of URIs are affected (those with complex hashes), so I don't think we can put confidence in the fact no other regressions have been reported yet.

CC'ing Gavin as the browser module owner. Is there a module owner responsible for nsIURI and friends?
I am sensitive to the objection that we are now pasting invalid urls into other applications like twitter, even though other browsers do it. I don't believe not doing this (i.e. normalizing invalid urls before pasting) caused a problem in the past. So I would like us to continue that behavior.

bug 1093611 is certainly a bug however, so that patch shouldn't just be abandoned. It seems to me (handwaving) that it could alternately be fixed by normalizing on the setter but not decoding on the url.hash getter. Would that fix the use case in that bug while also fixing the paste-into-twitter use case here?

If I understand the objections to that approach

1] it does not match chrome/ie.. though it seems we've never matched chrome/ie and our paste behavior hasn't been the source of complaints in the past.

2] it creates a literal setter/getter asymmetry. I'm actually ok with that as long as there is a semantic symmetry.

Thoughts? If this were acceptable to folks that would probably mean backing 1093611 out, reopening it or making a new bug, and doing the new fix on 40.
I created bug 1149913, with the purpose of disabling bug 1093611 (or reverting it, although I would rather we just pref it off).

At the same time, I would still like this patch to land (it is required in bug 1124600). Following bug 1149913 we would no longer be generating copying URLs containing spaces or any other invalid characters. Otherwise, we can move the discussion to another bug.
If the user enters a URL with a fragment that contains %20 (not literal spaces) as done by OP, the URL parser will preserve that and not turn %20 into a space. So in that case copy-and-paste would work fine.

However, the UI code is not preserving the output of the URL parser for copy-and-paste purposes. That is what we need to fix.

We want to display a URL with less percent-encoding so it becomes more readable in the typical case (where they represent utf-8 bytes), but we always want to copy the most portable form, which is the output of the URL parser.

Now the URL parser can output a fragment with literal spaces in it, but that is an accepted part of the URL Standard and required for compatibility with the web and other user agents. However, in the typical case where the fragment does contain %20, as in comment 0, the URL parser will not change that and neither should the UI code.

It is my understanding that Valentin's patch fixes the UI code and that seems like the right way forward here.
(In reply to Anne (:annevk) from comment #38)
> but we
> always want to copy the most portable form, which is the output of the URL
> parser.

It used to be that, but now it isn't; not with the URL parser passing though spaces. See bug 1149913 comment 4.
Flags: needinfo?(valentin.gosu)
I asked for a backout of the feature in bug 1149913 for 38 as it is an ESR release...
Should this be marked as fixed by bug 1149913, at least on branches?
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: fixed by bug 1149913
Target Milestone: --- → mozilla40
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: