Bug 475896 - When accessing urls in firefox with %20, it is replaced with a whitespace. Breaks other software if you try to copy-paste part of the url out of firefox
Status: NEW
Product: Firefox
Classification: Client Software
Component: Location Bar
Version: unspecified
Platform: All All
Importance: -- normal with 2 votes
Target Milestone: ---
Assigned To: Nobody; OK to take it and work on it
Mentors:
http://parchment.googlecode.com/svn/t...
Depends on:
Blocks: 531210
Reported: 2009-01-28 23:23 PST by Jonathan Steinert
Modified: 2010-09-27 03:02 PDT


Attachments
Like so? (1.06 KB, patch)
2009-03-04 14:11 PST, Mats Palmgren (vacation)
dao+bmo: review-
v2 (1.98 KB, patch)
2009-03-04 16:06 PST, Mats Palmgren (vacation)
no flags
v3 (2.65 KB, patch)
2009-03-10 20:24 PDT, Mats Palmgren (vacation)
no flags
v4 (1.99 KB, patch)
2009-03-14 21:00 PDT, Mats Palmgren (vacation)
no flags

Description Jonathan Steinert 2009-01-28 23:23:50 PST
User-Agent:       Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.0.5) Gecko/2008120121 Firefox/3.0.5
Build Identifier: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.0.5) Gecko/2008120121 Firefox/3.0.5

Many applications find the end of urls based on a whitespace (or maybe a word boundary) character at the end.

If I copy a url from firefox and paste it into something else it requires me to manually replace the spaces with %20. This is slightly more than annoying.

Reproducible: Always

Steps to Reproduce:
Open the url, look at it.
Actual Results:  
http://parchment.googlecode.com/svn/trunk/parchment.html?story=http://www.meltsner.com/random/Champion of Guitars.z5

Expected Results:  
http://parchment.googlecode.com/svn/trunk/parchment.html?story=http://www.meltsner.com/random/Champion%20of%20Guitars.z5
Comment 1 Jesse Ruderman 2009-01-28 23:26:47 PST
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090128 Minefield/3.2a1pre

Works correctly for me.  It's displayed with spaces in the address bar, but if I copy&paste the whole thing to TextEdit, I get %20s.
Comment 2 Kyle Huey [:khuey] (khuey@mozilla.com) 2009-01-29 06:27:49 PST
Works correctly for me if I've browsed to the URL.  If I just type it in the awesomebar and copy it out without browsing I get spaces.  Seems like reasonable design behavior to me.
Comment 3 Kyle Huey [:khuey] (khuey@mozilla.com) 2009-01-29 06:28:54 PST
Sorry for duplicate emails.

User agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
Comment 4 Mats Palmgren (vacation) 2009-01-29 12:30:31 PST
Works for me; when pasting it into Emacs I get %20s.  Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.0.7pre) Gecko/2009012304 GranParadiso/3.0.7pre
Comment 5 Mats Palmgren (vacation) 2009-02-22 12:02:11 PST
Jonathan, we need more details to reproduce the problem.  As you can see
above, three people said it works for them.  Which application do you paste
into?  Does it work with TextEdit and/or Emacs?  Can you reproduce the
problem in Firefox Safe Mode?  http://support.mozilla.com/en-US/kb/Safe+Mode
Any other details about your desktop environment that might affect it?
Comment 6 Valentin Laube 2009-03-01 07:36:30 PST
found a way to reproduce this problem:

1. visit http://images.google.de/images?q=rick%20rolled
2. as expected, the url appears with a space instead of %20 in the location bar
3. if you copy the complete url and paste it into another application you get http://images.google.de/images?q=rick%20rolled which is also correct
4. but if you copy a SUBSTRING of the url that contains a space (like "rick rolled") you will get "rick rolled" with a space instead of %20 as expected

hope this helps, and btw: you just got rick rolled :D
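
The four steps above can be modelled outside the browser. This is only a sketch of the observed behaviour, not Firefox code: `copyFromLocationBar` is a hypothetical helper, and the rule "escape only when the whole URL is selected" is an assumption inferred from steps 3 and 4.

```javascript
// The URL that was loaded, and the decoded form the location bar displays.
const loaded = "http://images.google.de/images?q=rick%20rolled";
const displayed = decodeURI(loaded);

// Hypothetical model of a copy: re-escape only when the entire URL is selected.
function copyFromLocationBar(start, end) {
  const selection = displayed.slice(start, end);
  const wholeUrlSelected = start === 0 && end === displayed.length;
  return wholeUrlSelected ? encodeURI(selection) : selection;
}

// Step 3: copying the complete URL yields the escaped form.
const fullCopy = copyFromLocationBar(0, displayed.length);

// Step 4: copying a substring yields the raw text, spaces included.
const partialCopy = copyFromLocationBar(displayed.indexOf("rick"), displayed.length);

console.log(fullCopy);
console.log(partialCopy);
```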
Comment 7 Mats Palmgren (vacation) 2009-03-04 07:43:27 PST
I can reproduce the behaviour described in comment 6, but isn't that the
desired behaviour?  it's just a string you're copying at that point, not
an URL.
Comment 8 Valentin Laube 2009-03-04 09:25:47 PST
you are right, if the substring is not a URL it doesn't make too much sense, but imagine it was.
i originally had the problem when i wanted to shorten a standard google image search URL to something more readable, google appends a lot of information other than the query string to the URL.
Comment 9 Mats Palmgren (vacation) 2009-03-04 14:08:02 PST
Yes, one could argue that copying a prefix of the URL is still an URL
and should be escaped.
Comment 10 Mats Palmgren (vacation) 2009-03-04 14:11:53 PST
Created attachment 365536
Like so?
Comment 11 Dão Gottwald [:dao] 2009-03-04 14:26:51 PST
Comment on attachment 365536
Like so?

This fixes the given use case, but isn't quite right when e.g. "http:" is selected :(
Comment 12 Dão Gottwald [:dao] 2009-03-04 14:30:15 PST
... which the IO service would convert to "http:///".
Comment 13 Mats Palmgren (vacation) 2009-03-04 16:06:45 PST
Created attachment 365563
v2

Good catch.
Comment 14 Dão Gottwald [:dao] 2009-03-10 16:27:17 PDT
Still not quite right... "view-source:http:" becomes "view-source:http:///".
Comment 15 Mats Palmgren (vacation) 2009-03-10 20:24:57 PDT
Created attachment 366747
v3

Ok, trying a different approach then: remove trailing slashes from the url
string so that it has the same number of trailing slashes as the selection.
Also, I found that we have an existing bug for URIs that end with spaces,
e.g. "file:///tmp/test%20%20", when copying the URL the trailing spaces
are removed.  I fixed that too.
Comment 16 Dão Gottwald [:dao] 2009-03-11 02:28:09 PDT
Comment on attachment 366747
v3

>+ trailing spaces that was
>+          // removed when normalizing the URI.

hrm, why is that part of the normalization?

>+              // Remove added slashes.
>+              let reTrailingSlashes = /\/+$/;
>+              let uriSlashes = reTrailingSlashes.exec(val);
>+              if (uriSlashes) {
>+                let selectionSlashes = reTrailingSlashes.exec(selection);
>+                let noOfAddedSlashes = uriSlashes[0].length - 
>+                                       (selectionSlashes ? 
>+                                          selectionSlashes[0].length : 0);
>+                if (noOfAddedSlashes > 0)
>+                  val = val.slice(0, -noOfAddedSlashes);
>+              }

how about:
> let trailingSlashes = function (s) s.match(/\/*$/)[0].length;
> val = val.replace(new RegExp("\\/{"
>                   + (trailingSlashes(val) - trailingSlashes(selection)
>                   + "}$"), "");

>+              // Add back removed spaces (escaped).
>+              let reTrailingSpaces = /\ +$/;
>+              let selectionSpaces = reTrailingSpaces.exec(selection);
>+              if (selectionSpaces) {
>+                let uriSpaces = reTrailingSpaces.exec(val);
>+                let noOfRemovedSpaces = selectionSpaces[0].length - 
>+                                        (uriSpaces ? uriSpaces[0].length : 0);
>+                while (noOfRemovedSpaces-- > 0)
>+                  val += '%20';
>+              }

how about:
> val += escape(selection.match(/\ *$/)[0]);
Comment 17 Dão Gottwald [:dao] 2009-03-11 02:31:36 PDT
(In reply to comment #16)
> how about:
> > let trailingSlashes = function (s) s.match(/\/*$/)[0].length;
> > val = val.replace(new RegExp("\\/{"
> >                   + (trailingSlashes(val) - trailingSlashes(selection)
> >                   + "}$"), "");

sorry, missed a right parenthesis after trailingSlashes(selection)
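
For reference, a sketch of that suggestion with the parenthesis restored, written in standard JavaScript rather than the expression-closure syntax used in the patch. The `removeAddedSlashes` wrapper and the `Math.max` clamp are additions for illustration and are not part of any attached patch.

```javascript
// Count the slashes at the very end of a string (0 if there are none).
const trailingSlashes = (s) => s.match(/\/*$/)[0].length;

// Hypothetical wrapper: drop the trailing slashes that normalization added
// beyond those already present in the selection.
function removeAddedSlashes(val, selection) {
  // Clamp at zero so the quantifier is never built from a negative count.
  const added = Math.max(0, trailingSlashes(val) - trailingSlashes(selection));
  return val.replace(new RegExp("\\/{" + added + "}$"), "");
}

// The two cases raised in comments 12 and 14, plus one that needs no change.
const fixedHttp = removeAddedSlashes("http:///", "http:");
const fixedViewSource = removeAddedSlashes("view-source:http:///", "view-source:http:");
const untouched = removeAddedSlashes("http://example.com/", "http://example.com/");

console.log(fixedHttp, fixedViewSource, untouched);
```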
Comment 18 Mats Palmgren (vacation) 2009-03-14 21:00:23 PDT
Created attachment 367449
v4

(In reply to comment #16)
> hrm, why is that part of the normalization?

nsStandardURL::SetSpec calls net_FilterURIString which removes
leading and trailing whitespace.
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsStandardURL.cpp#1066
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsURLHelper.cpp#529

> how about:

I changed it as you suggested and AFAICT the other characters that
net_FilterURIString removes (\t\r\n) do not occur unescaped in the
URL bar (I tested file: and http:).
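
A rough model of the trimming and the add-back step, for illustration only. `filterUriString` is an assumed stand-in for what net_FilterURIString is described as doing above, not the Necko implementation, and `restoreTrailingSpaces` is a hypothetical helper.

```javascript
// Assumed model of the filtering described above: drop tab, CR and LF
// characters, then strip leading and trailing spaces.
function filterUriString(spec) {
  return spec.replace(/[\t\r\n]/g, "").replace(/^ +| +$/g, "");
}

// Hypothetical helper: append the selection's trailing spaces, escaped,
// to the normalized value so they are not lost on copy.
function restoreTrailingSpaces(normalized, selection) {
  const trailing = selection.match(/ *$/)[0];
  return normalized + encodeURIComponent(trailing);
}

// "file:///tmp/test%20%20" as it would be displayed, with two trailing spaces.
const selection = "file:///tmp/test  ";
const normalized = filterUriString(selection);
const restored = restoreTrailingSpaces(normalized, selection);

console.log(JSON.stringify(normalized));
console.log(JSON.stringify(restored));
```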
Comment 19 Dão Gottwald [:dao] 2009-03-15 03:37:48 PDT
(In reply to comment #18)
> nsStandardURL::SetSpec calls net_FilterURIString which removes
> leading and trailing whitespace.

So it seems like this would be more straightforward:

// nsStandardURL::SetSpec removes trailing whitespace
uri = ioService.newURI(val.replace(" ", "%20", "g"), null, null);
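
Note that the three-argument `replace` with a flags string is a Mozilla-specific extension. In standard JavaScript a string pattern replaces only the first occurrence, so a portable version of the same idea needs a global regular expression. A minimal sketch, where `escapeSpaces` is a hypothetical name:

```javascript
// Hypothetical helper: escape every literal space before the string is
// handed to a URI parser that would otherwise trim trailing whitespace.
function escapeSpaces(val) {
  return val.replace(/ /g, "%20");
}

// Inner and trailing spaces are all escaped, so nothing is left to trim.
const escaped = escapeSpaces("file:///tmp/my dir/test  ");

// For comparison: a plain string pattern only replaces the first match.
const firstOnly = "a b c".replace(" ", "%20");

console.log(escaped);
console.log(firstOnly);
```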
Comment 20 Dão Gottwald [:dao] 2009-03-21 01:24:03 PDT
Comment on attachment 367449
v4

see comment 19
Comment 21 Alexey Salmin 2009-05-05 12:54:47 PDT
And what about this bug? http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498137
It was marked as forwarded here.
Comment 22 NoOp 2009-11-01 15:14:19 PST
If I understand this issue correctly (If I copy a url from firefox and paste it into something else it requires me to manually replace the spaces with %20. This is slightly more than annoying.), this problem is apparent in SeaMonkey 2.0 as well. For example:

SM 2.0:
<https://help.ubuntu.com/community>
click on #5 in 'Contents': Getting to know and work with your system
You'll see that in SM2.0 the URL ends up as:
<https://help.ubuntu.com/community#Getting to know and work with your system>

In 1.1.18 the same URL ends up as:
<https://help.ubuntu.com/community#Getting%20to%20know%20and%20work%20with%20your%20system>

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.4) Gecko/20091017 Lightning/1.0pre SeaMonkey/2.0

Also occurs on SeaMonkey 2.0 Windows.
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.4) Gecko/20091017 SeaMonkey/2.0
Comment 23 David E. Ross 2009-11-22 16:14:55 PST
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4) Gecko/20091017 SeaMonkey/2.0

I'm seeing the same problem in SeaMonkey 2.0.  Is this a problem in Gecko or a toolkit rather than in Firefox or SeaMonkey?  

I went to the link in the Expected Results in the original Description.  When that page completed loading and rendering, the URI displayed in the address area with blanks and not %20.  I marked the ENTIRE URI in the address area, copied it, and pasted it into both Notepad and Wordpad.  The results showed exactly what I saw in the address area: blank spaces and not %20.  

I went to the menu bar and selected [View > Page Info].  On the General tab, the URI displayed with two instances of %20.  Marking, copying, and pasting that URI display retained the %20.  

Obviously, with respect to the address area, what you see is what you get.  Further, what you see is NOT necessarily the link you selected.
