Open Bug 1448890 Opened 7 years ago Updated 2 years ago

%2F wrongly unescaped when link target shown in status bar

Categories

(Firefox :: Tabbed Browser, defect, P3)

59 Branch
defect

Tracking

()

Tracking Status
firefox-esr52 --- affected
firefox59 --- affected
firefox60 --- affected
firefox61 --- affected

People

(Reporter: nandhp, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(1 file)

Attached file Test page
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20180315233128

Steps to reproduce:

Hover over a link with a target containing %2F (a percent-escaped slash character). (A test page with such a link is attached.)


Actual results:

The status bar shows https://www.google.com/about/our-company/ instead of https://www.google.com/about%2Four-company/ (as the address bar correctly does once the link is clicked).


Expected results:

Although not all web servers distinguish between these two URLs, they are not equivalent (RFC 3986 section 2.2: "URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent", where slash is a reserved character). Firefox should properly distinguish between them; while it correctly does so in the address bar, it fails to do so when hovering over a link.

This bug may be related to #675403, which is another discrepancy between the decoding behavior of the status bar and the address bar, but is different in that the decoding behavior of the status bar produces a non-equivalent URL.
Reproduced the bug on the latest release 59.0.2 20180323154952, 61.0a1 20180329100042,56.0.2 20171024165158 and Fx 11.0 Mozilla/5.0 (Windows NT 6.1; rv:11.0). 
Seems we have this status bar issue for a some time. Since it's in the same area as bug 675403, adding :bz to CC and setting general as an initial triage component.
Status: UNCONFIRMED → NEW
Component: Untriaged → General
Ever confirmed: true
OS: Unspecified → All
Hardware: Unspecified → All
This is probably the same as bug 675403, yes.  Wish I knew who owns this stuff...
Depends on: 675403
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #2)
> This is probably the same as bug 675403, yes.  Wish I knew who owns this
> stuff...

I think the answer is "nobody", for the status link thing. Anyway, the code for the status link is:

https://dxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#4521-4532


  setOverLink(url, anchorElt) {
    if (url) {
      url = Services.textToSubURI.unEscapeURIForUI("UTF-8", url);

      // Encode bidirectional formatting characters.
      // (RFC 3987 sections 3.2 and 4.1 paragraph 6)
      url = url.replace(/[\u200e\u200f\u202a\u202b\u202c\u202d\u202e]/g,
                        encodeURIComponent);

      if (gURLBar && gURLBar._mayTrimURLs /* corresponds to browser.urlbar.trimURLs */)
        url = trimURL(url);
    }

The url bar does:

    // Strip off "wyciwyg://" and passwords for the location bar
    try {
      uri = Services.uriFixup.createExposableURI(uri);
    } catch (e) {}

    // Replace initial page URIs with an empty string
    // only if there's no opener (bug 370555).
    if (isInitialPage(uri.spec) &&
        checkEmptyPageOrigin(gBrowser.selectedBrowser, uri)) {
      value = "";
    } else {
      // We should deal with losslessDecodeURI throwing for exotic URIs
      try {
        value = losslessDecodeURI(uri);
      } catch (ex) {
        value = "about:blank";
      }
    }

https://dxr.mozilla.org/mozilla-central/rev/8c71359d60e21055074ae8bc3dcb796d20f0cbaf/browser/base/content/browser.js#2736-2761


with losslessDecodeURI being some kind of monster thing here:

https://dxr.mozilla.org/mozilla-central/rev/8c71359d60e21055074ae8bc3dcb796d20f0cbaf/browser/base/content/browser.js#2779-2826



So, one of these is using URL object (url bar), and one just gets a string and has to make do (status link).

One uses a platform method (status link) and doesn't use losslessDecodeURI, and one uses a *different* platform method (url bar; createExposeableURI) and then uses the black-magic regexp stuff in losslessDecodeURI.

I don't know what the best way to unify these is. Valentin and Marco probably have the best ideas here.

It's worth keeping in mind that we'll update the mouseover thing on, well, mouseover, so anything that's even remotely expensive I would like to get rid of, if possible. So, without wanting to sway the jury here, if we can incorporate a one-method-to-rule-them-all way of doing this stuff in platform that doesn't use regexes then maybe that would be preferable...
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(mak77)
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #3)
> if we can
> incorporate a one-method-to-rule-them-all way of doing this stuff in
> platform that doesn't use regexes then maybe that would be preferable...

... also because I've been bitten a few too many times by trying to do URL processing in JS and the surprising complexities of the URL spec (which, AIUI, isn't actually the RFC quoted but https://url.spec.whatwg.org/, which presumably has the same provisions about escaped forward slashes).
Hmm.  So the way we get to setOverLink is that nsDocShell::OnOverLink calls TabChild::SetStatusWithContext which lands in TabParent::RecvSetStatus etc.

We could probably send an nsIURI through here instead of a string, if that would make things easier on the receiver side.
If it were up to me we wouldn't unescape at all, and just use nsIURI.displaySpec instead.
I understand that users like easily readable URLs, but I think the code for nsTextToSubURI::UnEscapeURIForUI is slightly broken anyway, and should be more selective in what it unescapes.
Flags: needinfo?(valentin.gosu)
FWIW, I cannot reproduce bug 675403, maybe it was fixed in the meanwhile?

If we should unify things, I'd probably keep the urlbar code, that while is more complex, it's also more tested. Getting an nsIURI in setOverlink and extrapolating the code from URLBarSetURI to a separate helper looks enough for that.
We can always build a more general solution/replacement on top of that and plug it in, in the future.

I'm not totally opposed to comment 6, but I fear on certain pages links may become completely unreadable. The urlbar code FOR NOW looks like a decent compromise.
Perf-wise, it's likely a bit more expensive, but I don't expect that to be a problem, off-hand.
Flags: needinfo?(mak77)
Priority: -- → P3
Blocks: status-panel
Component: General → Tabbed Browser
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: