Closed Bug 388372 Opened 17 years ago Closed 17 years ago

URL bar should ignore character directionality

Categories

(Firefox :: Address Bar, defect, P3)

2.0 Branch
defect

Tracking

()

RESOLVED FIXED
Firefox 3 beta3

People

(Reporter: dbaron, Assigned: dao)

References

Details

(Whiteboard: [sg:low spoof])

Attachments

(1 file, 3 obsolete files)

Since bug 366797 made us show unescaped rather than escaped URLs in the URL bar, directional formatting characters and right-to-left charcaters can affect the presentation of the URL while it is being edited.  This seems to have a risk of spoofing bugs, although I haven't found any cases where the problem occurs in the non-editing view.

Steps to reproduce:
1. load one of these URLs:
 http://mozilla.org/htap?yreuq [note that path and query are labeled correctly]
 http://host.evil.com/gro.doog [note that this URL goes to evil.com ]
2. move the mouse over the URL bar and away again

Actual results: The presentation of the URL changes drastically when switching between the two URL bar modes.  When the host is highlighted (i.e., when the mouse is not over the URL bar and the URL bar is not focused), the directional formatting characters are ignored.  When the host is not highlighted, the directional formatting characters reorder the URL.

Expected results:  character directionality and directional formatting characters should not influence the display of what is loaded in the URL bar.  (Probably not even when the user is pasting a URL in the URL bar, which isn't a regression from bug 366797.)
Flags: blocking-firefox3?
Marking security-sensitive because I can reproduce the bug in Firefox 2: the second link makes it look like I'm at good.com.  (This surprised dbaron because he thought Firefox 2 escaped everything...)
Group: security
IMO (based on RFC 3987) we should always escape bidi formatting characters, so the expected results are http://mozilla.org/%E2%80%AEhtap?yreuq%E2%80%AC and http://%E2%80%AEhost.evil.com/gro.doog%E2%80%AC%
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x?
Whiteboard: [sg:low spoof]
Flags: blocking-firefox3? → blocking-firefox3+
on trunk i crash on loading this Bugzilla Page ! (not a testcae) 

Crash ID: 
http://crash-stats.mozilla.com/report/index/3583a94a-3c61-11dc-afb1-001a4bd43ed6?date=2007-07-27-16
Keywords: crash
So, loading the second example shows me a "Problem loading this page" message that states "Firefox can't find the server at .moc.live.tsoh". If I attempt to paste the URL in the second example in gedit or even in a textbox in Firefox, my cursor gets completely screwed up and shows as being both between the last '/' and the 'g' and after the 'h' (at the same time!). Also, my URL bar states "http://good.org/moc.live.tsoh" instead of the correct evil.com, but mousing over the URL bar does not change it. Also, when I reload this bug, the text for the second example is not "purple", meaning that Firefox thinks I never visited that page at all.

Loading the first example doesn't seem to do the same thing. I'm sent to http://www.mozilla.org/%e2%80%aehtap?yreuq, which is obviously a 404. Also, the link in the bug turns purple (already visited).

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6pre) Gecko/20070724 BonEcho/2.0.0.6pre ID:2007072403
Keywords: crash
Keywords: crash
Oh, and when I view source this page, everything in comment #0 starting at the first example until the </pre> is rtl instead of ltr. If I copy/paste that text into gedit, it shows up as ltr.
Keywords: crash
Target Milestone: --- → Firefox 3 M9
moving out bugs that don't need to block b1
Target Milestone: Firefox 3 M9 → Firefox 3 M10
Hmm, on trunk right now I don't see an issue here yet, except maybe the statusbar text?
The statusbar actually gets it right, the Location Bar doesn't.
Ok, so, we should fix it then!
Assignee: nobody → dao
Priority: -- → P3
Target Milestone: Firefox 3 M10 → Firefox 3 M11
OS: Linux → All
Hardware: PC → All
Version: Trunk → 2.0 Branch
No longer blocks: 366797
I'm willing to fix this if I can figure out how. It's not a regression though.
Depends on: 397815
Status: NEW → ASSIGNED
Attachment 290987 [details], the example I mentioned in bug 406322, has no DNS problem, and works fine if my DNS resolves ".ایران" tld.  It has just a LRM after the slash between host and path.
Attached patch trunk patch (obsolete) — Splinter Review
Attachment #296749 - Flags: review?(gavin.sharp)
Attached patch better trunk patch (obsolete) — Splinter Review
Attachment #296749 - Attachment is obsolete: true
Attachment #296763 - Flags: review?(gavin.sharp)
Attachment #296749 - Flags: review?(gavin.sharp)
Shouldn't we be displaying them escaped, per comment 2, rather than removing theme entirely?
(In reply to comment #2)
> IMO (based on RFC 3987) we should always escape bidi formatting characters, so
> the expected results are http://mozilla.org/%E2%80%AEhtap?yreuq%E2%80%AC and
> http://%E2%80%AEhost.evil.com/gro.doog%E2%80%AC%

The links in comment 0 don't include the trailing "%E2%80%AC%", so are those really the expected results?

If I tweak dao's patch to be:

+      // always encode bidirectional formatting characters (RFC 3987 section 4.1 paragraph 6)
+      value = value.replace(/[\u200e\u200f\u202a\u202b\u202c\u202d\u202e]/g, encodeURIComponent);

then I get http://mozilla.org/%E2%80%AEhtap?yreuq and http://%e2%80%aehost.evil.com/gro.doog .
(In reply to comment #15)
> Shouldn't we be displaying them escaped, per comment 2, rather than removing
> theme entirely?

The way I understand RFC 3987 ("IRIs MUST NOT contain bidirectional formatting characters [...] They affect the visual rendering of the IRI but do not appear themselves") and comment 0 ("directional formatting characters should not influence the display") is that we should just ignore them. That's also what the status bar does (because XUL lacks bidi support?).
I'm just concerned about stripping out information from the URL bar that's actually been sent in the request. Seems like it could be confusing to have two different pages loaded that have the same displayed URL.
Sorry if it's out-of-topic.  It's not a solution, but a way to don't let user get confised.

Can we limit the allowed characters of the input text-box of address-bar?  For example applying a simple character search on every change?

This helps user to understand she cannot input such characters there.
(In reply to comment #16)
> The links in comment 0 don't include the trailing "%E2%80%AC%", so are those
> really the expected results?

That may just be an artefact of bugzilla's linkification. I assumed that they were intended as part of the URLs
(In reply to comment #17)
> The way I understand RFC 3987 ("IRIs MUST NOT contain bidirectional formatting
> characters [...] They affect the visual rendering of the IRI but do not appear
> themselves") and comment 0 ("directional formatting characters should not
> influence the display") is that we should just ignore them. 

See section 3.2 of the RFC:

4. Re-percent-encode all octets produced in step 3 that in UTF-8
      represent characters that are not appropriate according to
      sections 2.2, 4.1, and 6.1.
Ok, thanks.
Attachment #296763 - Attachment is obsolete: true
Attachment #297030 - Flags: review?(gavin.sharp)
Attachment #296763 - Flags: review?(gavin.sharp)
Comment on attachment 297030 [details] [diff] [review]
encode bidi formatting characters, trunk version

encodeURIComponent seems like a better choice semantically (and perhaps more performant, though the difference is almost certainly negligible).
Attachment #297030 - Flags: review?(gavin.sharp) → review+
For the record, the patch needs an update anyway, so I'm waiting for bug 412458.
updated and using encodeURIComponent
Attachment #297030 - Attachment is obsolete: true
Keywords: checkin-needed
Checking in browser/base/content/browser.js;
/cvsroot/mozilla/browser/base/content/browser.js,v  <--  browser.js
new revision: 1.940; previous revision: 1.939
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
I filed bug 415491 about potentially moving this code into UnescapeURIForUI, at least for the status bar which already uses that function.
Uh..  Yeah.  This should have been done in UnescapeURIForUI, thus avoiding bug 460927 as well.

Is this when I get to mutter about sr in browser/?
I'd love to have a discussion (in some more appropriate forum) about the merits and drawbacks of requiring SR in /browser, but frankly it's rather annoying to see "SR would have caught this" comments any time you find a flaw in a patch that wasn't SRed.

In this case specifically, I don't think SR feedback would have significantly changed the outcome. We chose to take a specific, simpler fix and filed a bug on the more involved general fix, which I think is a net win over the situation we were in before.
(Oops, didn't notice bzbarsky wasn't CCed when I replied to his comment.)
I'm not sure filing and forgetting security bugs is necessarily a win, especially since the bug that was filed wasn't even flagged as a security bug.  Once this one got opened anyone who cared could have figured out how to exploit other uses of UnescapeURIForUI...
And by no means do I say this "any time I find a flaw".  I say it when I find specific "failure to really think about other parts of the code outside this little piece here too" flaws, which are what sr is meant to deal with.

I should note that there was no indication in the bug that the more general fix was even considered as a way of fixing this problem until a week after the patches landed....
(In reply to comment #32)
> I'm not sure filing and forgetting security bugs is necessarily a win,
> especially since the bug that was filed wasn't even flagged as a security bug.

Fair enough - "filing and forgetting" and not recognizing the security implications is obviously my fault. I don't think any of the other callers are quite as critical as the location bar is, though.

(In reply to comment #33)
> And by no means do I say this "any time I find a flaw".  I say it when I find
> specific "failure to really think about other parts of the code outside this
> little piece here too" flaws, which are what sr is meant to deal with.

You are absolutely justified in pointing out any cases of "failure[s] to really think about other parts of the code" by reviewers, but I don't really agree with the implication that SR is the only way to obtain that, and I'm not sure it's a fair characterization of what went on in this bug given that I did consider the other code, and even filed a bug about it.

I mostly commented because bug 395144 (where you made a similar comment, for a patch that actually was reviewed by an SR) was recent in my mind. 

> I should note that there was no indication in the bug that the more general fix
> was even considered as a way of fixing this problem until a week after the
> patches landed....

That's mostly because our URL bar doesn't use UnescapeURIForUI, so changing it wouldn't have actually fixed this bug. That's something else that we need to investigate...
OK, point taken.  I should probably also make it clearer when I'm being tongue-in-cheek (which in comment 29) I actually was...
Is this a problem in SeaMonkey 1.1, too? If not we can definitely unhide this bug. If it is a problem, is it going to get fixed or could we open this bug anyway.

It's potentially even a problem on the 1.9-based SeaMonkey 2 since this bug and patch are in browser.js -- SeaMonkey uses different files which may or may not contain the same mistake.
Whiteboard: [sg:low spoof] → [sg:low spoof][need to check SeaMonkey before unhiding]
Neil should know better than me - is this a problem on SeaMonkey (trunk as well as 1.1)?
(In reply to comment #36)
> Is this a problem in SeaMonkey 1.1, too?
It's potentially a problem according to bug 425280 comment #33 although I was only able to reproduce the second spoof.

> It's potentially even a problem on the 1.9-based SeaMonkey 2 since this bug and
> patch are in browser.js -- SeaMonkey uses different files which may or may not
> contain the same mistake.
Fortunately this patch had already landed before we ported bug 366797 to SeaMonkey 2 so we incorporated it in to our code.
Whiteboard: [sg:low spoof][need to check SeaMonkey before unhiding] → [sg:low spoof]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: