Closed Bug 407974 Opened 17 years ago Closed 17 years ago

should url bar autocomplete results decode UTF-8 encoded urls?

Categories

(Firefox :: Address Bar, defect, P3)

defect

Tracking

()

RESOLVED FIXED
Firefox 3 beta3

People

(Reporter: tomer, Assigned: Mardak)

References

()

Details

(Keywords: intl)

Attachments

(3 files, 1 obsolete file)

I've just found that while the new address bar can show %-encoded URLs in the address bar, the suggestions panel right after it show them much like the address bar in Firefox 2.

Steps to reproduce:
a. enter any non-latin1 wikipedia site (he.wikipedia.org and ar.wikipedia.org are good examples. 
b. watch the address bar become readable for the whole address (the translation of  Main_Page (en) as the landing page name).
c. try to access that address again; you'll see the address encrypted in the suggestion panel.
Component: he-IL / Hebrew → Places
Product: Mozilla Localizations → Firefox
QA Contact: hebrew.he → places
Target Milestone: --- → Firefox 3 M11
Flags: blocking-firefox3?
Component: Places → Location Bar and Autocomplete
QA Contact: places → location.bar
morphing bug slightly.

Tomer points out in the original bug report that Firefox 2 had the same issue.

Additionally, setting browser.urlbar.richResults to false, that this isn't a new problem with rich results.

Note that when doing autocomplete, if you were to match against the encoded url, you'd currently have to type the encoded text (%D7%A2...) to match the url, and that typing the decoded hebrew (עמוד_ראשי) would not work.

We have some additional bugs about encoding / decoding and escaping / unescaping with places and with the url bar.
Summary: Firefox3 address bar don't show UTF-8 encoded text in suggestions addresses → should url bar autocomplete results decode UTF-8 encoded urls?
Version: unspecified → Trunk
Keywords: intl
Flags: blocking-firefox3? → blocking-firefox3+
Priority: -- → P3
> We have some additional bugs about encoding / decoding and escaping /
> unescaping with places and with the url bar.

see bug #389465 – Places should decode URLs
see bug #391691 – escaped and unescaped versions of URI are both in history
Attached patch v1 (obsolete) — Splinter Review
This fixes things up in the back end so it'll also fix bug 413784, bug 391175, bug 415187, bug 389465.

This patch is applied to the current trunk with no dependencies.
Assignee: nobody → edilee
Status: NEW → ASSIGNED
Attachment #300793 - Flags: review?(dietrich)
Added Erwan's testcase from bug 413784.
Attachment #300793 - Attachment is obsolete: true
Attachment #300796 - Flags: review?(dietrich)
Attachment #300793 - Flags: review?(dietrich)
Comment on attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

so, r=me for the code change, looks ok. however, i'm super wary of regressing awesomeness this late in the cycle, so can you please confirm:

1. doesn't regress existing tests

2. doesn't regress perf - i'd like to know exactly how much this'll cost.

ensuring solvency on both issues will increase potential driver acceptance.
Attachment #300796 - Flags: review?(dietrich) → review+
(In reply to comment #5)
> 1. doesn't regress existing tests
Unit tests for places pass.

> 2. doesn't regress perf - i'd like to know exactly how much this'll cost.
I've got a tryserver build building.
(In reply to comment #5)
> 2. doesn't regress perf - i'd like to know exactly how much this'll cost.
I'm printing out runtime numbers for FullHistorySearch with the default prefs (25max, 100chunk, 100timeout) on a debug build:

Typing out "mozilla" one letter at a time: (amount of time in FullHistorySearch)
Without patch: 4ms, 6ms, 6ms, 8ms, 6ms, 8ms, 8ms
With patch: 4ms, 7ms, 7ms, 7ms, 9ms, 7ms, 7ms

Adding on ".org" to "bugzilla.mozilla" one character at a time:
Without patch: 9ms, 9ms, 10ms, 9ms
With patch: 10ms, 10ms, 9ms, 10ms

Remember these are debug builds, so it's going to be significantly slower than a release build. But they're pretty small already.. so not much to worry about!
Comment on attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

Target Firefox3 beta3, blocking-firefox3+, passes unit tests, add unit test, no/minimal performance impact

Huge gain in functionality for international languages especially asian ones like Chinese and Japanese, etc. English users even benefit because now you can both see and search for "@", ":", and other escaped symbols.
Attachment #300796 - Flags: approval1.9b3?
Comment on attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

a=beltzner
Attachment #300796 - Flags: approval1.9b3? → approval1.9b3+
Blocks: 413784
Blocks: 391175
Blocks: 415187
Blocks: 389465
Checking in toolkit/components/places/src/nsNavHistoryAutoComplete.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryAutoComplete.cpp,v  <--  nsNavHistoryAutoComplete.cpp
new revision: 1.37; previous revision: 1.36
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Flags: in-testsuite+
OS: Linux → All
Hardware: PC → All
Resolution: --- → FIXED
This patch isn't quite right, but will work in most cases. Not all sites use UTF-8 for the encoding, as assumed by this patch.

I think there is some code (maybe for the status bar) that will do some guessing. A better solution would be to store the encoding in the DB (the URL object knows it in many cases) so you know how to unescape it.
Also, what about IDN? That might be a different bug, but it should also work.
If so, we'll need to change how we store the URIs to begin with because right now nsNavHistory.cpp calls BindStatementURI which gets the UTF8 spec from the URI.
Actually, is that a problem at all? If the URI is not using UTF-8 encoding, the URI object knows how to give the correct UTF-8 encoding. We store the UTF-8 encoding in the database independent of the original encoding. (Both encodings should be valid.) What this patch does is decoding the value from the database.
See the origin charset:
  http://lxr.mozilla.org/seamonkey/source/netwerk/base/public/nsIURI.idl#223
I think but am not sure, if the origin charset is non UTF-8 for the URL when it was added to history, this will break.

I think to fix this would require storing URLs differently, which sucks and may not be practical.
Could this patch have caused bug 415397?
Blocks: 416214
Was this bug supposed to refer to text strings that were manually entered into the address bar?  I ask because currently, if I paste the following URL string into the address bar, http://forums.mozillazine.org/viewtopic.php%3fp%3d3279302#3279302 , the page cannot be found, because Firefox isn't decoding the Javascript encoded URL.

It would be very helpful if the address bar translated the address typed into it at the point of executing the Go button behavior, in this case, to the following: http://forums.mozillazine.org/viewtopic.php?p=3279302#3279302

I run into a site daily that doesn't translate their Javascript referral string URL that is in the form of the first example to that of the second example.  That has me manually copying the URL part of the referral string and pasting it into the following website to decode it, and then pasting the decoded URL back into the address bar: http://meyerweb.com/eric/tools/dencoder/

If you're typing in or copy/pasting stuff to enter in the location bar, then it's not related to this bug. The first url you provided is a valid url that requests a page viewtopic.php%3fp%3d3279302, but you probably wanted it to request the page viewtopic.php with query parameter p=....
Depends on: 421980
Depends on: 415397
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: