should url bar autocomplete results decode UTF-8 encoded urls?

RESOLVED FIXED in Firefox 3 beta3

Status

()

Firefox
Address Bar
P3
normal
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: tomer, Assigned: Mardak)

Tracking

({intl})

Trunk
Firefox 3 beta3
Points:
---
Dependency tree / graph
Bug Flags:
blocking-firefox3 +
in-testsuite +
in-litmus -

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

10 years ago
Created attachment 292675 [details]
Screenshot of he.wikipedia.org as suggestion

I've just found that while the new address bar can show %-encoded URLs in the address bar, the suggestions panel right after it show them much like the address bar in Firefox 2.

Steps to reproduce:
a. enter any non-latin1 wikipedia site (he.wikipedia.org and ar.wikipedia.org are good examples. 
b. watch the address bar become readable for the whole address (the translation of  Main_Page (en) as the landing page name).
c. try to access that address again; you'll see the address encrypted in the suggestion panel.
(Reporter)

Updated

10 years ago
Component: he-IL / Hebrew → Places
Product: Mozilla Localizations → Firefox
QA Contact: hebrew.he → places
Target Milestone: --- → Firefox 3 M11
(Reporter)

Updated

10 years ago
Flags: blocking-firefox3?
(Reporter)

Updated

10 years ago
Component: Places → Location Bar and Autocomplete
QA Contact: places → location.bar
morphing bug slightly.

Tomer points out in the original bug report that Firefox 2 had the same issue.

Additionally, setting browser.urlbar.richResults to false, that this isn't a new problem with rich results.

Note that when doing autocomplete, if you were to match against the encoded url, you'd currently have to type the encoded text (%D7%A2...) to match the url, and that typing the decoded hebrew (עמוד_ראשי) would not work.

We have some additional bugs about encoding / decoding and escaping / unescaping with places and with the url bar.
Summary: Firefox3 address bar don't show UTF-8 encoded text in suggestions addresses → should url bar autocomplete results decode UTF-8 encoded urls?

Updated

10 years ago
Version: unspecified → Trunk

Updated

10 years ago
Keywords: intl

Updated

10 years ago
Flags: blocking-firefox3? → blocking-firefox3+
Priority: -- → P3
> We have some additional bugs about encoding / decoding and escaping /
> unescaping with places and with the url bar.

see bug #389465 – Places should decode URLs
see bug #391691 – escaped and unescaped versions of URI are both in history
(Assignee)

Comment 3

10 years ago
Created attachment 300793 [details] [diff] [review]
v1

This fixes things up in the back end so it'll also fix bug 413784, bug 391175, bug 415187, bug 389465.

This patch is applied to the current trunk with no dependencies.
Assignee: nobody → edilee
Status: NEW → ASSIGNED
Attachment #300793 - Flags: review?(dietrich)
(Assignee)

Comment 4

10 years ago
Created attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

Added Erwan's testcase from bug 413784.
Attachment #300793 - Attachment is obsolete: true
Attachment #300796 - Flags: review?(dietrich)
Attachment #300793 - Flags: review?(dietrich)
Comment on attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

so, r=me for the code change, looks ok. however, i'm super wary of regressing awesomeness this late in the cycle, so can you please confirm:

1. doesn't regress existing tests

2. doesn't regress perf - i'd like to know exactly how much this'll cost.

ensuring solvency on both issues will increase potential driver acceptance.
Attachment #300796 - Flags: review?(dietrich) → review+
(Assignee)

Comment 6

10 years ago
(In reply to comment #5)
> 1. doesn't regress existing tests
Unit tests for places pass.

> 2. doesn't regress perf - i'd like to know exactly how much this'll cost.
I've got a tryserver build building.
(Assignee)

Comment 7

10 years ago
(In reply to comment #5)
> 2. doesn't regress perf - i'd like to know exactly how much this'll cost.
I'm printing out runtime numbers for FullHistorySearch with the default prefs (25max, 100chunk, 100timeout) on a debug build:

Typing out "mozilla" one letter at a time: (amount of time in FullHistorySearch)
Without patch: 4ms, 6ms, 6ms, 8ms, 6ms, 8ms, 8ms
With patch: 4ms, 7ms, 7ms, 7ms, 9ms, 7ms, 7ms

Adding on ".org" to "bugzilla.mozilla" one character at a time:
Without patch: 9ms, 9ms, 10ms, 9ms
With patch: 10ms, 10ms, 9ms, 10ms

Remember these are debug builds, so it's going to be significantly slower than a release build. But they're pretty small already.. so not much to worry about!
(Assignee)

Comment 8

10 years ago
Comment on attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

Target Firefox3 beta3, blocking-firefox3+, passes unit tests, add unit test, no/minimal performance impact

Huge gain in functionality for international languages especially asian ones like Chinese and Japanese, etc. English users even benefit because now you can both see and search for "@", ":", and other escaped symbols.
Attachment #300796 - Flags: approval1.9b3?
(Assignee)

Comment 9

10 years ago
Created attachment 300836 [details]
screenshot of trunk and v1
Comment on attachment 300796 [details] [diff] [review]
v1 w/ Erwan's unit test

a=beltzner
Attachment #300796 - Flags: approval1.9b3? → approval1.9b3+
(Assignee)

Updated

10 years ago
Blocks: 413784
(Assignee)

Updated

10 years ago
Blocks: 391175
(Assignee)

Updated

10 years ago
Blocks: 415187
(Assignee)

Updated

10 years ago
Blocks: 389465
(Assignee)

Comment 11

10 years ago
Checking in toolkit/components/places/src/nsNavHistoryAutoComplete.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryAutoComplete.cpp,v  <--  nsNavHistoryAutoComplete.cpp
new revision: 1.37; previous revision: 1.36
done
Status: ASSIGNED → RESOLVED
Last Resolved: 10 years ago
Flags: in-testsuite+
OS: Linux → All
Hardware: PC → All
Resolution: --- → FIXED

Comment 12

10 years ago
This patch isn't quite right, but will work in most cases. Not all sites use UTF-8 for the encoding, as assumed by this patch.

I think there is some code (maybe for the status bar) that will do some guessing. A better solution would be to store the encoding in the DB (the URL object knows it in many cases) so you know how to unescape it.

Comment 13

10 years ago
Also, what about IDN? That might be a different bug, but it should also work.
(Assignee)

Comment 14

10 years ago
If so, we'll need to change how we store the URIs to begin with because right now nsNavHistory.cpp calls BindStatementURI which gets the UTF8 spec from the URI.
(Assignee)

Comment 15

10 years ago
Actually, is that a problem at all? If the URI is not using UTF-8 encoding, the URI object knows how to give the correct UTF-8 encoding. We store the UTF-8 encoding in the database independent of the original encoding. (Both encodings should be valid.) What this patch does is decoding the value from the database.

Comment 16

10 years ago
See the origin charset:
  http://lxr.mozilla.org/seamonkey/source/netwerk/base/public/nsIURI.idl#223
I think but am not sure, if the origin charset is non UTF-8 for the URL when it was added to history, this will break.

I think to fix this would require storing URLs differently, which sucks and may not be practical.
Flags: in-litmus-

Comment 17

10 years ago
Could this patch have caused bug 415397?
(Assignee)

Updated

10 years ago
Blocks: 416214

Comment 18

10 years ago
Was this bug supposed to refer to text strings that were manually entered into the address bar?  I ask because currently, if I paste the following URL string into the address bar, http://forums.mozillazine.org/viewtopic.php%3fp%3d3279302#3279302 , the page cannot be found, because Firefox isn't decoding the Javascript encoded URL.

It would be very helpful if the address bar translated the address typed into it at the point of executing the Go button behavior, in this case, to the following: http://forums.mozillazine.org/viewtopic.php?p=3279302#3279302

I run into a site daily that doesn't translate their Javascript referral string URL that is in the form of the first example to that of the second example.  That has me manually copying the URL part of the referral string and pasting it into the following website to decode it, and then pasting the decoded URL back into the address bar: http://meyerweb.com/eric/tools/dencoder/

(Assignee)

Comment 19

10 years ago
If you're typing in or copy/pasting stuff to enter in the location bar, then it's not related to this bug. The first url you provided is a valid url that requests a page viewtopic.php%3fp%3d3279302, but you probably wanted it to request the page viewtopic.php with query parameter p=....

Updated

10 years ago
Depends on: 421980

Updated

10 years ago
Depends on: 415397
You need to log in before you can comment on or make changes to this bug.