Closed Bug 507815 Opened 15 years ago Closed 13 years ago

Cannot reload search page with Unicode characters

Categories

(Firefox :: Address Bar, defect)

Version: 3.6 Branch
Type: defect
Priority: Not set
Severity: major

RESOLVED WORKSFORME

People

(Reporter: ws.bugzilla, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.1) Gecko/20090715 Firefox/3.5.1 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.1) Gecko/20090715 Firefox/3.5.1 (.NET CLR 3.5.30729)

A search containing Unicode characters works when submitted through the Search box, but the URL it generates does not load the same page when it is re-submitted from the address bar.

Reproducible: Always

Steps to Reproduce:
1. Set the "Language for non-Unicode programs" to Russian
2. Type "проверка" into Google search box and hit Enter.
3. Hit Ctrl+L, Enter.
Actual Results:
"Your search - ???????? - did not match any documents". The URL bar shows "q=%EF%F0%EE%E2%E5%F0%EA%E0", which is "проверка" percent-encoded as Windows-1251 (the Russian system code page) rather than as UTF-8.

Expected Results:  
Ctrl+L, Enter should result in the exact same page as the page loaded in step 2.

I bet the fact that my "Language for non-Unicode programs", aka "System locale", is set to Russian is what causes the URL to be encoded this way. This is probably for backwards compatibility reasons.

Either way, I believe Ctrl+L, Enter, on a clean Firefox 3.5 profile, should *always* return the same page as the original search query did.
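
To illustrate the encoding guess above, here is a minimal Python sketch comparing the two percent-encodings of the query. It assumes the Russian system locale maps to the Windows-1251 code page (`cp1251` in Python); it only reproduces the bytes seen in the URL bar and says nothing about how Firefox itself builds the URL.

```python
from urllib.parse import quote

query = "проверка"

# Percent-encode with the legacy Russian code page (assumption: this is
# what the "Language for non-Unicode programs" setting selects).
legacy = quote(query, encoding="cp1251")

# Percent-encode as UTF-8, the encoding this report expects.
utf8 = quote(query, encoding="utf-8")

print(legacy)  # %EF%F0%EE%E2%E5%F0%EA%E0
print(utf8)    # %D0%BF%D1%80%D0%BE%D0%B2%D0%B5%D1%80%D0%BA%D0%B0
```

The first value matches the "q=" parameter shown in the URL bar after Ctrl+L, Enter.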
Further info: the STR causes "????????" because the resulting URL also includes "oe=utf-8". If instead I click this link: "http://www.google.co.uk/search?q=проверка", followed by Ctrl+L, Enter, I get search results for "ïðîâåðêà" instead of "проверка".
Reproduced on Windows 7 RC (build 7100) with Firefox 3.5.1 with system locale set to "Russian".

This bug occurs because Firefox assumes that the website accepts the default character set of the system locale. It would be preferable for Firefox to encode everything as UTF-8 and stop relying on legacy character sets.
I can also reproduce in Windows 7 RTM.

I filed another bug, bug 546342, which has some striking similarities, but also some differences.

In my case, manually loading http://search.yahoo.com/search?p=Ålesund or http://www.bing.com/search?q=Ålesund (by pressing Ctrl+L and Enter after following the link) causes similar problems. However, my problem seems independent of the "Language for non-Unicode programs" setting, and interestingly enough, does not cause problems with Google...
Severity: normal → major
OS: Windows Vista → All
Hardware: x86_64 → All
Version: unspecified → 3.6 Branch
Update: these problems do NOT occur on Mac OS X 10.6.
Could someone please mark this as NEW?
Still reproduces as described in 7.0a1 (2011-06-21). To restate:

1. Set the "Language for non-Unicode programs" to Russian
2. Type "проверка" into Google search box and hit Enter.
3. Hit Ctrl+L, Enter. Actual results: a search page for ???????? is displayed.
4. Click "http://www.google.co.uk/search?q=проверка", followed by Ctrl+L, Enter. Actual results: search page for "ïðîâåðêà" is displayed.

In both cases, the URL bar changes to show "q=%EF%F0%EE%E2%E5%F0%EA%E0", and the only difference is in how Google decodes this value.
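
The two different results can be reproduced from the same escaped value with a short Python sketch. Which decoding Google applies in each case is an assumption here (UTF-8 when the URL asks for it, a Latin-1-style fallback otherwise); the sketch only shows that those two decodings give the two observed strings.

```python
from urllib.parse import unquote

escaped = "%EF%F0%EE%E2%E5%F0%EA%E0"

# Decoded as UTF-8 (assumed for the "oe=utf-8" case): none of the bytes
# form a valid sequence, so each becomes U+FFFD, the replacement character.
as_utf8 = unquote(escaped, encoding="utf-8", errors="replace")

# Decoded as Latin-1 (assumed fallback for the plain link).
as_latin1 = unquote(escaped, encoding="latin-1")

# Decoded with the code page the bytes were actually produced in.
as_cp1251 = unquote(escaped, encoding="cp1251")

print(as_latin1)  # ïðîâåðêà
print(as_cp1251)  # проверка
print(len(as_utf8), set(as_utf8) == {"\ufffd"})  # 8 True
```

The eight undecodable bytes are consistent with the eight question marks in "????????", and the Latin-1 reading matches "ïðîâåðêà" exactly.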

I could not reproduce the "http://search.yahoo.com/search?p=Ålesund" example.
Confirming in 7.0a1 (2011-06-27).
Status: UNCONFIRMED → NEW
Ever confirmed: true
Mostly fixed in Firefox 8. Completely fixed in Nightly, as far as I can tell.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME