258223 - Bookmark keyword quicksearch needs a way to specify character encoding for query URLs

Reporter

Description

21 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616

Using the quicksearch feature in Mozilla and Firefox to search for non-ASCII search items often results in the search item being sent to the website in a non-universal encoding. I cannot tell which encoding it is being sent in, but sending it in UTF-8 would be nice.

Reproducible: Always

Steps to Reproduce:
1. Define a quicksearch with a keyword (for example, "g" for http://www.google.com/search?q=%s).
2. Use the feature: type "g [searchitem]" in the address bar. ([searchitem] must be non-English; I am testing this feature using Chinese and Japanese.)
3. Press Enter to activate the feature.

Actual Results:
The query is sent in the wrong encoding and Google searches for "????" instead.

Expected Results:
The search website is sent properly encoded (UTF-8) queries.

Several notes: I am providing some search items that can be used for debugging. The Japanese search item would be "凪" and the Chinese search item should be "东" (these are characters that do not exist in the other language). Interestingly, all of the Chinese searches come out all right. Even more interestingly, if the Chinese character is coupled with the Japanese one ("g 东凪"), the Japanese character is sent properly. HOWEVER, if Japanese is sent alone, Google receives the query in some strange encoding. I use the Japanese version of Windows, and the same problem was confirmed in both Mozilla 1.7 and Firefox 0.9.3. As a final usage note, the address bar search in Mozilla AND the Firefox searchbox perform these searches flawlessly.

On to some speculation: I believe the cause of this problem is that the quicksearch feature probably involves two encodings. One is the encoding of the search item, and the other is the encoding requested from the website. This feature needs to *force* the query items to be interpreted as UTF-8, which it is not doing right now, even though the website requests are all in UTF-8. When the search string is Japanese, the quicksearch feature sends the item in the default OS encoding (probably EUC-JP), while the page request specifies UTF-8. This results in Google decoding the search item as UTF-8 even though it is encoded in EUC-JP. In the case where Chinese or Chinese + Japanese is sent, the default encoding cannot express the Chinese characters, so the query is sent as UTF-8, which is accepted and processed properly.

*) I am marking this bug as major because its existence implies that all non-English users of Mozilla / Firefox cannot use this feature. It also means that for these users Firefox has no address bar search functionality at all and they must rely on the searchbox. This becomes even more pronounced as Firefox replaces Mozilla as the standard browser.

*) This bug is probably related to bug 205652.
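The speculation above can be illustrated with a small sketch. It is only a model of the suspected behaviour, not Mozilla's actual code: it assumes the browser first tries a legacy OS charset and falls back to UTF-8 when a character cannot be represented. The names `encode_like_quicksearch` and `decode_as_utf8` are hypothetical, and Shift_JIS is used purely as an example of a Japanese OS charset (the guess above is EUC-JP; the charset is a parameter).

```python
from urllib.parse import quote, unquote_to_bytes


def encode_like_quicksearch(term: str, os_charset: str = "shift_jis") -> str:
    """Percent-encode `term` in the OS charset, falling back to UTF-8.

    Hypothetical model of the suspected behaviour, not the real implementation.
    """
    try:
        raw = term.encode(os_charset)
    except UnicodeEncodeError:
        # The legacy charset cannot represent the term, so use UTF-8 instead.
        raw = term.encode("utf-8")
    return quote(raw, safe="")


def decode_as_utf8(escaped: str) -> str:
    """Decode a percent-encoded query the way a UTF-8-only server would."""
    return unquote_to_bytes(escaped).decode("utf-8", errors="replace")


for term in ("凪", "东", "东凪"):
    escaped = encode_like_quicksearch(term)
    # Print the escaped query and whether a UTF-8 server recovers the term.
    print(escaped, decode_as_utf8(escaped) == term)
```

Under these assumptions only the Japanese-only term fails to round-trip, which matches the pattern reported: Chinese and mixed terms arrive intact, while Japanese alone is garbled.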

Boris Zbarsky [:bzbarsky]

Comment 1

21 years ago

Interesting. The testcases in this bug workforme with an English Win98 system.... Can someone point me to where bookmark keywords are implemented? I thought the "keyword" protocol handler handled that, but it looks like that's a different beastie....

Ling Qi

Reporter

Comment 2

21 years ago

Small comment: Please view this page in UTF-8 encoding to see the test cases properly.

Ling Qi

Reporter

Comment 3

21 years ago

Some additional comments:

凪: Japanese native, JIS code 4664, Shift_JIS code 93E2, Unicode 51EA
东: Chinese (simplified), Unicode 4E1C. Does not exist in JIS code pages.

I apparently made a mistake earlier. The Japanese character is sent as Shift_JIS but interpreted as UTF-8.

Using the quicksearch feature for Japanese, the following results are obtained (in Firefox):
1) Type "g 凪" and press Enter.
2) The address bar performs the substitution to "http://www.google.com/search?q=凪".
3) The URL from step 2 quickly changes to "http://www.google.com/search?q=%93%E2" (encoded in Shift_JIS).

Using the same feature and searching for Chinese:
1) Type "g 东" and press Enter.
2) The address bar performs the substitution to "http://www.google.com/search?q=东".
3) The URL from step 2 quickly changes to "http://www.google.com/search?q=%E4%B8%9C". I have no idea what encoding that is, but it is sent through properly.

*) If not using quicksearch but rather Firefox's searchbox (same with the Mozilla address bar search), the address becomes "http://www.google.com/search?q=%E5%87%AA" directly.

Probably the replace function in quicksearch (s/%s/[searchterm]) should have code that converts [searchterm] to the proper encoding before sending it to the address bar for page retrieval.
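The suggested conversion of the search term before it replaces %s might look roughly like the following. This is a sketch in Python rather than the browser's own JavaScript; the helper `expand_keyword` and its `charset` parameter are hypothetical and only illustrate the idea of encoding first and substituting second.

```python
from urllib.parse import quote


def expand_keyword(template: str, term: str, charset: str = "utf-8") -> str:
    """Replace %s in a keyword bookmark URL with the percent-encoded term.

    Hypothetical helper: encoding happens before substitution, so the
    resulting URL does not depend on the OS default charset.
    """
    escaped = quote(term, safe="", encoding=charset)
    return template.replace("%s", escaped)


template = "http://www.google.com/search?q=%s"
print(expand_keyword(template, "凪"))               # UTF-8, as the searchbox sends it
print(expand_keyword(template, "凪", "shift_jis"))  # what quicksearch was seen to send
```

With the default charset the sketch produces the same %E5%87%AA URL that the searchbox sends, and with Shift_JIS it reproduces the %93%E2 URL observed for quicksearch. For reference, %E4%B8%9C is the UTF-8 encoding of 东 (U+4E1C). Keeping the charset as a parameter leaves room for a per-bookmark encoding, should a site expect something other than UTF-8.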

Ling Qi

Reporter

Comment 4

21 years ago

(In reply to comment #3)
> *) if not using quicksearch but rather Firefox's searchbox (same with moz
> address bar search), the address will become
> "http://www.google.com/search?q=%E5%87%AA" directly.

This is for the Japanese test case 凪.

neil@parkwaycc.co.uk

Comment 5

21 years ago

Attached patch: Possible fix (obsolete)

I can't actually test this because I don't have any Japanese or Chinese fonts.

Boris Zbarsky [:bzbarsky]

Comment 6

21 years ago

CCing intl folks and darin. Since bookmark keywords are handled outside the core, they run up against the code added in bug 130393 (which encodes URIs for certain schemes in the OS charset instead of UTF-8 because servers at the time didn't deal at all well with UTF-8). Is that situation still present? If not, can we remove that code? That said, Neil's fix should do a decent job for this particular bug, I think... Ling Qi, could you possibly test it? (Testing doesn't require a build environment; just modify the navigator.js in the jar in the chrome dir of a Mozilla install.)

Assignee: p_ch → jag

Status: UNCONFIRMED → NEW

Component: Bookmarks → XP Apps

Ever confirmed: true

QA Contact: seamonkey.bookmarks → pawyskoczka

Ling Qi

Reporter

Comment 7

21 years ago

Patch 158088 was a charming fix for Mozilla. For 1.7 (sorry, I didn't have time to install 1.7.2), I changed line 1431 (the diff shows 1495), which fixed the problem for Mozilla. I cannot find the same line for Firefox, but in /content/browser/browser.js the function getShortcutOrURI(aURL, aPostDataRef) seems to be the one related to this bug (0.9.3 release code). So Firefox remains untested.

Possible fix; 21 years ago; neil@parkwaycc.co.uk; 773 bytes, patch
patch; 21 years ago; Jungshik Shin; 3.41 KB, patch
firefox patch v1; 21 years ago; Jungshik Shin; 3.18 KB, patch
patch v2 (per cbie's comment); 21 years ago; Jungshik Shin; 3.10 KB, patch
firefox patch v2; 21 years ago; Jungshik Shin; 2.77 KB, patch
patch v3; 21 years ago; Jungshik Shin; 2.49 KB, patch; neil: superreview+
patch for firefox (1.0); 21 years ago; Jungshik Shin; 2.29 KB, patch
new patch for firefox; 20 years ago; Jungshik Shin; 5.76 KB, patch; neil: superreview-
browser.js patch for firefox 1.0.4; 20 years ago; wyns_sh; 3.76 KB, patch
seamonkey patch; 20 years ago; Jungshik Shin; 5.03 KB, patch
firefox patch; 20 years ago; Jungshik Shin; 7.57 KB, patch; vlad: review+, neil: superreview+, asa: approval1.8b4+
seamonkey patch with the missing function and a typo fixed; 20 years ago; Jungshik Shin; 5.69 KB, patch; neil: superreview+, asa: approval1.8b4+