Closed Bug 295991 Opened 20 years ago Closed 19 years ago

Internet Keywords: search doesn't encode characters correctly

Categories

(Core :: DOM: Navigation, defect)

x86
Windows XP
defect
Not set
normal

RESOLVED FIXED

People

(Reporter: mossop, Assigned: mossop)

References

Details

(Keywords: fixed1.8.1)

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050529 Firefox/1.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050529 Firefox/1.0+

When performing a search from the location bar some of the characters aren't
sent to google correctly. The pages that these searches go to may change over time.


Reproducible: Always

Steps to Reproduce:
1. Type "c++" into the location bar (without the quotes)
2. Browser goes to a C-SPAN page.
3. Go to google and type "c++" into the search.
4. Google shows a c++ language tutorial as the first entry.
5. Go to google and type "c  " into the search.
6. Google shows C-SPAN as the first entry.

Actual Results:  
Firefox appears to send the keyword characters almost as is. If I had to
guess, I'd say that it only replaces spaces with +'s, while in fact there are
numerous other characters that should be encoded, particularly the + and %
characters.

In fact, if you type "%2f" into the location bar, it does a google search for
"/". The browser isn't encoding the %2f like it should, and google is then
correctly decoding %2f as the / character.
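
To illustrate why the unencoded input is misread, here is a minimal Python
sketch of how a server applying application/x-www-form-urlencoded decoding
would interpret the raw query text. It uses the standard library's
urllib.parse.unquote_plus as a stand-in. That Google decodes queries exactly
this way is an assumption, and decode_query is a hypothetical helper name.

```python
from urllib.parse import unquote_plus


def decode_query(raw):
    """Decode a form-urlencoded value: '+' becomes a space, %XX becomes a character."""
    return unquote_plus(raw)


# Raw text as the browser reportedly sends it, paired with what the server sees.
for raw in ["c++", "%2f"]:
    print(repr(raw), "->", repr(decode_query(raw)))
```

Under these assumptions the two plus signs decode to two spaces, which would
explain why typing "c++" in the location bar behaves like searching for "c  ".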

Expected Results:  
Firefox should encode the query according to the
application/x-www-form-urlencoded encoding format.
Gotta love the name of the function that appears to handle keyword searches:
MangleKeywordIntoHTTPURL

http://lxr.mozilla.org/mozilla/source/netwerk/protocol/keyword/src/nsKeywordProtocolHandler.cpp#100
Status: UNCONFIRMED → NEW
Ever confirmed: true
Following are some examples. The first text is the search typed. The second is
the part of the URL that was generated and sent to Google. The third is what
should have been generated for correct form encoding. All quote marks are added.

"c++"             "c++"                   "c%2b%2b"
"c  "             "c"                     "c++"
"this is a test"  "this%20is%20a%20test"  "this+is+a+test"
"c%2f"            "c%2f"                  "c%252f"

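The expected column above can be reproduced with a short Python sketch, assuming
the standard library's urllib.parse.quote_plus is an acceptable model of
application/x-www-form-urlencoded encoding. form_encode is a hypothetical name
used only for this illustration, not a function from the Mozilla source.

```python
from urllib.parse import quote_plus


def form_encode(typed):
    """Form-encode a typed search: spaces become '+', reserved characters become %XX."""
    return quote_plus(typed)


# The four searches from the examples above.
for typed in ["c++", "c  ", "this is a test", "c%2f"]:
    print(repr(typed), "->", repr(form_encode(typed)))
```

Note that quote_plus writes hex digits in upper case ("c%2B%2B"), whereas the
examples above use lower case ("c%2b%2b"). The two forms are equivalent in
percent-encoding.
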
It seems that when you type into the URL bar the browser automatically escapes
any characters that cannot exist in a URL (commonly spaces). The result is then
considered to be the final escaped URL. That is correct for most URLs, but not
for ones that become keyword searches.

For keyword searches, whatever is typed into the url bar should be completely
escaped, not just the invalid characters.
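
The difference between the two approaches can be sketched in Python. This is
only a model of the behaviour described in this comment, not the browser's
code: the URL_LEGAL set, the whitespace trimming, and both function names are
assumptions chosen so that the first function matches the observed output.

```python
from urllib.parse import quote, quote_plus

# Assumption: characters left untouched because they can legally appear in a URL.
URL_LEGAL = "!#$%&'()*+,/:;=?@[]"


def escape_invalid_only(typed):
    """Model of the observed behaviour: escape only characters that cannot be in a URL."""
    # Assumption: surrounding whitespace is trimmed before escaping.
    return quote(typed.strip(), safe=URL_LEGAL)


def escape_completely(typed):
    """Model of the proposed behaviour: form-encode everything that was typed."""
    return quote_plus(typed)


for typed in ["c++", "this is a test", "c%2f"]:
    print(repr(typed), escape_invalid_only(typed), escape_completely(typed))
```

In this model the existing % and + characters pass through the first function
unchanged, which is exactly the case where the search engine later decodes
them into something the user did not type.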

Google is fairly forgiving on most of these issues. In fact, I imagine that
with the majority of search engines you would only encounter problems if you
include a % or a + in your search.
I have a patch in progress for this.
Assignee: nobody → dave.townsend
This solves the problem for me, but I need to investigate conflicts elsewhere
before getting it reviewed.
When the uri hits the protocol handler it probably should already be escaped so
unescaping there is not right. This puts back in the escaping that was taken
out by bug 261608 and changes it to escape as a query string rather than a
path.
Component: Location Bar and Autocomplete → Embedding: Docshell
Product: Firefox → Core
Version: unspecified → Trunk
QA Contact: location.bar → davidpjames
Have to escape spaces to %20 in the first place so that the unescape in the
protocol handler converts them back to spaces. Then we correctly convert them
to pluses for addition into the keyword query.
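
As a rough illustration of the space handling described here, the following
Python sketch walks one search through the three steps. It is not the patch
itself: the source does not say exactly which characters the first step
escapes, so the use of quote(..., safe="") is an assumption, and
space_round_trip is a hypothetical name.

```python
from urllib.parse import quote, quote_plus, unquote


def space_round_trip(typed):
    """Return the text after each of the three steps described above."""
    escaped = quote(typed, safe="")   # step 1: spaces are escaped to %20
    unescaped = unquote(escaped)      # step 2: the handler's unescape restores spaces
    query = quote_plus(unescaped)     # step 3: spaces become '+' for the query
    return escaped, unescaped, query


for step in space_round_trip("this is a test"):
    print(repr(step))
```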

Might possibly want to check if the keyword url is a query though I can't
imagine anyone wanting anything else.
Attachment #185658 - Attachment is obsolete: true
This is a duplicate of 316863. Problem still occurs on Firefox 1.5.0.1 Windows.
(In reply to comment #7)
> This is a duplicate of 316863. Problem still occurs on Firefox 1.5.0.1 Windows.
> 

No, it is not. That bug is about keyword searches from bookmarks; this bug is about the default keyword search. They are handled in different ways.
Fixed on trunk by patch for bug 245597.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Bug 245597 was checked into branch, so this should be fixed there now.
Keywords: fixed1.8.1
Thanks for tackling this. I think there are a couple of old dupes about this, I'll try to lasso them in.
Depends on: 245597
Summary: Keyword search doesn't encode characters correctly → Internet Keywords: search doesn't encode characters correctly
QA Contact: davidpjames → docshell