Closed Bug 1121903 Opened 11 years ago Closed 2 years ago

Goodreads Search Engine returns blank page (UTF-8 params aren't correctly URL encoded)

Tracking

()

Status:

VERIFIED FIXED

Milestone:

116 Branch

Tracking Flags:

Tracking

Status

firefox116

---

verified

People

(Reporter: ioana.chiorean, Assigned: daleharvey)

References

Details

(Whiteboard: [snt-scrubbed][search-tech-debt])

Attachments

(3 files)

1121903.htm - demo 11 years ago Hallvord R. M. Steen [:hallvors] 744 bytes, text/html		Details
potential patch 11 years ago :Gavin Sharp [email: gavin@gavinsharp.com] 1.33 KB, patch		Details \| Diff \| Splinter Review
Bug 1121903 - Use encodeURIComponent for keyword url encoding. r=mak 2 years ago Dale Harvey (:daleharvey) 48 bytes, text/x-phabricator-request		Details \| Review

Ioana Chiorean

Reporter

Description

•

11 years ago

[Tracking Requested - why for this release]: Firefox 37.0a2, 38.0a1, 36.0b1 All devices Steps: 1. Go to goodreads.com and add the search engine ( near the bottom of the page). 2. Go to URL bar and search for an author/term with Goodreads search engine Expected Results: - a list of results should be displayed - consistent with the search in the page Actual Results: - a blank page is displayed - the search is run with different UFT parameters - added search engine: http://www.goodreads.com/search?utf8=%u2713&query=eminescu - in page search: http://www.goodreads.com/search?utf8=%E2%9C%93&query=eminescu

Mike Taylor [:miketaylr]

Comment 1

•

11 years ago

Interesting. %E2%9C%93 is the URL encoded hex represenation of ✓. u2713 is I guess a "source code" hex (there's probably a real word for this--code point maybe?), e.g. in Python it would be u"\u2713". Their opensearch.xml looks like so (http://www.goodreads.com/opensearch.xml): <?xml version="1.0" encoding="UTF-8"?> <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/"> <ShortName>Goodreads</ShortName> <Description>Search Goodreads.com</Description> <Tags>goodreads, books, reading</Tags> <Url type="text/html" method="GET" template="http://www.goodreads.com/search/search"> <Param name="search_type" value="books"/> <Param name="search[query]" value="{searchTerms}"/> </Url> <LongName>Goodreads Search</LongName> <Image height="16" width="16" type="image/vnd.microsoft.icon">http://www.goodreads.com/favicon.ico</Image> <Query role="example" searchTerms="goodreads" /> <Attribution>Search data copyright Goodreads, Inc</Attribution> <SyndicationRight>open</SyndicationRight> <AdultContent>false</AdultContent> <Language>en-us</Language> <OutputEncoding>UTF-8</OutputEncoding> <InputEncoding>UTF-8</InputEncoding> </OpenSearchDescription> Rails uses this utf8=✓ to detect IE bugs, IIRC. Could be that there's some kind of opensearch.xml -> actual search encoding issue. Although, in Chrome desktop if I add the search engine the URL I see is "http://www.goodreads.com/search?utf8=%E2%9C%93&query=%s". In Firefox Desktop I get "http://www.goodreads.com/search?utf8=%u2713&query=foo" as well. Needs some more investigation--this may be a core issue.

Whiteboard: [country-all][notcontactready]

Mike Taylor [:miketaylr]

Updated

•

11 years ago

Component: Mobile → Search

Product: Tech Evangelism → Firefox

Mike Taylor [:miketaylr]

Comment 2

•

11 years ago

Moving to Firefox::Search since this affects Desktop + Mobile.

Mike Taylor [:miketaylr]

Updated

•

11 years ago

Summary: Goodreads Search Engine returns blank page → Goodreads Search Engine returns blank page (UTF-8 params aren't correctly URL encoded)

Hallvord R. M. Steen [:hallvors]

Comment 3

•

11 years ago

Attached file 1121903.htm - demo — Details

Hallvord R. M. Steen [:hallvors]

Comment 4

•

11 years ago

Probably not related to opensearch.xml (?) - just a matter of how Firefox registers the values of other form elements when creating a keyword search. (This looks like an important bug for L10N. The search keyword functionality is likely broken in many locales that use non-ASCII encodings and include extra data in search forms.)

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 5

•

11 years ago

I'm confused about references to "add the search engine", OpenSearch, and keyword search. OpenSearch and keyword searches are entirely separate - which one is the problem here? Can someone clarify the exact STR for experiencing the bug? Comment 0 is a bit low on detail about what exactly "add the search engine" entails, and which query term is used.

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 6

•

11 years ago

Oh, I see it now - it's the "add keyword for this search" that's an issue. OpenSearch is not related.

Component: Search → General

Mike Taylor [:miketaylr]

Comment 7

•

11 years ago

Thanks Gavin. Here's some more clear STR: 1) navigate to http://www.goodreads.com/ 2) Right-click in search input (near bottom of page), select Add a keyword for this search 3) Add keyword via dialog (I used "gr") 4) Use keyword to do a test search via URL bar: "gr fiction" Expected: Search page results, similar to Chrome: http://www.goodreads.com/search?utf8=%E2%9C%93&query=fiction Actual: Blank page with following URL: http://www.goodreads.com/search?utf8=%u2713&query=fiction

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 8

•

11 years ago

Attached patch potential patch — Details — Splinter Review

This seems to fix this particular case, at least. The param encoding code for when the keyword is saved (https://hg.mozilla.org/mozilla-central/annotate/tip/browser/base/content/browser.js#l6650) differs slightly from the param encoding code for the search term when used (https://hg.mozilla.org/mozilla-central/annotate/tip/browser/base/content/browser.js#l2123), this patch (I think) makes them consistent.

Ioana Chiorean

Reporter

Comment 9

•

11 years ago

Gavin, Mike - just to clear this out the bug was initial for mobile and no keyword involved. Steps: 1. Go to goodreads.com 2. Scroll to the search bar ( near the bottom of the page). 3. Tap twice in the search bar so the input menu appears 4. Tap the magnifying loop with + so you can Add the search engine 2. Go to URL bar and search for an author/term with Goodreads search engine See video here: http://youtu.be/lZlNpdLWI-U

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

10 years ago

Attachment #8551974 - Flags: review?(mak77)

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

10 years ago

Assignee: nobody → gavin.sharp

Mike Taylor [:miketaylr]

Updated

•

10 years ago

Whiteboard: [country-all][notcontactready]

Marco Bonardo [:mak]

Comment 10

•

10 years ago

Comment on attachment 8551974 [details] [diff] [review] potential patch Review of attachment 8551974 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/base/content/browser.js @@ +6653,2 @@ > else > + return escapeFn(aName) + "=" + escapeFn(aValue); I'm not sure this will work as expected due to this: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#2110 2109 if (postData) 2110 escapedPostData = unescape(postData); Now, if you escape("✓"), you get %u2713 and unescape gives you ✓ But if you encodeURIComponent("✓") you get %E2%9C%93 and unescape gives you broken chars like â But there is more in the keywords code that makes me mad (and sad), maybe you have an insight about these, cause I cannot figure it out: 1. I'm not sure the story of escape vs encodeURIComponent at http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#2118 is very sound, encodeURIComponent generates pairs that should not only be valid for UTF-8, it generates hex pairs valid for an rfc uri. An urlencoded form is not only valid in utf-8 and likely it uses encodeURIComponent... 2. I'm not sure why for url encoded forms we escape "name=value", while in the other cases we escape them separately. "=" and + are allowed, so the if/else here doesn't seem to make sense, it could always escape name and value separately. But even worse, why do we escape urlencoded post data that is already escaped? This completely puzzles me. It looks like we try to build some sort of "sane" representation to store it in a place that doesn't like "="(?). The worst thing is that if we fix it, we break existing keywords. 3. since later we unescape postData that we escaped before (obtaining already-escaped-by-the-page postData), I wonder why we escaped it at all. Couldn't we store it in a ready-to-be-used state? 4. Maybe we could just use encodeURIComponent for GET urls and stop the crazy escape/encode mixup, but again, there's risk to break existing keywords :( So, maybe I'm mad, but I think this code is a pile of nonsense. It looks like it could just store postData as-is, encodeURIComponent anything else, and stop using escape. But, this thing is very hairy and I might be missing something, unfortunately who wrote this code originally is not with us anymore.

Attachment #8551974 - Flags: review?(mak77)

Marco Bonardo [:mak]

Comment 11

•

10 years ago

regardless what we do, it might be worth to extend (or dupe) browser_addKeywordSearch.js to cover this case.

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 13

•

10 years ago

Thanks for diving into this (even just a little bit). I agree that this code is a mess, but I doubt we'll be able to untangle it any time soon. I'm not sure I understand the potential issue you're describing - in my testing this patch worked, at least for the testcase in this bug. Are you suggesting that it wouldn't work for the testcase in this bug, or that it would break some other testcase? We should definitely add a test.

Flags: needinfo?(mak77)

Taylor Hunt

Comment 14

•

10 years ago

This happens with GitHub's search engine, too. Must be a Rails-wide issue.

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

9 years ago

Assignee: gavin.sharp → nobody

Component: General → Search

Marco Bonardo [:mak]

Updated

•

9 years ago

Flags: needinfo?(mak77)

greenbay.graham

Comment 15

•

8 years ago

I have this problem using Firefox 55.02(64 bit) on Ubuntu.

Taylor Hunt

Comment 16

•

8 years ago

Here’s a workaround for users with this issue. Edit the search bookmark keyword, usually by right-clicking it and selecting “Properties”. The Location field should look something like this: https://example.com/search?utf8=%u2713&q=%s The important part is the mangled “utf8=%u2713” parameter — the server expects “utf8=✓”, or it will refuse the search. If you replace “utf8=%u2713” with “utf8=✓” in the location field, it should perform the search correctly. If that didn’t help, try removing the utf8 parameter entirely — which would produce something like this: https://example.com/search?q=%s

Mark Banner (:standard8)

Updated

•

6 years ago

Priority: -- → P2

Mark Banner (:standard8)

Comment 18

•

5 years ago

Current link to the source code doing the translation: https://searchfox.org/mozilla-central/rev/927e525f481a93a8f63d27a78ae6201e42b1b1fb/browser/actors/ContextMenuChild.jsm#172-212

Severity: normal → S3

status-firefox36: affected → ---

status-firefox37: affected → ---

status-firefox38: affected → ---

Mandy Cheang [:mcheang][she/her]

Updated

•

3 years ago

Priority: P2 → P3

Whiteboard: [snt-scrubbed][search-tech-debt]

Jira Integration Bot

Updated

•

3 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/SNT-329

Mandy Cheang [:mcheang][she/her]

Comment 19

•

3 years ago

Since there was already a WIP patch for this bug, we want to continue that and verify what we're receiving in the get and post cases. So there's no further regressions.

Dale Harvey (:daleharvey)

Assignee

Updated

•

2 years ago

Assignee: nobody → dharvey

Dale Harvey (:daleharvey)

Assignee

Comment 20

•

2 years ago

Attached file Bug 1121903 - Use encodeURIComponent for keyword url encoding. r=mak — Details

Pulsebot

Comment 21

•

2 years ago

Pushed by dharvey@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8ea86587da61 Use encodeURIComponent for keyword url encoding. r=mak

Cristina Horotan [:chorotan]

Comment 22

•

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/8ea86587da61

Status: NEW → RESOLVED

Closed: 2 years ago

status-firefox116: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 116 Branch

Ardelean Oana, Desktop Test Engineering [:oardelean]

Comment 23

•

2 years ago

Reproducible on a 2023-06-21 Nightly build on Windows 10 using the STR from Comment 7.
Verified as fixed on Nightly 116.0a1(build ID: 20230626215144) on Windows 10, macOS 12, Ubuntu 22.
After submitting the search the results page is displayed properly.

Status: RESOLVED → VERIFIED

status-firefox116: fixed → verified

You need to log in before you can comment on or make changes to this bug.