Closed
Bug 1121903
Opened 10 years ago
Closed 1 years ago
Goodreads Search Engine returns blank page (UTF-8 params aren't correctly URL encoded)
Categories
(Firefox :: Search, defect, P3)
Tracking
()
VERIFIED
FIXED
116 Branch
Tracking | Status | |
---|---|---|
firefox116 | --- | verified |
People
(Reporter: ioana.chiorean, Assigned: daleharvey)
References
Details
(Whiteboard: [snt-scrubbed][search-tech-debt])
Attachments
(3 files)
[Tracking Requested - why for this release]:
Firefox 37.0a2, 38.0a1, 36.0b1
All devices
Steps:
1. Go to goodreads.com and add the search engine ( near the bottom of the page).
2. Go to URL bar and search for an author/term with Goodreads search engine
Expected Results:
- a list of results should be displayed - consistent with the search in the page
Actual Results:
- a blank page is displayed - the search is run with different UFT parameters
- added search engine: http://www.goodreads.com/search?utf8=%u2713&query=eminescu
- in page search: http://www.goodreads.com/search?utf8=%E2%9C%93&query=eminescu
Comment 1•10 years ago
|
||
Interesting. %E2%9C%93 is the URL encoded hex represenation of ✓. u2713 is I guess a "source code" hex (there's probably a real word for this--code point maybe?), e.g. in Python it would be u"\u2713".
Their opensearch.xml looks like so (http://www.goodreads.com/opensearch.xml):
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
<ShortName>Goodreads</ShortName>
<Description>Search Goodreads.com</Description>
<Tags>goodreads, books, reading</Tags>
<Url type="text/html" method="GET" template="http://www.goodreads.com/search/search">
<Param name="search_type" value="books"/>
<Param name="search[query]" value="{searchTerms}"/>
</Url>
<LongName>Goodreads Search</LongName>
<Image height="16" width="16" type="image/vnd.microsoft.icon">http://www.goodreads.com/favicon.ico</Image>
<Query role="example" searchTerms="goodreads" />
<Attribution>Search data copyright Goodreads, Inc</Attribution>
<SyndicationRight>open</SyndicationRight>
<AdultContent>false</AdultContent>
<Language>en-us</Language>
<OutputEncoding>UTF-8</OutputEncoding>
<InputEncoding>UTF-8</InputEncoding>
</OpenSearchDescription>
Rails uses this utf8=✓ to detect IE bugs, IIRC. Could be that there's some kind of opensearch.xml -> actual search encoding issue.
Although, in Chrome desktop if I add the search engine the URL I see is "http://www.goodreads.com/search?utf8=%E2%9C%93&query=%s". In Firefox Desktop I get "http://www.goodreads.com/search?utf8=%u2713&query=foo" as well. Needs some more investigation--this may be a core issue.
Whiteboard: [country-all][notcontactready]
Updated•10 years ago
|
Component: Mobile → Search
Product: Tech Evangelism → Firefox
Comment 2•10 years ago
|
||
Moving to Firefox::Search since this affects Desktop + Mobile.
Updated•10 years ago
|
Summary: Goodreads Search Engine returns blank page → Goodreads Search Engine returns blank page (UTF-8 params aren't correctly URL encoded)
Comment 3•10 years ago
|
||
Comment 4•10 years ago
|
||
Probably not related to opensearch.xml (?) - just a matter of how Firefox registers the values of other form elements when creating a keyword search.
(This looks like an important bug for L10N. The search keyword functionality is likely broken in many locales that use non-ASCII encodings and include extra data in search forms.)
Comment 5•10 years ago
|
||
I'm confused about references to "add the search engine", OpenSearch, and keyword search. OpenSearch and keyword searches are entirely separate - which one is the problem here?
Can someone clarify the exact STR for experiencing the bug? Comment 0 is a bit low on detail about what exactly "add the search engine" entails, and which query term is used.
Comment 6•10 years ago
|
||
Oh, I see it now - it's the "add keyword for this search" that's an issue. OpenSearch is not related.
Component: Search → General
Comment 7•10 years ago
|
||
Thanks Gavin.
Here's some more clear STR:
1) navigate to http://www.goodreads.com/
2) Right-click in search input (near bottom of page), select Add a keyword for this search
3) Add keyword via dialog (I used "gr")
4) Use keyword to do a test search via URL bar: "gr fiction"
Expected:
Search page results, similar to Chrome: http://www.goodreads.com/search?utf8=%E2%9C%93&query=fiction
Actual:
Blank page with following URL: http://www.goodreads.com/search?utf8=%u2713&query=fiction
Comment 8•10 years ago
|
||
This seems to fix this particular case, at least. The param encoding code for when the keyword is saved (https://hg.mozilla.org/mozilla-central/annotate/tip/browser/base/content/browser.js#l6650) differs slightly from the param encoding code for the search term when used (https://hg.mozilla.org/mozilla-central/annotate/tip/browser/base/content/browser.js#l2123), this patch (I think) makes them consistent.
Reporter | ||
Comment 9•10 years ago
|
||
Gavin, Mike - just to clear this out the bug was initial for mobile and no keyword involved.
Steps:
1. Go to goodreads.com
2. Scroll to the search bar ( near the bottom of the page).
3. Tap twice in the search bar so the input menu appears
4. Tap the magnifying loop with + so you can Add the search engine
2. Go to URL bar and search for an author/term with Goodreads search engine
See video here: http://youtu.be/lZlNpdLWI-U
Updated•10 years ago
|
Attachment #8551974 -
Flags: review?(mak77)
Updated•10 years ago
|
Assignee: nobody → gavin.sharp
Updated•10 years ago
|
Whiteboard: [country-all][notcontactready]
Comment 10•10 years ago
|
||
Comment on attachment 8551974 [details] [diff] [review]
potential patch
Review of attachment 8551974 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/base/content/browser.js
@@ +6653,2 @@
> else
> + return escapeFn(aName) + "=" + escapeFn(aValue);
I'm not sure this will work as expected due to this:
http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#2110
2109 if (postData)
2110 escapedPostData = unescape(postData);
Now, if you escape("✓"), you get %u2713 and unescape gives you ✓
But if you encodeURIComponent("✓") you get %E2%9C%93 and unescape gives you broken chars like â
But there is more in the keywords code that makes me mad (and sad), maybe you have an insight about these, cause I cannot figure it out:
1. I'm not sure the story of escape vs encodeURIComponent at http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#2118 is very sound, encodeURIComponent generates pairs that should not only be valid for UTF-8, it generates hex pairs valid for an rfc uri. An urlencoded form is not only valid in utf-8 and likely it uses encodeURIComponent...
2. I'm not sure why for url encoded forms we escape "name=value", while in the other cases we escape them separately. "=" and + are allowed, so the if/else here doesn't seem to make sense, it could always escape name and value separately. But even worse, why do we escape urlencoded post data that is already escaped? This completely puzzles me. It looks like we try to build some sort of "sane" representation to store it in a place that doesn't like "="(?).
The worst thing is that if we fix it, we break existing keywords.
3. since later we unescape postData that we escaped before (obtaining already-escaped-by-the-page postData), I wonder why we escaped it at all. Couldn't we store it in a ready-to-be-used state?
4. Maybe we could just use encodeURIComponent for GET urls and stop the crazy escape/encode mixup, but again, there's risk to break existing keywords :(
So, maybe I'm mad, but I think this code is a pile of nonsense.
It looks like it could just store postData as-is, encodeURIComponent anything else, and stop using escape. But, this thing is very hairy and I might be missing something, unfortunately who wrote this code originally is not with us anymore.
Attachment #8551974 -
Flags: review?(mak77)
Comment 11•10 years ago
|
||
regardless what we do, it might be worth to extend (or dupe) browser_addKeywordSearch.js to cover this case.
Comment 13•10 years ago
|
||
Thanks for diving into this (even just a little bit). I agree that this code is a mess, but I doubt we'll be able to untangle it any time soon.
I'm not sure I understand the potential issue you're describing - in my testing this patch worked, at least for the testcase in this bug. Are you suggesting that it wouldn't work for the testcase in this bug, or that it would break some other testcase?
We should definitely add a test.
Flags: needinfo?(mak77)
Comment 14•9 years ago
|
||
This happens with GitHub's search engine, too. Must be a Rails-wide issue.
Updated•8 years ago
|
Assignee: gavin.sharp → nobody
Component: General → Search
Updated•8 years ago
|
Flags: needinfo?(mak77)
Comment 15•7 years ago
|
||
I have this problem using Firefox 55.02(64 bit) on Ubuntu.
Comment 16•7 years ago
|
||
Here’s a workaround for users with this issue.
Edit the search bookmark keyword, usually by right-clicking it and selecting “Properties”.
The Location field should look something like this:
https://example.com/search?utf8=%u2713&q=%s
The important part is the mangled “utf8=%u2713” parameter — the server expects “utf8=✓”, or it will refuse the search.
If you replace “utf8=%u2713” with “utf8=✓” in the location field, it should perform the search correctly. If that didn’t help, try removing the utf8 parameter entirely — which would produce something like this:
https://example.com/search?q=%s
Updated•5 years ago
|
Priority: -- → P2
Comment 18•4 years ago
|
||
Current link to the source code doing the translation: https://searchfox.org/mozilla-central/rev/927e525f481a93a8f63d27a78ae6201e42b1b1fb/browser/actors/ContextMenuChild.jsm#172-212
Severity: normal → S3
status-firefox36:
affected → ---
status-firefox37:
affected → ---
status-firefox38:
affected → ---
Updated•2 years ago
|
Priority: P2 → P3
Whiteboard: [snt-scrubbed][search-tech-debt]
Updated•2 years ago
|
See Also: → https://mozilla-hub.atlassian.net/browse/SNT-329
Comment 19•2 years ago
|
||
Since there was already a WIP patch for this bug, we want to continue that and verify what we're receiving in the get
and post
cases. So there's no further regressions.
Assignee | ||
Updated•1 years ago
|
Assignee: nobody → dharvey
Assignee | ||
Comment 20•1 years ago
|
||
Comment 21•1 years ago
|
||
Pushed by dharvey@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8ea86587da61
Use encodeURIComponent for keyword url encoding. r=mak
Comment 22•1 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 1 years ago
status-firefox116:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → 116 Branch
Reproducible on a 2023-06-21 Nightly build on Windows 10 using the STR from Comment 7.
Verified as fixed on Nightly 116.0a1(build ID: 20230626215144) on Windows 10, macOS 12, Ubuntu 22.
After submitting the search the results page is displayed properly.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•