Bug 503591: Mozbot badly parses extended characters in search results
Status: RESOLVED FIXED
Opened: 16 years ago
Closed: 16 years ago
Categories: Webtools Graveyard :: Mozbot, defect
Tracking: Not tracked
People: Reporter: Noah, Assigned: cww
Attachments (1 file): patch, 678 bytes (wolf: review+)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090626 Firefox/3.0.10
Build Identifier:
When searching with the Google module, the bots have encoding/parsing issues and return links such as:
http://answers.yahoo.com/question/index%3Fqid%3D20090320050357AA4z7c3
instead of:
http://answers.yahoo.com/question/index?qid=20090320050357AA4z7c3
causing them to fail when used.
Apostrophes are shown as ' in link titles:
Symantec kills 'broken' NAV script blocker: News - Security ... -- http://www.zdnet.com.au/news/security/soa/Symantec-kills-broken-NAV-script-blocker/0,130061744,139234250,00.htm
Reproducible: Always
Comment 1 • 16 years ago
I tested with firebot, which runs on XP, doing a search for "Symantec kills broken NAV script" and it rendered properly.
Is this related to the UTF-8/encoding issue in the infamous Bug 490052? I ask only because that bug shows a similar encoding issue on Ubuntu while firebot rendered correctly, though that wouldn't explain why this shows up on an XP install of mozbot.
Noah: Is this on XP or another OS?
Wolf, Cww: Any ideas?
Comment 2 • 16 years ago (Reporter)
Yes, XP. I retested the "Symantec kills broken NAV script" search and it does indeed work. I had originally seen this consistently around 7/14/2008 and afterward. Firebot did update its Google module earlier this year; that must have fixed it. That'll teach me to retest before posting like that again.
But the other issue still remains. For an example, try: google norton yahoo answers
Comment 3 • 16 years ago
After talking with Cww: this is not a duplicate of Bug 490052, and I am assigning it to Cww.
Also with firebot "google norton yahoo answer" returns "What is the difference between mcafee and norton? - Yahoo! Answers -- http://answers.yahoo.com/question/index%3Fqid%3D20081205115645AAupIjR" which appears to be the "buggy" behavior.
Assignee: nobody → cwwmozilla
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
OK, so what's going on is that Google seems to have changed its REST API to double-%-escape URLs (probably for things like IE compatibility). I'm going to try manually unescaping the URL first, but a more subtle approach will be needed if that ends up producing weird Unicode.
Attachment #388931 - Flags: review? → review?(bugtrap)
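For illustration only, the manual-unescape idea described above might look roughly like the following. This is a sketch in Python, not the actual patch to Google.bm (whose contents are not shown in this bug); the function name `unescape_result_url` and the `max_rounds` parameter are made up for the example. It percent-decodes the URL and repeats once more in case the URL was escaped twice.

```python
from urllib.parse import unquote


def unescape_result_url(url: str, max_rounds: int = 2) -> str:
    """Percent-decode a search-result URL (hypothetical helper).

    Decoding is repeated up to max_rounds times so that a URL escaped
    twice (e.g. %253F) is also brought back to its plain form. The loop
    stops early once decoding no longer changes the string.
    """
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    return url


# URL taken from the report in this bug.
before = "http://answers.yahoo.com/question/index%3Fqid%3D20081205115645AAupIjR"
print(unescape_result_url(before))
```

One caveat with this blunt approach: it also decodes escapes that were meant to stay escaped inside a query value (for example %26), which is one way the "weird" results mentioned above could arise.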
Updated • 16 years ago
QA Contact: mozbot → mozilla.bugs
Updated • 16 years ago
QA Contact: mozilla.bugs → mozbot
Comment 6 • 16 years ago
Comment on attachment 388931 (patch v1)
Before:
What is the difference between mcafee and norton? - Yahoo! Answers -- http://answers.yahoo.com/question/index%3Fqid%3D20081205115645AAupIjR
After:
What is the difference between mcafee and norton? - Yahoo! Answers -- http://answers.yahoo.com/question/index?qid=20081205115645AAupIjR
Looks ok to me. r+
Attachment #388931 - Flags: review?(bugtrap) → review+
Updated • 16 years ago
Keywords: checkin-needed
Comment 7 • 16 years ago
Checking in BotModules/Google.bm;
/cvsroot/mozilla/webtools/mozbot/BotModules/Google.bm,v <-- Google.bm
new revision: 1.5; previous revision: 1.4
done
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated • 16 years ago
Keywords: checkin-needed
Updated • 6 years ago
Product: Webtools → Webtools Graveyard