Closed
Bug 483811
Opened 16 years ago
Closed 16 years ago
Need to unescape in KB articles' summaries in search results
Categories
(support.mozilla.org :: General, defect, P1)
Tracking
(Not tracked)
VERIFIED
DUPLICATE
of bug 485386
1.0.2
People
(Reporter: stephend, Assigned: paulc)
References
()
Details
Attachments
(1 file)
|
3.35 KB,
patch
|
laura
:
review+
|
Details | Diff | Splinter Review |
Steps to Reproduce:
1. Load http://support-stage.mozilla.org/tiki-newsearch.php?where=d&locale=en-US&q=bookmark&sa=
2. Look at the third entry, "How to change the Latest Headlines feed
The quot;Latest Headlines quot; item is a Live Bookmark that is included by default in Firefox. Instead of changing the feed, you can delete it, and add another live bookmark. Right-clickHold dow..."
Expected Results:
We should escape unescape quot;Latest Headlines quot; so that it appears as "Latest Headlines"
Comment 1•16 years ago
|
||
Eric, this is the most important "new search is broken" bug to fix at this point. Giving it your highest prio for 1.1.
Assignee: nobody → smirkingsisyphus
Priority: -- → P1
Target Milestone: 1.0 → 1.1
Comment 2•16 years ago
|
||
It's not that they need decoded, it's because they're malformed entities...or that seems to be the case. As far as I can tell, utf8entities() defined at /scripts/indexer.php:9 replaces all '&' to ' '. This makes all encoded html entities invalid.
Nelson can you clear up line 11 from the indexer? I get the rest (removing the first 31 ascii characters in the table), but why "&", ">", "<"? Were they causing problems with the indexer? Am I missing something simple?
Comment 3•16 years ago
|
||
The indexing process needs valid XML, which is why entities are being killed. Currently investigating a working solution.
Comment 4•16 years ago
|
||
So, we can 'fix' the indexer or add something into smarty to salvage malformed html entities.
I spent sometime over the weekend playing with the indexer, and that's my preferred approach. I'm just not sure I'll have time. Input?
Updated•16 years ago
|
Target Milestone: 1.1 → 1.0
| Assignee | ||
Comment 5•16 years ago
|
||
Eric: I can help out with this too if you can't get to it by tomorrow and want help :)
Updated•16 years ago
|
Assignee: smirkingsisyphus → paul.craciunoiu
| Assignee | ||
Comment 6•16 years ago
|
||
This fix comes in two flavors (or steps)!
Part 1. Fixed future indexing of encoding for '&<>' in XML.
Part 2. For this to be verifiable on staging, lines
173 of indexer.php and
151 of indexer-forums.php
(the lastModif conditional)
must be commented out, so that old (invalid) descriptions are replaced by new ones.
As long as production hasn't come in contact with indexing yet, there will be no need to do that there. Note that this only fixes the three HTML characters "&<>".
r23784 / r23785
However, while testing this locally, I noticed that certain characters don't encode well in utf-8. For example, on
http://support-stage.mozilla.org/tiki-newsearch.php?where=d&locale=en-US&q=menu+reference&sa=
There are a few odd characters after the word "Preferences". This turns out to go away if the lines for the MySQL character set are commented out in the indexer. However, then the XML encoding needs to be changed...
So we either account for those cases (who knows how many?), or ideally change the encoding to a superset of both Latin-1 and UTF-8, though I'm not sure there is one.
My experience in this is limited, so I'd like some help. I've played around lots though, but didn't find any solution. Ideas?
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 7•16 years ago
|
||
Reopening, since the bug isn't fixed. When looking at the URL you included above, I see a lot of weird things:
- The first result is linkified
- Still lots of weird characters in the results, as you point out
- Also filed bug 485326 for the "Table Of Contents" text appearing in the summaries
Please also ask for review. I know we're on a tight schedule, but seeing how easy it is to create massive regressions, I'd feel much more comfortable if someone reviews the patches before we check them in.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
| Assignee | ||
Comment 8•16 years ago
|
||
(In reply to comment #7)
> Reopening, since the bug isn't fixed. When looking at the URL you included
> above, I see a lot of weird things:
>
> - The first result is linkified
Could you define linkified?
> - Still lots of weird characters in the results, as you point out
Definitely. But this is not about escaping, as much as doing proper summaries and proper encoding. It's not in the scope of this bug.
> - Also filed bug 485326 for the "Table Of Contents" text appearing in the
> summaries
And neither is this.
>
> Please also ask for review. I know we're on a tight schedule, but seeing how
> easy it is to create massive regressions, I'd feel much more comfortable if
> someone reviews the patches before we check them in.
Okay, sorry about that. It'll attach the patch when I wake up, so that at least it will be here. Maybe Eric could look over it at least to double check.
As a side note, I think there's a lot more to be done for newsearch, even more than it seems.
Comment 9•16 years ago
|
||
(In reply to comment #8)
> (In reply to comment #7)
> > Reopening, since the bug isn't fixed. When looking at the URL you included
> > above, I see a lot of weird things:
> >
> > - The first result is linkified
> Could you define linkified?
Interesting, it's no longer linkified. I meant that the whole summary was part of the article link.
> > - Still lots of weird characters in the results, as you point out
> Definitely. But this is not about escaping, as much as doing proper summaries
> and proper encoding. It's not in the scope of this bug.
Is there a bug filed for that?
> As a side note, I think there's a lot more to be done for newsearch, even more
> than it seems.
Yeah; sadly I think you're right.
| Assignee | ||
Comment 10•16 years ago
|
||
(In reply to comment #9)
> Interesting, it's no longer linkified. I meant that the whole summary was part
> of the article link.
It still is linkified for me, all of them are. I thought this was part of the design... :) Should it be removed?
>
> > Definitely. But this is not about escaping, as much as doing proper summaries
> > and proper encoding. It's not in the scope of this bug.
>
> Is there a bug filed for that?
No. I'll file one right now.
| Assignee | ||
Comment 11•16 years ago
|
||
I've already committed this, but it's very likely that the fix won't be pushed. So there's still time for a review. In any case, it'll serve as a reference for reverting if need be.
Attachment #369547 -
Flags: review?(nelson)
Comment 12•16 years ago
|
||
(In reply to comment #10)
> (In reply to comment #9)
> > Interesting, it's no longer linkified. I meant that the whole summary was part
> > of the article link.
> It still is linkified for me, all of them are. I thought this was part of the
> design... :) Should it be removed?
Yes, it should, but when I reported the problem, the summary was also blue (as in the link color). And it was only for the first search result (don't ask me why..). Anyway, not a problem anymore.
| Assignee | ||
Comment 13•16 years ago
|
||
(In reply to comment #12)
So is this bug fixed? Or what should it focus on now?
Comment 14•16 years ago
|
||
Comment on attachment 369547 [details] [diff] [review]
Patch for XML encoding of "&<>"
This is fine, could have been done a bit less verbosely with str_replace but meh.
Attachment #369547 -
Flags: review?(nelson) → review+
| Assignee | ||
Comment 15•16 years ago
|
||
(In reply to comment #14)
> (From update of attachment 369547 [details] [diff] [review])
> This is fine, could have been done a bit less verbosely with str_replace but
> meh.
True, I just reused the same function that was there...
| Assignee | ||
Updated•16 years ago
|
Target Milestone: 1.0 → 1.1
| Assignee | ||
Updated•16 years ago
|
Target Milestone: 1.1 → 1.0.2
Comment 16•16 years ago
|
||
Should this be closed now?
| Assignee | ||
Comment 17•16 years ago
|
||
I think that bug 485386 is taking care of this.
| Reporter | ||
Updated•16 years ago
|
Comment 18•16 years ago
|
||
I'm marking this as a dupe of bug 485386, which is root of this bug and relegates the patch in this bug.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → DUPLICATE
You need to log in
before you can comment on or make changes to this bug.
Description
•