Closed
Bug 929373
Opened 11 years ago
Closed 11 years ago
By default, don't display User: pages in results
Categories
(developer.mozilla.org Graveyard :: Search, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: teoli, Assigned: groovecoder)
Details
Attachments
(1 file)
258.68 KB,
image/png
Search for 'test' (admittedly not a realistic query, but it shows the problem).
First result is a User page.
Except in some specific cases, the User pages are not what the user is looking for.
By default, don't display them. Maybe add a facet to include them in the Topic list.
Comment 1•11 years ago
I fully agree with Jean-Yves. Excluding some document types (such as user talk pages) from MDN search engine results would be a great improvement... What about allowing users to check a few search options in their user preferences? Thanks in advance for your attention!
Assignee
Updated•11 years ago
Assignee: nobody → lcrouch
Assignee
Comment 2•11 years ago
Still working on this, but the issue with both this bug and bug 928302 is that the post_save index task indexes these documents regardless of the exclude filter in Document.get_indexable().
Comment 3•11 years ago
Thanks in advance for any improvement to search results, Luke! What about providing an "advanced search" that would allow easy filtering using predefined tags (such as "user: pages") in document titles? And allowing users to save this filter in their preferences?
Assignee
Comment 4•11 years ago
We have a bunch of filtering features planned for MDN search on bug 915760 and some more in the general search component of bugzilla.
Comment 5•11 years ago
Commits pushed to master at https://github.com/mozilla/kuma
https://github.com/mozilla/kuma/commit/d8ce6fc87c3cd82c3cfa8a0173e22fecee989433
fix bug 928302, 929373, 931412 - Check if the search index entry of a wiki document should really be updated during saving or deleting.
This fixes the problem of accidentally indexing wiki documents which
aren't supposed to be indexed when saving or deleting them.
https://github.com/mozilla/kuma/commit/334a8cee56d8947fab213ce2a02424f7c346fb1d
Merge pull request #1631 from jezdez/improved-indexables-928302
fix bug 928302, 929373, 931412 - Check if the search index entry of a wiki document should really be updated during saving or deleting.
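The idea described in this commit message (check, at save time, whether a document should be in the index at all) can be sketched roughly as follows. This is a minimal, standalone illustration, not kuma's actual code: the `Document` class, the `on_save` hook, the dict used as an index, and the `EXCLUDED_MARKERS` rule are all hypothetical stand-ins.

```python
from dataclasses import dataclass

# Hypothetical exclusion rule, for illustration only.
EXCLUDED_MARKERS = ("User:",)


@dataclass
class Document:
    id: int
    slug: str

    def should_update(self) -> bool:
        """Return True if this document belongs in the search index."""
        return not any(marker in self.slug for marker in EXCLUDED_MARKERS)


def on_save(document: Document, index: dict) -> None:
    """Hypothetical post-save hook: index only documents that pass the check."""
    if document.should_update():
        index[document.id] = document.slug
    else:
        # Drop any entry left over from before the document was excluded.
        index.pop(document.id, None)


index = {}
on_save(Document(1, "Web/JavaScript/Reference/Global_Objects/RegExp"), index)
on_save(Document(2, "User:Potappo/Core_JavaScript_1.5_Reference"), index)
print(index)  # only document 1 is indexed
```

Without the `should_update()` check in the hook, every saved document would be written to the index regardless of any filter applied elsewhere, which is the problem comment 2 describes.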
Updated•11 years ago
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 6•11 years ago
I just investigated why bug 952532 was filed. So, I searched for "Regexp" in our search.
The second result is:
/docs/User:Potappo/Core_JavaScript_1.5_Reference/Global_Objects/RegExp_(members)
So, "User:" pages aren't excluded from results. See also, the original "test" search JYP provided.
See https://github.com/mozilla/kuma/blob/master/apps/search/models.py#L229
We need to exclude:
.exclude(slug__icontains='Talk:')
.exclude(slug__icontains='User:')
.exclude(slug__icontains='User_talk:')
At least in get_indexable() and in should_update() there.
I would write a PR for this, but I don't know how to test it properly.
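As a rough aid for reasoning about (and testing) the proposed filters, the substring semantics of `slug__icontains` can be imitated in plain Python without a database. This is only a sketch: the `icontains` and `is_excluded` helpers are hypothetical, and the last two slugs are made-up examples.

```python
def icontains(slug: str, needle: str) -> bool:
    """Imitate a case-insensitive substring lookup such as Django's __icontains."""
    return needle.lower() in slug.lower()


# The three needles proposed in the comment above.
NEEDLES = ("Talk:", "User:", "User_talk:")


def is_excluded(slug: str) -> bool:
    """Return True if any needle occurs anywhere in the slug."""
    return any(icontains(slug, needle) for needle in NEEDLES)


slugs = [
    "User:Potappo/Core_JavaScript_1.5_Reference/Global_Objects/RegExp_(members)",
    "Web/JavaScript/Reference/Global_Objects/RegExp",
    "User_talk:Example",           # made-up slug
    "Web/Guide/Power_user:_tips",  # made-up slug with "user:" in the middle
]
for slug in slugs:
    print(slug, is_excluded(slug))
```

Two things stand out under these assumptions: 'Talk:' alone already matches 'User_talk:' slugs, and a substring match also catches slugs that merely contain "user:" somewhere in the middle.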
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 7•11 years ago
Jannis, any chance you can have a look at my previous comment? Otherwise I could also just open a PR and ask for reviewing this. It's just that you know the search stuff way better than me :-)
Flags: needinfo?(jezdez)
Comment 8•11 years ago
:fscholz You're right, this needs an update (and a reindex afterwards). To be honest, I'm not sure why "User:" pages were not excluded. Can you make sure the list of page prefixes we'd exclude is correct this time? (e.g. what is the difference between "Talk:" and "User_talk:"?) I can work up a patch rather quickly once that list is definitive.
Flags: needinfo?(jezdez)
Comment 10•11 years ago
(In reply to Jannis Leidel [:jezdez] from comment #8)
> Can you make sure the list of page prefixes we'd exclude is correct this time? (e.g. what is
> the difference between "Talk:" and "User_talk:"?)
These are artifacts from back when we used MediaWiki as our wiki engine:
http://meta.wikimedia.org/wiki/Help:Namespace#List_of_namespaces
So, as far as I can tell, we used the following namespaces from that list:
.exclude(slug__icontains='Talk:')
.exclude(slug__icontains='User:')
.exclude(slug__icontains='User_talk:')
.exclude(slug__icontains='Template_talk:')
.exclude(slug__icontains='Project_talk:')
This should at least catch cruft from the MediaWiki days. There might be more that is unrelated to MediaWiki, but the writing team has only just started to examine what we have on MDN overall, so for now this should be the complete list. Once we have the delete feature, more content in the normal namespace will go, and we want to move some old content to an "Archive/" zone, which should probably be excluded from search, too. Work on this has not started yet, though; it's something we will look into in 2014.
I think we are good to go with excluding MediaWiki related pages for now and might have more excludes later in the year, if that's okay for you.
Comment 11•11 years ago
Comment 12•11 years ago
Commits pushed to master at https://github.com/mozilla/kuma
https://github.com/mozilla/kuma/commit/d77fb4ffac52df31df820d32aac7945402f2aecb
Fix bug 929373 - Stop indexing even more documents.
We now exclude documents whose slug starts with any of the following strings: Talk:, User:, User_talk:, Template_talk:, Project_talk:
https://github.com/mozilla/kuma/commit/e4b8b1fab7099f0c11807e261fcde14c111730b2
Merge pull request #1923 from jezdez/search-exclude-more
Fix bug 929373 - Stop indexing even more documents.
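A prefix-based exclusion like the one this commit message describes can be sketched in plain Python as below. The prefix list is taken from the commit message; the `is_indexable` helper and the case-sensitive matching are assumptions made for illustration, not a copy of the actual kuma change.

```python
# Prefixes listed in the commit message above.
EXCLUDED_PREFIXES = (
    "Talk:",
    "User:",
    "User_talk:",
    "Template_talk:",
    "Project_talk:",
)


def is_indexable(slug: str) -> bool:
    """Return False for slugs that start with an excluded prefix.

    str.startswith accepts a tuple and is case-sensitive; whether the
    real filter is case-sensitive is not stated in this bug.
    """
    return not slug.startswith(EXCLUDED_PREFIXES)


print(is_indexable("User:Potappo/Core_JavaScript_1.5_Reference"))      # excluded
print(is_indexable("Web/JavaScript/Reference/Global_Objects/RegExp"))  # kept
print(is_indexable("Web/Guide/Power_user:_tips"))                      # kept (made-up slug)
```

Compared with a substring match, a prefix match leaves alone any slug that only contains one of these strings further along its path.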
Updated•11 years ago
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 13•11 years ago
FWIW, this is "fixed" in that the search indexer will skip these pages in the future. But the index still contains the entries for these pages that were added before the fix, and a full-site reindex will be needed to get rid of them.
However, a full-site reindex is currently very disruptive in that it basically makes search useless until it's done. See this github issue for more discussion:
https://github.com/mozilla/kuma/issues/1930
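To illustrate why skipping these pages going forward does not clean up what is already indexed, here is a small standalone sketch. The dicts standing in for the documents and the index, the ids, the `index_on_save` helper, and the 'User_talk:Example' slug are all hypothetical; the prefix list is the one from the commit message above.

```python
EXCLUDED_PREFIXES = ("Talk:", "User:", "User_talk:", "Template_talk:", "Project_talk:")


def is_indexable(slug: str) -> bool:
    """Return False for slugs that start with an excluded prefix."""
    return not slug.startswith(EXCLUDED_PREFIXES)


# Source of truth: all wiki documents (hypothetical ids and slugs).
documents = {
    1: "Web/JavaScript/Reference/Global_Objects/RegExp",
    2: "User:Potappo/Core_JavaScript_1.5_Reference/Global_Objects/RegExp_(members)",
    3: "User_talk:Example",
}

# The index as it looked before the fix: document 2 was indexed under the old rules.
index = {1: documents[1], 2: documents[2]}


def index_on_save(doc_id: int, index: dict) -> None:
    """Hypothetical post-fix save hook: only indexable documents are written."""
    slug = documents[doc_id]
    if is_indexable(slug):
        index[doc_id] = slug


index_on_save(3, index)  # skipped by the new check

# Entries written before the fix are still in the index.
stale = sorted(doc_id for doc_id, slug in index.items() if not is_indexable(slug))
print(stale)

# A full rebuild from the documents applies the filter to everything.
rebuilt = {doc_id: slug for doc_id, slug in documents.items() if is_indexable(slug)}
print(sorted(rebuilt))
```

The stale entry for document 2 only disappears once the index is rebuilt from the documents with the new filter applied.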
Updated•5 years ago
Product: developer.mozilla.org → developer.mozilla.org Graveyard