Closed Bug 761582 Opened 12 years ago Closed 12 years ago

[research] Add feature: Articles that link to this article

Categories

(support.mozilla.org :: Knowledge Base Software, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED
2012Q4

People

(Reporter: willkg, Assigned: mythmon)

References

Details

(Whiteboard: u=dev c=wiki p=1 s=2012.22)

+++ This bug was initially created as a clone of Bug #641503 +++

Please add the "Articles that link to this article" feature to the KB to make it easier for people to understand how articles are used (especially with templates)


See the "depends on" bug for what we're researching here. The goal is to come up with an idea of what work needs to be done and roughly how long it might take.

Putting this in the 2012.11 sprint and making it a 1 pointer.
Target Milestone: 2012.11 → 2012Q3
Priority: P4 → P2
Whiteboard: u=dev c=kb p=1 → u=dev c=wiki p=1 s=2012.22
Target Milestone: 2012Q3 → 2012Q4
Blocks: 641503
No longer depends on: 641503
Assignee: nobody → mcooper
Possible ways to learn which articles link to this article:

1. Search in `Document.html` for '/kb/{{ SLUG }}' (maybe inside an <a> tag?).

This would work for articles that link to each other. It would not show template usage, as mentioned in bug 648642. This should be pretty easy and wouldn't involve changing the index, but it might produce false positives and could have problems with locales.

2. Search in `Revision.content` for internal links.

This has more flexibility, allowing this to expand to template and image use (image use was mentioned in bug 641503), but is more complex because it requires parsing wiki syntax, or integrating into the current parse. Additionally, the wiki syntax content of revisions is not indexed in elasticsearch yet, so this would require either a) slow database queries, or b) modifying the index.
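
The two extraction approaches above could be sketched roughly like this in Python. This is only an illustration, not the actual kitsune parser: the `/kb/` prefix comes from option 1, but the optional locale prefix in URLs, the double-bracket `[[...]]` wiki syntax with `Template:` and `Image:` prefixes, and all function names are assumptions.

```python
import re
from html.parser import HTMLParser

# Assumed URL shape: optional locale prefix, then /kb/<slug>.
KB_HREF = re.compile(r"^(?:/[A-Za-z]{2,3}(?:-[A-Za-z]{2})?)?/kb/([^/?#]+)")


class KBLinkParser(HTMLParser):
    """Collect slugs of /kb/ links found in <a href="..."> attributes."""

    def __init__(self):
        super().__init__()
        self.slugs = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = KB_HREF.match(href)
        if match:
            self.slugs.add(match.group(1))


def links_from_html(html):
    """Option 1: slugs linked from rendered HTML (articles only)."""
    parser = KBLinkParser()
    parser.feed(html)
    return parser.slugs


# Assumed wiki syntax: [[Target]] or [[Target|label]], with an optional
# "Template:" or "Image:" prefix in front of the target.
WIKI_REF = re.compile(
    r"\[\[\s*(?:(Template|Image)\s*:\s*)?([^\]|]+?)\s*(?:\|[^\]]*)?\]\]"
)


def refs_from_wiki(content):
    """Option 2: article, template, and image references in wiki source."""
    refs = {"article": set(), "template": set(), "image": set()}
    for kind, target in WIKI_REF.findall(content):
        refs[(kind or "article").lower()].add(target)
    return refs
```

Restricting option 1 to `href` attributes of `<a>` tags avoids matching a bare '/kb/...' string that appears in body text, which is one of the false positives a plain substring search would produce.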

Once we know how to get this information, there are two ways we can access it:

1. Full text search (with caching)

For each document, run a search over `Document.html` or `Revision.content` each time the information is requested. This might be slower, but is still probably fast enough. In the case of `Document.html`, this would not require changing the index, but for `Revision.content` it would.

2. Store a list of links/template/images in a new field.

While indexing a document, or on save if this is in the DB, gather the list of links by looking through the HTML or wiki syntax, and then save this list. Later, this list can be queried with a simple contains operation. Since this would be a much simpler operation, it could be done in MySQL.

2a) store the list in MySQL as a new field (requires database migration)
2b) store the list in elasticsearch as a new field in the index (requires an index migration)
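
A rough sketch of what 2a might look like, using Python's built-in `sqlite3` as a stand-in for MySQL. Note that it swaps the single new field for a separate link table, which turns the "contains" check into an indexed equality lookup. The table, column, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE document_link (
        source_slug TEXT NOT NULL,  -- document that contains the reference
        kind        TEXT NOT NULL,  -- 'article', 'template', or 'image'
        target      TEXT NOT NULL,  -- what is being referenced
        PRIMARY KEY (source_slug, kind, target)
    )
    """
)
conn.execute("CREATE INDEX idx_target ON document_link (kind, target)")


def save_links(source_slug, refs):
    """Replace the stored references for one document (run on save)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "DELETE FROM document_link WHERE source_slug = ?", (source_slug,)
        )
        conn.executemany(
            "INSERT OR IGNORE INTO document_link VALUES (?, ?, ?)",
            [(source_slug, kind, target) for kind, target in refs],
        )


def what_links_here(kind, target):
    """Return slugs of documents that reference the given target."""
    rows = conn.execute(
        "SELECT source_slug FROM document_link "
        "WHERE kind = ? AND target = ? ORDER BY source_slug",
        (kind, target),
    )
    return [row[0] for row in rows]
```

Because `save_links` deletes and re-inserts all rows for a document, references removed in a later revision disappear from the lookup without any extra bookkeeping.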

Given this, I would recommend solution 1/1, i.e., searching `Document.html` in the elasticsearch index for '/kb/{{ SLUG }}' whenever we want this information, and caching the results. This is the smallest change and won't require complicated index or schema changes. However, it will only find articles, not templates and images, and could be slow. Under this plan (1/1) we would defer those features until later.
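
To make the 1/1 idea concrete, here is a minimal sketch of search-plus-cache in Python. The full-text query is replaced by an in-memory substring match so the example runs on its own; the real query would go to the search index. The cache lifetime, the `documents` mapping (slug to rendered HTML), and the function names are all assumptions.

```python
import time

CACHE_TTL = 60 * 60  # seconds; an arbitrary choice for this sketch
_cache = {}  # slug -> (expiry timestamp, list of linking slugs)


def search_html(documents, slug):
    """Stand-in for the full-text query: substring match on rendered HTML."""
    needle = "/kb/" + slug
    return sorted(
        other for other, html in documents.items()
        if other != slug and needle in html
    )


def linking_articles(documents, slug, now=None):
    """Return slugs of articles linking to `slug`, reusing cached results."""
    now = time.time() if now is None else now
    hit = _cache.get(slug)
    if hit and hit[0] > now:
        return hit[1]
    result = search_html(documents, slug)
    _cache[slug] = (now + CACHE_TTL, result)
    return result
```

The plain substring match also matches longer slugs that share a prefix (searching for '/kb/clear' finds links to '/kb/clear-cache'), which is one source of the false positives mentioned above. The cache means new links only show up after the entry expires.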
After discussing the use cases, we probably want option 2/2 (search `Revision.content`, and store the list of links, templates, and images in a new field).

The use cases are:
* What articles is a specific template used in
* What articles link to a specific article
* What articles include a particular image

We'll file a follow-up bug to come up with an implementation proposal.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Ricky, can you link to the follow up bugs to implement the reduced proposal?
(In reply to Kadir Topal [:atopal] from comment #3)
> Ricky, can you link to the follow up bugs to implement the reduced proposal?

bug 825621