Need to do something about proliferation of redirects

NEW
Unassigned

Status

6 years ago
4 years ago

People

(Reporter: sheppy, Unassigned)

Tracking

(Depends on: 1 bug)

Details

(Whiteboard: [specification][type:feature])

(Reporter)

Description

6 years ago
What problems would this solve?
===============================
As we move content around, we're accumulating redirects. Sometimes redirect->redirect->redirect->real page, even. This has to stop! We need some mechanism that hunts down and gets rid of these internal redirects when possible; this will improve user experience as well as site performance.

Who would use this?
===================
Nobody, directly, but it would benefit everyone.

What would users see?
=====================
In theory, nothing, except a lack of "Redirected from Foobar redirect 734."

What would users do? What would happen as a result?
===================================================
In theory, nothing directly.

Is there anything else we should know?
======================================
I propose that we have a periodic job with low priority that runs through all pages and looks to see if they have any internal links that point at redirects, rewriting those links and updating the article.

There are many ways this could be sped up, such as perhaps by keeping a list of pages that have been moved since the last time that process was run and checking against that list instead of having to actually go and look up each link.
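
As a rough illustration of the "list of moved pages" idea, here is a minimal sketch in Python. It assumes the job is handed a plain dict mapping old paths to new paths; the `MOVED` dict, the `rewrite_links` function, and the example paths are all hypothetical and are not Kuma's actual data model or API. It uses a simple regex rather than a real HTML parser, so treat it as a sketch only.

```python
import re

# Hypothetical map of old path -> current path, e.g. collected from the
# pages moved since the last run. Not Kuma's real data model.
MOVED = {
    "/en-US/docs/Old_page": "/en-US/docs/New_page",
}

# Split each href into the path and any trailing "#fragment" or "?query".
HREF_RE = re.compile(r'href="([^"#?]*)([^"]*)"')


def rewrite_links(html, moved):
    """Return (new_html, count), rewriting hrefs whose path was moved."""
    count = 0

    def replace(match):
        nonlocal count
        path, suffix = match.group(1), match.group(2)
        target = moved.get(path)
        if target is None:
            # Link does not point at a moved page; leave it untouched.
            return match.group(0)
        count += 1
        return 'href="%s%s"' % (target, suffix)

    return HREF_RE.sub(replace, html), count
```

A periodic job could call `rewrite_links` on each page and save a new revision only when the returned count is non-zero, so untouched pages cost a dictionary lookup per link rather than a fetch per link.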

But something needs to be done. This is only getting worse, and right now it's getting bad especially fast as we do our content reorg.

Comment 1

So we can't remove redirects but we may be able to hit the first redirect, then find the last endpoint in the chain and thus skip over a few of the redirects.

We should, IMO, be more aware of how often we're moving things and avoid doing so if not necessary.  I know that what we've done recently is necessary though.

Comment 2

This bug really is two bugs identified and filed long ago:
- On one side, fixing the existing URLs that have not been updated: that is bug 783535.
- On the other side, fixing the move function so that it updates the links in other pages when we move things (we will continue to move pages every week): that is bug 820912.

We have to fix both things.
(Reporter)

Comment 3

6 years ago
David:

Yeah, we want to avoid redirect creation when possible, but it's gonna happen and it's gonna happen a lot, even if we try to avoid it (we never do it for fun). :)

And yeah, we have to leave the redirect pages around, but we can update links within MDN to stop pointing at redirects.

Comment 4

Just seeing this bug for the first time today...

I don't think we should get rid of the redirects, or remove the redirect-on-move feature. We have put a lot of work into Kuma to try to route visitors to the right page, whether it was a legacy MindTouch URL or a Kuma page that's been moved several times.

We have bug 836529 to deal with the redirect URL params and the "redirected from" message. That might help some of the issue.

We can also fix internal links (bug 820912). But links in the wild, from blog posts and so on, are off limits. We want users coming in from those points to end up in the right place. And if Google sees a 404 for any of those links, it drops those pages from the index.

It would also help to revise all the redirects as a part of bug 820912, so that all point at the current home for the page and no longer form chains. Then, it would also be possible to manually weed out superfluous redirects.

Another idea is to build something that watches page views, and if a redirect hasn't been used in a certain period of time (eg. months? years?), it could expire and self-destruct. That might be more trouble than it's worth, though, since I don't think we're set up to collect that kind of per-page metric on every view in a way that's useful to Kuma code. (eg. Kuma can't query Google Analytics, I don't think)

Given all the above, I think bug 820912 and bug 836529 are the best next steps. Adding those as blockers, turning this into a tracking bug.
Depends on: 820912, 836529
(Reporter)

Comment 5

6 years ago
No, we absolutely don't want to remove redirects. But we do want to find ways to prevent redirects when possible.

I agree that trying to age old redirects and destroy them if they become idle for a long time is more trouble than it's worth. I agree that bug 820912 and bug 836529 are the ones to fix here. I'd also like to see us have a service that goes through and periodically looks for in-site links to redirects and updates them, so that we don't rely on redirects within the site.

This will improve site performance as a whole, and will make it less likely that redirects will be promulgated elsewhere.

Comment 6

Agree that bug 820912 and bug 836529 should be fixed here. But what about external sites that link to a page in the middle of a redirect chain?

What if an external site links to page B in the following structure?

A -> B -> C -> D -> E (where -> means "redirects to")

Do we still want to solve that problem by short-cutting the redirects (A -> E, B -> E, etc.) as David mentions in comment 1? Or do we intend for bug 820912 to solve that problem by updating the addresses of the internal redirects themselves?
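
To make the short-cutting option concrete, here is a minimal sketch in Python of collapsing a chain so that every redirect points straight at the final page. It assumes redirects are available as a plain dict of source to target; the function names `resolve` and `flatten` are hypothetical and do not correspond to anything in Kuma.

```python
def resolve(redirects, start):
    """Follow redirects from start and return the final, non-redirect target.

    Raises ValueError if the chain loops back on itself.
    """
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            raise ValueError("redirect loop at %r" % current)
        seen.add(current)
    return current


def flatten(redirects):
    """Return a new mapping in which every source points at its endpoint."""
    return {source: resolve(redirects, source) for source in redirects}


# The chain from the example: A -> B -> C -> D -> E
chain = {"A": "B", "B": "C", "C": "D", "D": "E"}
flat = flatten(chain)  # every key now maps to "E"
```

With the flattened mapping, an external link to B is answered with a single redirect to E, while the redirect pages themselves all remain in place.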
(Reporter)

Comment 7

6 years ago
(In reply to John Karahalis [:openjck] from comment #6)

> Do we still want to solve that problem by short-cutting the redirects (A ->
> E, B -> E, etc.) as David mentions in comment 1? Or do we intend for bug
> 820912 to solve that problem by updating the addresses of the internal
> redirects themselves?

Yeah, I think we want to solve this by updating the internal redirects and allow off-site links to redirects to naturally fade away over time. We want the redirects themselves to remain, but we want to remove reliance upon them within our own content.

Comment 8

(In reply to John Karahalis [:openjck] from comment #6)

> Do we still want to solve that problem by short-cutting the redirects (A ->
> E, B -> E, etc.) as David mentions in comment 1? Or do we intend for bug
> 820912 to solve that problem by updating the addresses of the internal
> redirects themselves?

Solving *both* with bug 820912 is what I meant. We would update the links within documents *and* in redirects.
See Also: → bug 906085