Bug 454967 - Redirects still not working
Status: RESOLVED INVALID
Keywords: access
Product: Mozilla Developer Network
Classification: Other
Component: General
Version: unspecified
Platform: All All
Importance: -- normal
Target Milestone: ---
Assigned To: Eric Shepherd [:sheppy]
Duplicates: 500287
Reported: 2008-09-12 02:35 PDT by Aaron Leventhal
Modified: 2013-11-11 18:27 PST


Description Aaron Leventhal 2008-09-12 02:35:20 PDT
Can we finally fix the redirects properly please please?

I keep discovering redirects that aren't working, probably because of punctuation:

For example:
http://developer.mozilla.org/En/ARIA:_Accessible_Rich_Internet_Applications/Relationship_to_HTML_FAQ#Who_supports_ARIA.3F
or
http://developer.mozilla.org/en/docs/AJAX%3aWAI_ARIA_Live_Regions/API_Support
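
To make the "punctuation" guess concrete: the second URL carries a percent-encoded colon (`%3a`), and the fragment of the first appears to use MediaWiki's legacy anchor encoding, where `.3F` stands for `?`. A minimal Python sketch using only the standard library; the helper name `decode_mediawiki_anchor` is made up for illustration, and it assumes every `.XX` hex pair in a fragment is an escaped byte, so it would mis-decode a heading that contains a literal `.3F`:

```python
import re
from urllib.parse import unquote

def decode_mediawiki_anchor(fragment: str) -> str:
    """Hypothetical helper: treat '.XX' hex pairs as '%XX' escapes, then decode."""
    return unquote(re.sub(r"\.([0-9A-Fa-f]{2})", r"%\1", fragment))

# The path of the second example URL and the fragment of the first.
path = unquote("/en/docs/AJAX%3aWAI_ARIA_Live_Regions/API_Support")
anchor = decode_mediawiki_anchor("Who_supports_ARIA.3F")

print(path)    # /en/docs/AJAX:WAI_ARIA_Live_Regions/API_Support
print(anchor)  # Who_supports_ARIA?
```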

It's a bit frustrating because there are a lot of good articles about Mozilla a11y and WAI-ARIA out on the web that link to our docs, and those links no longer reach our resources. This hurts both what readers get out of those articles and the page ranking of our docs.

I also keep getting reports from people about broken links, and have to deal with that.

I'm complaining :)
Comment 1 Eric Shepherd [:sheppy] 2008-09-12 16:15:18 PDT
Looking into this.
Comment 2 Matthew N. [:MattN] (behind on reviews) 2009-06-05 14:26:07 PDT
I can confirm there are still issues as I stumbled across them today from Google.  An example is:
https://developer.mozilla.org/en/docs/XBL:XBL_1.0_Reference:Anonymous_Content  
It redirects to:
https://developer.mozilla.org/XBL:XBL_1.0_Reference:Anonymous_Content/en
instead of:
https://developer.mozilla.org/En/XBL:XBL_1.0_Reference:Anonymous_Content
Comment 3 Eric Shepherd [:sheppy] 2009-06-18 08:30:50 PDT
In addition:

http://developer.mozilla.org/ja/docs/Bugzilla-jp:Guide

is redirecting incorrectly to:

https://developer.mozilla.org/Bugzilla-jp:Guide/ja

That should be going to:

https://developer.mozilla.org/ja/Bugzilla-jp:Guide
Comment 4 Eric Shepherd [:sheppy] 2009-06-30 12:19:52 PDT
We still need this fixed; a lot of links to the site that were created in our MediaWiki days are doing this bizarre redirect with the language code at the end instead of the middle for some reason.
Comment 5 Eric Shepherd [:sheppy] 2009-06-30 12:19:55 PDT
*** Bug 500287 has been marked as a duplicate of this bug. ***
Comment 6 Dave Miller [:justdave] (justdave@bugzilla.org) 2009-07-01 23:14:31 PDT
Here's the current block of rewrites that appears to deal with URLs of the type being redirected...  nothing's jumping out at me here, but maybe someone's got better eyes or isn't so tired :)

    ### Begin MDC rewrite rules ###
    RewriteCond %{REQUEST_URI} ^/(.*)$
    RewriteRule ^/([a-z]{2}|[a-z_]{5})/docs/(.*):(.*)$ /$2:$3/$1 [L,QSA,NE,NC,R]

    RewriteCond %{REQUEST_URI} ^/(.*)$
    RewriteRule ^/([a-z]{2}|[a-z_]{5})/docs/(.*)$ /$1/$2 [L,QSA,NE,NC,R]

    RewriteCond %{REQUEST_URI} ^/(.*)$
    RewriteRule ^/docs/(.*):(.*)$ /$1:$2/En [L,QSA,NE,NC,R]

    RewriteCond %{REQUEST_URI} ^/(.*)$
    RewriteRule ^/docs/(.*)$ /En/$1 [L,QSA,NE,NC,R]
    ### End MDC rewrite rules ###
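
For anyone who wants to poke at these patterns outside Apache, here is a rough Python model of the four rules above. It is only a sketch: `re.IGNORECASE` stands in for the NC flag, returning on the first match stands in for L, and the `rewrite` function and `RULES` table are names invented here. It models the pattern and substitution step only, not the rest of Apache's request handling.

```python
import re

# The four RewriteRule patterns from the block above, in order.
# Python's \1-style backreferences stand in for mod_rewrite's $1.
RULES = [
    (r"^/([a-z]{2}|[a-z_]{5})/docs/(.*):(.*)$", r"/\2:\3/\1"),
    (r"^/([a-z]{2}|[a-z_]{5})/docs/(.*)$", r"/\1/\2"),
    (r"^/docs/(.*):(.*)$", r"/\1:\2/En"),
    (r"^/docs/(.*)$", r"/En/\1"),
]

def rewrite(path, rules=RULES):
    """Return the target of the first matching rule, or None if none match."""
    for pattern, substitution in rules:
        match = re.match(pattern, path, re.IGNORECASE)
        if match:
            return match.expand(substitution)
    return None

print(rewrite("/en/docs/XBL:XBL_1.0_Reference:Anonymous_Content"))
# /XBL:XBL_1.0_Reference:Anonymous_Content/en
print(rewrite("/ja/docs/Bugzilla-jp:Guide"))
# /Bugzilla-jp:Guide/ja
```

Under those assumptions the model reproduces the redirects reported in comments 2 and 3: any path with a colon is caught by the first rule, whose substitution puts the locale last.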
Comment 7 Matthew N. [:MattN] (behind on reviews) 2009-07-02 00:52:12 PDT
(In reply to comment #6)
>     ### Begin MDC rewrite rules ###
>     RewriteCond %{REQUEST_URI} ^/(.*)$
>     RewriteRule ^/([a-z]{2}|[a-z_]{5})/docs/(.*):(.*)$ /$2:$3/$1 [L,QSA,NE,NC,R]

This seems incorrect as $1 refers to the locale here and it is being rewritten so it is at the end.  I believe this should be:

RewriteRule ^/([a-z]{2}|[a-z_]{5})/docs/(.*):(.*)$ /$1/$2:$3 [L,QSA,NE,NC,R]


>     RewriteCond %{REQUEST_URI} ^/(.*)$
>     RewriteRule ^/([a-z]{2}|[a-z_]{5})/docs/(.*)$ /$1/$2 [L,QSA,NE,NC,R]
> 
>     RewriteCond %{REQUEST_URI} ^/(.*)$
>     RewriteRule ^/docs/(.*):(.*)$ /$1:$2/En [L,QSA,NE,NC,R]

This also seems incorrect for the same reason.  It should be:

RewriteRule ^/docs/(.*):(.*)$ /En/$1:$2 [L,QSA,NE,NC,R]

> 
>     RewriteCond %{REQUEST_URI} ^/(.*)$
>     RewriteRule ^/docs/(.*)$ /En/$1 [L,QSA,NE,NC,R]
>     ### End MDC rewrite rules ###


Do we even need the first and third RewriteRules here?  It seems like the second and fourth rules would be enough.  Why are we treating pages with a colon in a special manner? Maybe there is some other URL format to redirect that I am missing?
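
One way to sanity-check both the proposed fix and the question about the first and third rules is to compare the corrected colon rule with the existing generic rule on the failing URLs from this bug. The Python below is a sketch under stated assumptions: `re.IGNORECASE` stands in for NC, the function names are invented here, and it only models the regex substitution. It says nothing about what the wiki software itself expects of namespaced pages.

```python
import re

LOCALE = r"([a-z]{2}|[a-z_]{5})"

def apply(pattern, substitution, path):
    """Apply one rule; return the rewritten path, or None if it does not match."""
    match = re.match(pattern, path, re.IGNORECASE)
    return match.expand(substitution) if match else None

def corrected_colon_rule(path):
    # First rule with the locale moved to the front, as proposed above.
    return apply(rf"^/{LOCALE}/docs/(.*):(.*)$", r"/\1/\2:\3", path)

def generic_rule(path):
    # Second rule, unchanged from the current configuration.
    return apply(rf"^/{LOCALE}/docs/(.*)$", r"/\1/\2", path)

for path in (
    "/en/docs/XBL:XBL_1.0_Reference:Anonymous_Content",
    "/ja/docs/Bugzilla-jp:Guide",
):
    print(path, "->", corrected_colon_rule(path), "|", generic_rule(path))
```

For these two inputs both rules yield the same target, which is consistent with the idea that, at the regex level, the colon-specific rule adds nothing once the locale is moved to the front.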
Comment 8 j.j. 2009-07-02 03:31:47 PDT
(In reply to comment #7)
> RewriteRule ^/docs/(.*):(.*)$ /En/$1:$2 [L,QSA,NE,NC,R]

Lowercase /en/ should be the canonical case, see also bug 492148.
Comment 9 Jeremy Orem [:oremj] 2009-07-07 10:39:12 PDT
Eric, do you remember why we have special rules for colons?
Comment 10 Eric Shepherd [:sheppy] 2009-07-07 10:58:30 PDT
No, but I seem to recall they were needed...
Comment 11 Jeremy Orem [:oremj] 2009-07-08 11:12:56 PDT
Maybe the mindtouch people remember why they are needed. Will you ping them?
Comment 12 Eric Shepherd [:sheppy] 2009-07-08 11:16:40 PDT
I've sent them email; will comment again when I hear back.
Comment 13 roy kim 2009-07-08 12:41:20 PDT
Hey all - popping in at Eric's request. The original bug which tracked why these colons were necessary can be found here: http://bugs.developer.mindtouch.com/view.php?id=3259

The short version is that namespace prefixes themselves can be localized, and given the number of languages that MDC is available in, we opted to use a catch-all instead of hardcoding the actual localized versions into the rewrite rules.
Comment 14 Jeremy Orem [:oremj] 2009-07-08 13:17:22 PDT
So what's the solution to this bug?
Comment 15 roy kim 2009-07-08 13:59:10 PDT
The short answer: special cases in the mdc_redirect.php pre-processing hook. 

To provide more background on the issue for people tracking it; when we originally ported MDC to MindTouch, maintaining permalinks was a top priority. There were a bunch of issues we had to address:

 * MediaWiki supported localized namespace prefixes - we do not
 * The original implementation of polyglotism in MediaWiki did not match up to MindTouch's - we treat namespaced pages as the absolute root level, not the language level
 * Deprecation of the /docs/ folder (but with permalinks maintained)

There's already a significant amount of business logic in mapping those links, so we split up the redirect rules between Apache and MindTouch itself - MDC actually has a special PHP pre-processing hook that runs before the page is rendered to find its correct location. 

We had a list of use cases we had to match that we tested extensively against before launching (see the previously linked bug). 

In order to prevent endless redirects, we explicitly set the language code to the last part - this prevents our pre-processing hook from being executed for "valid" URL entry points and reduces the risk of breaking future pages, but the downside is that for edge cases on old links, it won't work. 

So the solution here is to look at the edge cases that are failing, and add those special cases into mdc_redirects.php. I can take a stab at this after work and submit a new patch to Eric. I'll try to catch the use cases in the bug as filed. 

Regarding the very first link: it's an external redirect, and MindTouch does not automatically redirect for external redirects. (I'm not sure if MediaWiki ever did?)
Comment 16 Jeremy Orem [:oremj] 2009-07-15 14:03:49 PDT
Let me know when a patch is ready.
Comment 17 Eric Shepherd [:sheppy] 2009-07-30 13:54:11 PDT
Still waiting on this.

Also, any idea why Google still lists out of date URLs in their search results in the first place? Do we have a robots.txt file that's preventing it from updating its search results? These old links should not be showing up in Google searches in the first place.
Comment 18 j.j. 2009-07-30 16:21:07 PDT
(In reply to comment #17)
> Still waiting on this.
> 
> Also, any idea why Google still lists out of date URLs in their search results
> in the first place? 

Because a lot of highly ranked pages link to those URLs and the server doesn't send a 404 error.

> Do we have a robots.txt file that's preventing it from
> updating its search results? 

No, please! 

> These old links should not be showing up in Google
> searches in the first place.

The only sane solution is a 301 REDIRECT, as I said in bug 500287.
 - Top ranked blog entries (also e.g. Google Doctype) link to us, 
   users shouldn't get lost and confused. 
   You shouldn't expect bloggers would update links in old blog entries. 
 - 301 REDIRECTs immediately throw out old URLs from Google's index,
   transferring the old page rank to the new URLs.
Comment 19 Brian Crowder 2009-07-30 16:29:12 PDT
Yeah, we should definitely be sending 301s for these ancient paths...  what are we doing instead?
Comment 20 Eric Shepherd [:sheppy] 2009-07-30 17:22:59 PDT
Well, right now, they're being converted into incorrect URLs that result in going to nonexistent pages in the wiki. This is what needs to be fixed.
Comment 21 Brian Crowder 2009-07-30 17:34:08 PDT
But if we're really sending 301s, then why doesn't google repair the links to point to the "incorrect URLs" we're converting to?
Comment 22 roy kim 2009-07-30 17:44:31 PDT
The 301s are not being sent yet. I am still putting together a new plug-in to do this.
Comment 23 j.j. 2009-07-30 19:16:11 PDT
(In reply to comment #19)
> Yeah, we should definitely be sending 301s for these ancient paths...  what 
> are we doing instead?

See these examples; both should have a permanent redirect to
 https://developer.mozilla.org/en/CSS/z-index

 https://developer.mozilla.org/en/CSS:z-index 
  (will stay forever in Google's index; the canonical page is excluded
   due to "duplicate content")

 https://developer.mozilla.org/en/docs/CSS:z-index
  obviously a more serious bug; redirects to
   https://developer.mozilla.org/CSS:z-index/en
Comment 24 Thomas Bertels 2010-01-28 04:21:16 PST
(In reply to comment #22)
> The 301s are not being sent yet. I am still putting together a new plug-in to
> do this.

R=301 should do the trick.
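
For context, mod_rewrite's bare `R` flag issues a 302 (temporary) redirect by default, and `R=301` makes it permanent; the rules quoted in comment 6 all use bare `R`. A toy Python sketch of that mapping, where `redirect_status` is a made-up helper that reads only numeric `R=` values and not Apache's symbolic names such as `R=permanent`:

```python
def redirect_status(flags: str):
    """Hypothetical helper: status code implied by a flag list like '[L,QSA,NE,NC,R]'.

    Returns None when the list has no R flag (no external redirect).
    """
    for flag in flags.strip("[]").split(","):
        name, _, value = flag.strip().partition("=")
        if name in ("R", "redirect"):
            # Bare R means 302; only numeric codes are handled in this sketch.
            return int(value) if value else 302
    return None

print(redirect_status("[L,QSA,NE,NC,R]"))      # 302
print(redirect_status("[L,QSA,NE,NC,R=301]"))  # 301
```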
Comment 25 Eric Shepherd [:sheppy] 2012-12-04 08:19:56 PST
Old Deki bug.
