Bug 840092 - Translate a doc (sometimes) prepends 'en' to the slug
Status: VERIFIED FIXED
[specification][type:bug][specificati...
Product: Mozilla Developer Network
Classification: Other
Component: Localization
: unspecified
: All All
: P1 blocker
: ---
Assigned To: Nobody; OK to take it and work on it
: website
Mentors:
Duplicates: 819829 836792
Depends on:
Blocks: 821694 841088
Reported: 2013-02-11 08:25 PST by Eric Shepherd [:sheppy]
Modified: 2013-11-11 18:28 PST
11 users (show)
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Description Eric Shepherd [:sheppy] 2013-02-11 08:25:27 PST
Somehow we wound up with a translation of the IndexedDB page here:

https://developer.mozilla.org/fr/docs/en/IndexedDB

But there was also a page here:

https://developer.mozilla.org/fr/docs/IndexedDB

We didn't know about the latter, so we renamed (in the Django UI, because we couldn't stop the redirect from the latter to the former) the first one to match the second.

But now it looks like that was the wrong thing to do, as content is missing. We're very confused (as you can tell by this weirdly phrased bug report). Please help us!
Comment 1 John Karahalis [:openjck] 2013-02-11 08:46:54 PST
Just re-writing this in a format that should be a little easier for us to skim. Sheppy, please let me know if any of this is incorrect.

What did you do?
================
1. Load https://developer.mozilla.org/fr/docs/en/IndexedDB
2. Load https://developer.mozilla.org/fr/docs/IndexedDB

What happened?
==============
The first page redirects to the English version of the page. The second page loads the English version of the document.

What should have happened?
==========================
The first page should not exist (404). The second page should load the French version of the document, which existed previously.

Is there anything else we should know?
======================================
Comment 2 Frédéric Bourgeon [:FredB] 2013-02-11 09:17:19 PST
General note: that page (fr/docs/en/IndexedDB) should never have been created. But it was...

Moins52 created the fr/docs/IndexedDB page, then for some reason saved the same page under a different slug (namely en/IndexedDB), causing the (kuma) redirect one can see here: https://developer.mozilla.org/fr/profiles/moins52 (the page is called Redirect 1).

However, due to the legacy URL structure, when "en/" is contained in a URL, it is detected and the URL is rewritten to the equivalent en-US page. This happens before the kuma redirect. This is why Sheppy had to go to the Django UI to rename the page.
But the target page of the renaming already existed, so we lost track of the page that was created second (and of its most recent edit).

As a result, we have an edit that we can't track down now (it doesn't appear in the fr/docs/IndexedDB history), and an accessible page that is not the most recent update. Both share the same slug.
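
To make the ordering above concrete, here is a rough sketch of why the page-level redirect was never reached. It is only an illustration: the regex, the PAGE_REDIRECTS table, and the resolve() function are hypothetical stand-ins, not Kuma's actual URL handling.

```python
import re

# Hypothetical pattern for the legacy "en/" path segment described above.
LEGACY_EN = re.compile(r"^/(?P<locale>[^/]+)/docs/en/(?P<rest>.+)$")

# Hypothetical table of page-level (kuma) redirects: source path -> target path.
PAGE_REDIRECTS = {"/fr/docs/en/IndexedDB": "/fr/docs/IndexedDB"}


def resolve(path):
    """Apply the legacy rewrite first, then fall back to page-level redirects."""
    match = LEGACY_EN.match(path)
    if match:
        # The legacy rule wins, so the page-level redirect below is never consulted.
        return "/en-US/docs/" + match.group("rest")
    return PAGE_REDIRECTS.get(path, path)


print(resolve("/fr/docs/en/IndexedDB"))  # /en-US/docs/IndexedDB
print(resolve("/fr/docs/IndexedDB"))     # /fr/docs/IndexedDB
```

In this model the first URL always lands on the en-US page, which matches what was observed: the only way to reach the document behind it was through the Django UI.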
Comment 3 John Karahalis [:openjck] 2013-02-12 08:51:10 PST
*** Bug 836792 has been marked as a duplicate of this bug. ***
Comment 4 John Karahalis [:openjck] 2013-02-12 08:52:12 PST
Same thing happening in bug 836792. Luke, do we expect that fixing this bug will fix these individual problems, or will we need to correct those slugs manually?
Comment 5 Luke Crouch [:groovecoder] 2013-02-12 13:22:23 PST
We will likely have to clean and fix the slugs manually. Not sure if/how we would write a db migration to do it. :/
Comment 6 Eric Shepherd [:sheppy] 2013-02-13 07:06:20 PST
This needs to be our #1 top priority; things are getting out of hand fast. It's getting rapidly worse.
Comment 8 John Karahalis [:openjck] 2013-02-13 09:13:12 PST
(In reply to Luke Crouch [:groovecoder] from comment #5)
> We will likely have to clean and fix the slugs manually. Not sure if/how we
> would write a db migration to do it. :/

Can you please open a bug for fixing these slugs manually? I think you could describe the actual/expected results better than I could.

https://bugzilla.mozilla.org/form.mdn#h=detail|bug
Comment 9 Luke Crouch [:groovecoder] 2013-02-13 11:16:09 PST
wiki/test_views test to reproduce the issue:

https://gist.github.com/groovecoder/4947142

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=841088
Comment 10 Kohei Yoshino [:kohei] 2013-02-13 13:05:44 PST

*** This bug has been marked as a duplicate of bug 819829 ***
Comment 11 Luke Crouch [:groovecoder] 2013-02-13 13:40:41 PST
Sorry, I made the PR against this bug, so I'm going to mark the other one as a dupe of this.
Comment 12 Luke Crouch [:groovecoder] 2013-02-13 13:41:28 PST
*** Bug 819829 has been marked as a duplicate of this bug. ***
Comment 13 James Socol [:jsocol, :james] 2013-02-14 07:08:15 PST
(From email:) This is blocking Japanese MWC efforts.
Comment 14 Les Orchard [:lorchard] 2013-02-14 08:23:29 PST
Okay, doing some data spelunking on this and looking through the localization logic. I think I have an answer for why this is happening, and will probably have a fix & a cleanup today.

Here are the details, for anyone who's interested:

* Many (but not all) pages in en-US trace their way up the topic path eventually to a page entitled "MDN" with a slug of "en"

* Back when we migrated to Kuma, we left "en/" out of the slugs. So, that root parent "MDN" page is there, but left unexpressed in en-US slugs.

* Back in September, bug 792418 introduced logic for translation that tries to match the en-US topic hierarchy for new translations. That means, if you create a new translation of HTML/HTML5/Image, but HTML and HTML/HTML5 don't yet exist in the target locale, the system tries to fill in the parents for the locale.

* But, there's this hidden "en" at the root of en-US. So, when a new translation is created and the parent-filling logic happens, the hidden "en" from en-US is cloned into the target locale and made visible (and thus part of the URL path).
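
The four points above can be modelled in a few lines. This is a simplified sketch, not the real translation code: the Doc class, its fields, and both slug functions are hypothetical and exist only to show how a hidden root segment leaks into a translated slug.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a wiki document; not Kuma's real model."""
    title: str
    own_slug: str                    # this page's own path segment
    parent: Optional["Doc"] = None   # parent topic, if any
    hidden_root: bool = False        # True for the legacy "MDN" root with slug "en"


def en_us_slug(doc):
    """Slug as en-US expresses it: the hidden root is left out of the path."""
    parts = []
    node = doc
    while node is not None:
        if not node.hidden_root:
            parts.append(node.own_slug)
        node = node.parent
    return "/".join(reversed(parts))


def translated_slug_buggy(doc):
    """Parent filling that clones every ancestor, hidden root included."""
    parts = []
    node = doc
    while node is not None:
        parts.append(node.own_slug)
        node = node.parent
    return "/".join(reversed(parts))


root = Doc("MDN", "en", hidden_root=True)
indexeddb = Doc("IndexedDB", "IndexedDB", parent=root)

print(en_us_slug(indexeddb))             # IndexedDB
print(translated_slug_buggy(indexeddb))  # en/IndexedDB
```

Removing the hidden root (so that IndexedDB has no parent) makes both functions agree, which is the idea behind the fix proposed further down.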

At present, it looks like there are 121 documents with "en/" leading the slug:

mysql> select count(*) from wiki_document where substr(slug, 1, 3) = 'en/';
+----------+
| count(*) |
+----------+
|      121 |
+----------+

That should be easy enough to clean up with some quick SQL.

To prevent this in the future, I think the solution is to just nuke that hidden "en" page in en-US. It's not doing anything vital. Then, also nuke its clones in all other locales. Any pages claiming those "en" roots as parents will themselves become topic roots, as expected, and this issue should not reoccur.

Since all the above calls for some possibly scary SQL manipulations / Django migrations, I'm downloading a copy of the site to try all this out on my laptop first. So, the fix will probably be quick, but I want to try it out safely before hitting the real site.
Comment 15 Les Orchard [:lorchard] 2013-02-14 08:27:17 PST
(In reply to Les Orchard [:lorchard] from comment #14)

> * Back when we migrated to Kuma, we left "en/" out of the slugs. So, that
> root parent "MDN" page is there, but left unexpressed in en-US slugs.

And also for the sake of explanation, there are similar hidden roots in most other locales (eg. nl/ for nl, ja/ for ja, etc). But, they haven't been a problem, because we do not support creating translations from any locale other than en-US.

If it doesn't hurt anything, I may remove those hidden roots in other locales, too.
Comment 16 John Karahalis [:openjck] 2013-02-14 08:35:12 PST
(In reply to Les Orchard [:lorchard] from comment #15)
> If it doesn't hurt anything, I may remove those hidden roots in other
> locales, too.

Probably a good idea, since we have talked about allowing people to use other languages as the "source" for a translation, for example using a Spanish document as the source of a Greek document.

Will this remove "MDN" from the breadcrumb? Removing it could affect our SEO and of course user experience, so if possible it might be nice to hardcode that in as part of this.
Comment 17 David Walsh :davidwalsh 2013-02-14 08:37:39 PST
We'll find a way to hardcode MDN into the template so it's not lost.
Comment 18 Les Orchard [:lorchard] 2013-02-14 11:24:03 PST
Okay, so I think this is fixed now. Reopen this bug if the en/ thing crops up again.

I did the following:

* Removed en/ from any slugs where it was present.

* In cases where removing en/ from the slug would result in a collision with an existing page, I added "-840092-dup" to the end of the slug.

* The above two things affected about 121 documents across locales.

* All pages affected by the above were tagged "bug-840092". For example, in ja locale:
    https://developer.mozilla.org/ja/docs/tag/bug-840092

* All pages with "-840092-dup" added were tagged "bug-840092-dup". For example, in ja locale:
    https://developer.mozilla.org/ja/docs/tag/bug-840092-dup

* All pages with slug "en" have had their children promoted to the site root, so that future localizations will not catch the "en/" slug prefix.

* All pages with slugs matching their locale have had their children promoted to the site root, so that future localizations from non-en-US locales will not run into this problem.
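
For anyone wanting to see the shape of the first two cleanup steps, here is a small self-contained sketch using an in-memory SQLite database. It is not the SQL that was actually run on the site: the table is reduced to the wiki_document name and slug column seen earlier plus an assumed locale column, the sample rows are made up from URLs in this bug, and the tagging and parent-promotion steps are left out.

```python
import sqlite3

# Toy stand-in for wiki_document; the locale column and UNIQUE constraint are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wiki_document ("
    " id INTEGER PRIMARY KEY, locale TEXT, slug TEXT,"
    " UNIQUE (locale, slug))"
)
conn.executemany(
    "INSERT INTO wiki_document (locale, slug) VALUES (?, ?)",
    [("fr", "en/IndexedDB"), ("fr", "IndexedDB"), ("ja", "en/Apps")],
)


def strip_en_prefix(conn):
    """Drop a leading 'en/' from slugs, suffixing '-840092-dup' on collisions."""
    rows = conn.execute(
        "SELECT id, locale, slug FROM wiki_document "
        "WHERE substr(slug, 1, 3) = 'en/' ORDER BY id"
    ).fetchall()
    for doc_id, locale, slug in rows:
        new_slug = slug[3:]
        taken = conn.execute(
            "SELECT 1 FROM wiki_document WHERE locale = ? AND slug = ?",
            (locale, new_slug),
        ).fetchone()
        if taken:
            # Keep the existing page at the clean slug; mark this one as a duplicate.
            new_slug += "-840092-dup"
        conn.execute(
            "UPDATE wiki_document SET slug = ? WHERE id = ?", (new_slug, doc_id)
        )
    conn.commit()


strip_en_prefix(conn)
slugs = sorted(conn.execute("SELECT locale, slug FROM wiki_document").fetchall())
print(slugs)
```

Running this leaves fr/IndexedDB untouched, renames fr/en/IndexedDB to IndexedDB-840092-dup, and renames ja/en/Apps to Apps.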
Comment 19 Les Orchard [:lorchard] 2013-02-14 11:30:49 PST
Oh, and yeah, the "MDN" breadcrumb is gone now. Here's a bug:

https://bugzilla.mozilla.org/show_bug.cgi?id=841461
Comment 20 Kohei Yoshino [:kohei] 2013-02-14 12:36:24 PST
WOW, https://developer.mozilla.org/ja/docs/Apps is finally accessible! Thank you :)
Comment 21 Les Orchard [:lorchard] 2013-02-14 12:39:39 PST
Oh, and feel free to remove the bug-840092 and bug-840092-dup tags whenever those pages are next edited. They're just there so we have something to collect the pages I touched, in case anything went wrong during the process.
Comment 22 ziyunfei 2013-02-14 21:12:49 PST
Firefox has detected that the server is redirecting the request for this address in a way that will never complete.


https://developer.mozilla.org/zh-cn/docs/CSS/transform
Comment 23 Kohei Yoshino [:kohei] 2013-02-15 02:52:55 PST
(In reply to 446240525 from comment #22)
> Firefox has detected that the server is redirecting the request for this
> address in a way that will never complete.

I think that is Bug 818477.
Comment 24 Les Orchard [:lorchard] 2013-02-15 06:40:36 PST
(In reply to 446240525 from comment #22)
> Firefox has detected that the server is redirecting the request for this
> address in a way that will never complete.
> 
> 
> https://developer.mozilla.org/zh-cn/docs/CSS/transform

If you ever see a case like this, try adding ?redirect=no to the URL to see if there's a REDIRECT to itself in the page source. That should allow you to edit the page.
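
If you need to do this for more than one page, a small helper along these lines can build the URL. The function name is made up for illustration; only the redirect=no parameter comes from the advice above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_redirect_disabled(url):
    """Return url with redirect=no in its query string, replacing any existing value."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "redirect"
    ]
    query.append(("redirect", "no"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_redirect_disabled("https://developer.mozilla.org/zh-cn/docs/CSS/transform"))
# https://developer.mozilla.org/zh-cn/docs/CSS/transform?redirect=no
```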
