Open Bug 1970197 Opened 18 days ago Updated 17 days ago

Weird robots.txt file for wiki.mozilla.org; not being indexed by Google

Categories

(Websites :: wiki.mozilla.org, defect)

Tracking

(Not tracked)

People

(Reporter: mccr8, Unassigned, NeedInfo)

References

Details

Somebody noticed that Mozilla wiki pages don't seem to show up in Google search. For instance, if you search for "mozilla wiki Security Severity Ratings/Client" you don't get the wiki page you'd expect.

I wondered what was going on with the robots.txt file, so I checked it. Apparently the page was added in 2017, and it is entirely in Russian? I'm not sure whether that is what keeps Google from crawling the wiki, or whether keeping crawlers out is intentional, but I would have expected the wiki to be crawled.

Bugzilla's suggestions turned up bug 857621, from 2013, about the robots.txt page not working for this site.

wiki.mozilla.org doesn't have a robots.txt file; it has a redirect to the wiki article Robots.txt. wikipedia.org solves this problem by not having their wiki at the root path, so they can have both https://en.wikipedia.org/robots.txt and https://en.wikipedia.org/wiki/Robots.txt.

The content of our Robots.txt seems reasonable for an article describing robots.txt, but I don't know why it's in Russian or what it has to do with Mozilla. Even if we delete it, though, that won't change much: the wiki software running the site will still return the redirect to Robots.txt, and the bots will still get wiki HTML content ("there is no text in this page.... you can edit it"). Whether or not the bots follow redirects, they get HTML content instead of the expected robots.txt format. The deleted page will return a 404 status code, though, which the bots might pay attention to.
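
To make the status-code point concrete, here is a minimal sketch of how one crawler-side library would treat these responses. It mirrors the rules in Python's standard `urllib.robotparser` (401/403 means disallow everything, any other 4xx means allow everything, otherwise parse the body). The helper name `crawl_allowed`, the sample HTML, and the page URL are made up for illustration, and Googlebot's actual behavior may differ from this.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(status: int, body: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: may user_agent fetch url, given the status
    code and body that the site returned for /robots.txt?"""
    if status in (401, 403):
        # RobotFileParser.read() treats these as "disallow everything".
        return False
    if 400 <= status < 500:
        # Other 4xx codes (including 404) are treated as "no robots.txt".
        return True
    parser = RobotFileParser()
    # Lines that are not valid robots.txt directives are simply ignored.
    parser.parse(body.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed stand-ins for what the wiki serves; not copied from the live site.
WIKI_HTML = "<html><body>There is currently no text in this page.</body></html>"
PAGE = "https://wiki.mozilla.org/Security_Severity_Ratings/Client"

print(crawl_allowed(404, WIKI_HTML, "Googlebot", PAGE))   # wiki HTML with a 404
print(crawl_allowed(200, WIKI_HTML, "Googlebot", PAGE))   # wiki HTML with a 200
print(crawl_allowed(200, "User-agent: *\nDisallow: /", "Googlebot", PAGE))
```

Under these rules, HTML served in place of robots.txt does not block crawling on its own, with either a 404 or a 200; only an explicit Disallow rule (or a 401/403) does. That fits the suspicion that the odd Robots.txt page may not be what keeps Google away.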

I get reasonable search results from DuckDuckGo, so I'm not convinced this is why Google isn't indexing us (though it does look like they aren't: even explicit site:wiki.mozilla.org searches don't find much).

The one entry in the history of Robots.txt reads as though the page was undeleted by an admin. Maybe Spike remembers the history? (That was a long time ago, though.)

Flags: needinfo?(spike)