Closed Bug 787231 Opened 12 years ago Closed 12 years ago

API results appearing in search engines

Categories

(developer.mozilla.org Graveyard :: Wiki pages, defect, P1)

x86_64
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: openjck, Unassigned)

Details

Attachments

(2 files)

Some of our API results are appearing in search engine results. Additionally, because we use Google to power our in-site search, many of the API results are also appearing there.

See the screenshots below.
Attached image API results in Google
This post presents a few solutions:

http://yoast.com/x-robots-tag-play/

We need to decide whether to do this with .htaccess headers or Python headers.
Mentioned this in IRC. The article only mentions Google and Yahoo supporting this. Do we care about other search engines, like Bing?

Jean-Yves: Thoughts?
Bing supports REP: robots.txt, the <meta> values, and the X-Robots-Tag header are all supported: http://www.bing.com/community/site_blogs/b/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx

As for .htaccess versus Python headers, I have no strong opinion. Note that we probably want to edit the .htaccess anyway to add Cache-Control/Expires headers for images (for another reason)…
FWIW, we blogged about the HTML5 ?raw URL, which is probably where Google found it:

https://hacks.mozilla.org/2012/08/kuma-cool-url-tricks/
Also, this turns up only 19 results, maybe not that big a deal?

https://www.google.com/search?q=site%3Adeveloper.mozilla.org+inurl%3Araw
That said, a quick fix is probably to just send this header out when ?raw is present:

X-Robots-Tag: noindex

https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag
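For illustration, the quick fix described above could be sketched as a small WSGI wrapper using only the Python standard library. This is not the code from the Kuma commit; the class name `RawNoIndexMiddleware` is hypothetical, and the sketch assumes the application is served through WSGI and that `?raw` arrives as a bare query parameter.

```python
from urllib.parse import parse_qs


class RawNoIndexMiddleware:
    """Hypothetical WSGI wrapper: adds X-Robots-Tag: noindex to ?raw responses."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # keep_blank_values=True so a bare "?raw" (no "=value") is still detected.
        params = parse_qs(environ.get("QUERY_STRING", ""), keep_blank_values=True)
        if "raw" not in params:
            # Normal pages stay indexable: pass the request through untouched.
            return self.app(environ, start_response)

        def start_with_noindex(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not duplicated.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)

        return self.app(environ, start_with_noindex)
```

Wrapping the application object (for example `application = RawNoIndexMiddleware(application)`) would leave ordinary pages untouched while marking `?raw` responses as non-indexable. The same effect could be had by setting the header directly in the view.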
Priority: -- → P1
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/12050dd197c048f4213d16e0dca3d515df2d3169
fix bug 787231 - Add X-Robots-Tag header to ?raw

https://github.com/mozilla/kuma/commit/d43dd454c81e3d243d8bdb6c9b25cac94b57c678
Merge pull request #589 from darkwing/api-result-787231

fix bug 787231 - Add X-Robots-Tag header to ?raw
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Verified fixed:
curl -I 'https://developer-dev.allizom.org/en-US/docs/CSS/Getting_Started?raw'
HTTP/1.1 200 OK
Date: Thu, 13 Sep 2012 19:19:28 GMT
Server: Apache
X-Robots-Tag: noindex
Vary: Cookie
x-kuma-revision: 3358
x-frame-options: Allow
Set-Cookie: csrftoken=63cdf11cb9b40051da8333ac736fa8ee; Max-Age=31449600; Path=/
X-Backend-Server: developer1.dev.webapp.scl3.mozilla.com
Last-Modified: Sun, 05 Aug 2012 17:57:12 GMT
Content-Length: 7162
Content-Type: text/html; charset=utf-8
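A check like the one above could also be scripted. The following is a minimal sketch, assuming the response headers have already been captured as text (for example from `curl -I`); the function name `has_noindex` is hypothetical, and the sketch does not handle the user-agent-prefixed form of the header (such as `googlebot: noindex`).

```python
def has_noindex(header_block):
    """Return True if a raw HTTP header block carries X-Robots-Tag: noindex."""
    for line in header_block.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # status line or blank line: no "name: value" pair
        if name.strip().lower() == "x-robots-tag":
            # The header may list several comma-separated directives.
            directives = [d.strip().lower() for d in value.split(",")]
            if "noindex" in directives:
                return True
    return False
```

Header names are compared case-insensitively, since HTTP does not guarantee their capitalization.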
Status: RESOLVED → VERIFIED
Version: Kuma → unspecified
Component: Website → Landing pages
Product: developer.mozilla.org → developer.mozilla.org Graveyard