Closed Bug 1393770 Opened 7 years ago Closed 6 years ago

add search results to robots.txt

Categories

(developer.mozilla.org Graveyard :: General, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: atopal, Unassigned)

Details

(Keywords: in-triage)

About 2% of our daily Google crawl budget is spent on search results pages that provide no value to users. Please add them to robots.txt so Googlebot knows to skip them in the future.

Example: 
/en-US/search?q=%E6%B3%B0%E5%B7%9E%E5%8A%9E%E7%9C%9F%E5%AE%9E%E5%87%BA%E7%94%9F%E8%AF%811300252952Qqfrtznv
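
As a rough illustration of the request, the sketch below checks a hypothetical `Disallow` rule with Python's standard `urllib.robotparser`. The rule text and the `is_allowed` helper are assumptions for illustration only; the actual MDN robots.txt is not shown in this bug. `urllib.robotparser` does plain prefix matching, so it only approximates how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule (an assumption): block the en-US search results path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en-US/search
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if the sample robots.txt lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed("/en-US/search?q=robots"))  # False
print(is_allowed("/en-US/docs/Web/HTML"))    # True
```

A prefix rule like this covers a single locale; other locales would each need their own line, or a wildcard pattern, which Google supports but this parser does not.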

Search results are currently marked to be excluded from the index:

<meta name="robots" content="noindex, follow">

It may be worth re-reading Google's FAQ on robots.txt:

https://support.google.com/webmasters/answer/6062608?hl=en

The last item explains that a page blocked by robots.txt may still be indexed and appear in search results if an external page links to it.

My worry is that once search results are blocked in robots.txt, Googlebot will no longer fetch them and so will never see the noindex tag. They could then start appearing in search results again, but with the "Sorry, the site doesn't allow us to show a description here" message.

It may be worth checking (via GA or the Apache logs) whether we link to search results internally, from templates or MDN content, which would make the issue worse.
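
One way to run that check against the Apache logs, as a minimal sketch: it assumes the "combined" log format and counts internal pages that appear as the referrer for search result requests. The `internal_search_referrers` helper, the URL pattern, and the sample log lines are all made up for illustration and are not taken from MDN's setup.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumes Apache "combined" format:
#   ... "METHOD path PROTO" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"'
)
# Assumed URL shape for search results: /<locale>/search or /<locale>/search?...
SEARCH_RE = re.compile(r"^/[^/]+/search(?:\?|$)")


def internal_search_referrers(lines, site_host="developer.mozilla.org"):
    """Count internal referrer paths that led to search result URLs."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match or not SEARCH_RE.match(match.group("path")):
            continue
        referer = urlparse(match.group("referer"))
        if referer.hostname == site_host:
            counts[referer.path] += 1
    return counts


# Made-up sample lines, not real traffic.
SAMPLE = [
    '1.2.3.4 - - [01/Jan/2000:10:00:00 +0000] "GET /en-US/search?q=flexbox HTTP/1.1" '
    '200 5120 "https://developer.mozilla.org/en-US/docs/Web/CSS" "Mozilla/5.0"',
    '1.2.3.4 - - [01/Jan/2000:10:00:01 +0000] "GET /en-US/search?q=grid HTTP/1.1" '
    '200 4096 "https://example.com/page" "Mozilla/5.0"',
    '1.2.3.4 - - [01/Jan/2000:10:00:02 +0000] "GET /en-US/docs/Web/HTML HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0"',
]

print(internal_search_referrers(SAMPLE))
```

Pages with high counts would be the first places to look for template or content links to search results.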
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
Product: developer.mozilla.org → developer.mozilla.org Graveyard