Closed
Bug 892189
Opened 12 years ago
Closed 12 years ago
Add robots.txt to block spiders to download.mozilla.org
Categories
(Webtools :: Bouncer, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: cmore, Unassigned)
References
(Blocks 1 open bug)
Details
Can we add a robots.txt file to http://download.mozilla.org to block search engines from spidering that URL? That URL should not be showing up in search results, as the query parameters passed to it may not be specific to the search query. It is showing up on Bing. I've blocked it on Bing, but that is just temporary.
For example, if you search Bing for firefox there is a "deep link" called "Download" that points to: https://download.mozilla.org/?product=firefox-stub&os=win&lang=en-US. That is the Windows build, and my search came from a Mac.
To block all spiders from all URLs on d.m.o, the robots.txt file should be:
User-agent: *
Disallow: /
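As a sanity check on the two rules above, here is a minimal sketch using Python's standard urllib.robotparser to confirm that they disallow the Bing deep link for any crawler. The helper name is_allowed and the agent strings are illustrative assumptions, not part of Bouncer.

```python
from urllib.robotparser import RobotFileParser

# The rules proposed in this bug, verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether
    user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The deep link reported in the description.
    url = "https://download.mozilla.org/?product=firefox-stub&os=win&lang=en-US"
    for agent in ("bingbot", "Googlebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

Note that robots.txt only asks well-behaved crawlers to stay away; it does not remove already-indexed URLs on its own schedule or enforce anything server-side.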
Comment 1•12 years ago
Perhaps this should be handled by WebOps as a server setup issue?
Reporter | ||
Comment 2•12 years ago
jakem: should we switch components for this? I didn't know a better place to start with this one.
Flags: needinfo?(nmaul)
Reporter | ||
Comment 3•12 years ago
cturra: can you help here? We are getting a lot of invalid downloads from bing.com and robots.txt should be used to prevent this for all search engines.
Assignee: nobody → server-ops-webops
Component: Bouncer → Server Operations: Web Operations
Flags: needinfo?(cturra)
Product: Webtools → mozilla.org
QA Contact: nmaul
Version: Trunk → other
Comment 4•12 years ago
Webops does not manage content. Moving to the webtools component.
Assignee: server-ops-webops → nobody
Component: Server Operations: Web Operations → Bouncer
Flags: needinfo?(nmaul)
Flags: needinfo?(cturra)
Product: mozilla.org → Webtools
QA Contact: nmaul
Reporter | ||
Comment 5•12 years ago
(In reply to Daniel Maher [:phrawzty] (afk 24-07-2013 through 04-08-2013) from comment #4)
> Webops does not manage content. Moving to the webtools component.
That's where it was initially. :nthomas said to pass it over to WebOps, and now WebOps is saying to pass it over to the Webtools team.
Laura/Jakem/nthomas: Can we just decide who is going to make this change? Bing is serving up links to en-US Windows builds to all users regardless of their locale or operating system. A robots.txt is the way to prevent this from happening. d.m.o should not be indexed by robots.
Comment 6•12 years ago
/CC bsavage as someone who touched bouncer this year
Apologies if that was wrong, it was a guess based on download.m.o being a PHP app. Perhaps this is as simple as dumping the robots.txt at https://github.com/mozilla/tuxedo/tree/master/bouncer/php/, or maybe it needs some Apache work too.
Comment 7•12 years ago
Fixed in https://github.com/mozilla/tuxedo/commit/226d71cfa6b85103aaf66631b6288e2d2afb92fc
All this needs now is to be deployed.
Updated•12 years ago
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 8•12 years ago
(In reply to Brandon Savage [:brandon] from comment #7)
> Fixed in
> https://github.com/mozilla/tuxedo/commit/226d71cfa6b85103aaf66631b6288e2d2afb92fc
>
> All this needs now is to be deployed.
Thanks, Brandon! What is the normal deployment process for this?
Try searching for site:download.mozilla.org on both Google and Bing. Google totally ignores the domain, as it should, and Bing has been indexing it all along.
Comment 9•12 years ago
Let me talk with :laura and :jakem this morning and see about getting this shipped. Assuming that we are already serving the latest master (before this change), it should quite literally be trivial.
Reporter | ||
Comment 10•12 years ago
(In reply to Brandon Savage [:brandon] from comment #9)
> Let me talk with :laura and :jakem this morning and see about getting this
> shipped. Assuming that we are already serving the latest master (before this
> change) it should quite literally be trivial.
Can we get an update here?
Flags: needinfo?(bsavage)
Comment 12•12 years ago
QA verified on stage and prod; automation also passes. Thanks to Brandon for adding a test for robots.txt.
Status: RESOLVED → VERIFIED
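The test mentioned in comment 12 is not shown in this bug. As a rough sketch of what such a check might assert, the hypothetical function below takes an HTTP status and response body for /robots.txt and reports whether the "User-agent: *" group contains "Disallow: /". It is deliberately narrow (it ignores Allow overrides and wildcard paths) and is an assumption about the shape of the check, not the actual test in the tuxedo repository.

```python
def blocks_all_crawlers(status: int, body: str) -> bool:
    """Hypothetical check: True if a robots.txt response was served
    successfully and its wildcard group disallows the site root."""
    if status != 200:
        return False

    agents = []        # user-agents named by the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                return True
    return False
```

In a real smoke test the status and body would come from fetching /robots.txt on stage or prod; keeping the check a pure function makes it easy to exercise without network access.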
Reporter | ||
Comment 13•12 years ago
(In reply to Brandon Savage [:brandon] from comment #11)
> This has been pushed to production.
Thanks!