Closed Bug 857621 Opened 12 years ago Closed 12 years ago

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kats, Assigned: cturra)

References

Details

Attachments

(1 file)

https://wiki.mozilla.org/robots.txt disallows everything. I was searching for something on Google today and got the attached (I added the red outline for emphasis). Is this intended?
This is making it hard to find things on wikimo - and killing the site's google-fu. It's probably too late for the latter...
This seems like a major regression.
Severity: normal → major
Seriously.
Severity: major → critical
Andrei, "critical" means "all work stops". Think of it as "I'm willing to have a meeting with Mitchell to explain this". So, not this bug. Callek, what do you think? Who can make this call? Sadly, it's probably too late for the site's pagerank..
Severity: critical → normal
Personally I'm ok if Google indexes/crawls wiki.m.o; nothing on there should ever be private such that a crawl would expose it. The only downside is that it makes the wiki a larger target for spammers (since they can get their pagerank artificially increased), but I think we're in a better state right now re: newly incoming spam than we used to be. Final say lies with webops, imho.
Assignee: nobody → server-ops-webops
Component: wiki.mozilla.org → Server Operations: Web Operations
Product: Websites → mozilla.org
QA Contact: nmaul
Version: unspecified → other
(In reply to Dustin J. Mitchell [:dustin] from comment #4)
> Andrei, "critical" means "all work stops". Think of it as "I'm willing to
> have a meeting with Mitchell to explain this". So, not this bug.

I thought that was blocker :p Anyway, I think it is pretty critical for wikimo to be Google-searchable, because if it's not, that makes it considerably harder for anyone to find anything on it, especially people looking for things they don't *know* are even on the wiki.
I agree this really isn't a blocker -- blockers are reserved for scenarios like "omg, the site is down!!111one". I have applied a new robots.txt file for the wiki that should be a little less restrictive for bots. Note, I have added a Disallow list for the top known "scraper" bots, plus many of our "Special:..." pages. *Interesting note on that: I chose not to use wildcard matching here, since only the Google and Microsoft Bing bots honor wildcards.
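For illustration, here is a minimal sketch of a robots.txt along those lines. The specific bot names and page paths are assumptions for the example, not the exact file that was deployed:

# Allow everyone by default, but keep crawlers out of a few Special: pages
# (paths here are illustrative)
User-agent: *
Disallow: /Special:Search
Disallow: /Special:Random

# Known scraper bots blocked entirely (example entries)
User-agent: MJ12bot
Disallow: /

User-agent: AhrefsBot
Disallow: /

Consistent with the comment above, the sketch avoids wildcard patterns (e.g. "Disallow: /*action="), since only the Google and Bing crawlers honor them.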
Assignee: server-ops-webops → cturra
Status: NEW → RESOLVED
Closed: 12 years ago
OS: Mac OS X → All
Hardware: x86 → All
Resolution: --- → FIXED
Thanks! I apologize for the somewhat aggressive severity setting :p
This seems to have regressed. The current robots.txt disallows all robots:

User-agent: *
Disallow: /
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Looks like a recent MediaWiki update reverted the robots.txt back to the vanilla one that is shipped with the product. I have updated this again and will look into how we can make sure our custom file gets applied with future upgrades.
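This is just a hedged sketch of one way to catch the regression after future upgrades, not what webops actually put in place: fetch the live robots.txt and fail loudly if the blanket disallow has come back. The URL is from this bug; everything else is an assumption for illustration.

import sys
import urllib.request

# Live file to verify after an upgrade (URL taken from this bug)
ROBOTS_URL = "https://wiki.mozilla.org/robots.txt"

def robots_blocks_everything(text):
    """Simple line-based check for the vanilla 'block all crawlers' rule:
    a 'Disallow: /' that applies to 'User-agent: *'."""
    agent = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agent = value
        elif field == "disallow" and agent == "*" and value == "/":
            return True
    return False

def main():
    with urllib.request.urlopen(ROBOTS_URL, timeout=30) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    if robots_blocks_everything(text):
        print("robots.txt has reverted to the blanket 'Disallow: /' for all agents")
        sys.exit(1)
    print("robots.txt looks OK")

if __name__ == "__main__":
    main()

Something like this could run from cron or a monitoring check after each MediaWiki upgrade so a reverted file gets noticed before pagerank takes another hit.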
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
See Also: → 1970197