Closed Bug 839964 Opened 11 years ago Closed 11 years ago

add enter_bug.cgi to the disallow list in robots.txt

Categories

(bugzilla.mozilla.org :: General, defect)

Production
x86
macOS
defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: glob, Assigned: dkl)

Details

Attachments

(1 file)

as a result of requests like these:

> [10/Feb/2013:23:42:35 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 139
> [10/Feb/2013:23:42:35 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 140
> [10/Feb/2013:23:42:37 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 126
> [10/Feb/2013:23:42:37 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 156
> [10/Feb/2013:23:42:38 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 120
> [10/Feb/2013:23:42:38 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 120
> [10/Feb/2013:23:42:39 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 113
> [10/Feb/2013:23:42:39 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 117
> [10/Feb/2013:23:42:40 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 126
> [10/Feb/2013:23:42:40 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 129
> [10/Feb/2013:23:42:41 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 127
> [10/Feb/2013:23:42:42 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 150
> [10/Feb/2013:23:42:42 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 87
> [10/Feb/2013:23:42:43 -0800] "GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 "https://bugzilla.mozilla.org" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0 bugzilla.mozilla.org 90
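To put a number on a burst like the one quoted above, the log lines can be tallied per second. This is only a rough sketch: the regular expression is inferred from the quoted lines (the meaning of the trailing fields is not known, so they are ignored), and `hits_per_second` is a hypothetical helper name, not part of Bugzilla.

```python
import re
from collections import Counter

# Pattern inferred from the quoted log lines (an assumption, not a documented format):
# [timestamp] "METHOD path PROTO" status bytes "referer" "user-agent" ...
LOG_LINE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_per_second(lines, path_prefix="/enter_bug.cgi", agent_substr="Googlebot"):
    """Count matching requests per timestamp (the log has one-second resolution)."""
    counts = Counter()
    for line in lines:
        # Drop a leading "> " quote marker, if any, before matching.
        match = LOG_LINE.match(line.lstrip("> ").strip())
        if not match:
            continue
        if match["path"].startswith(path_prefix) and agent_substr in match["agent"]:
            counts[match["ts"]] += 1
    return counts


# Three of the lines quoted in this bug, split only for readability.
_TAIL = (
    '"GET /enter_bug.cgi?product=Core&component=WebRTC HTTP/1.0" 301 284 '
    '"https://bugzilla.mozilla.org" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" '
    "0 bugzilla.mozilla.org "
)
SAMPLE = [
    "> [10/Feb/2013:23:42:35 -0800] " + _TAIL + "139",
    "> [10/Feb/2013:23:42:35 -0800] " + _TAIL + "140",
    "> [10/Feb/2013:23:42:37 -0800] " + _TAIL + "126",
]

counts = hits_per_second(SAMPLE)
for timestamp, hits in sorted(counts.items()):
    print(timestamp, hits)
```

Run against a full access log, the same tally would show whether the request rate is sustained or has died down.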

bots are always going to get the same response from enter_bug.cgi ("please login"), so we should just block that cgi.
according to google's webmaster tools, it's already blocked:

http://bugzilla.mozilla.org/enter_bug.cgi
Blocked by line 5: Disallow: /
As you noted, and as I have also confirmed with third-party robots.txt analyzers, the enter_bug.cgi page should already be disallowed by the existing Disallow: / rule, so it seems Googlebot is not honoring it properly. We can try adding an explicit disallow for a while anyway and see if it helps.

dkl
Assignee: nobody → dkl
Status: NEW → ASSIGNED
Attachment #714118 - Flags: review?(glob)
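The robots.txt reasoning in the comment above can be checked offline with Python's standard `urllib.robotparser`. This is a minimal sketch, not the analyzer anyone in this bug used: the two robots.txt bodies are hypothetical stand-ins (the real file's full contents are not shown here), and `is_allowed` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt with a blanket rule, as reported by the webmaster tools.
BLANKET = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt with only the explicit rule proposed in this bug.
EXPLICIT = """\
User-agent: *
Disallow: /enter_bug.cgi
"""

ENTER_BUG = "https://bugzilla.mozilla.org/enter_bug.cgi?product=Core&component=WebRTC"
SHOW_BUG = "https://bugzilla.mozilla.org/show_bug.cgi?id=839964"

blanket_allows = is_allowed(BLANKET, "Googlebot", ENTER_BUG)
explicit_allows = is_allowed(EXPLICIT, "Googlebot", ENTER_BUG)
explicit_allows_show_bug = is_allowed(EXPLICIT, "Googlebot", SHOW_BUG)
print(blanket_allows, explicit_allows, explicit_allows_show_bug)
```

Both variants refuse enter_bug.cgi, which matches the observation that the explicit rule is redundant for a crawler that honors Disallow: /. Note that `urllib.robotparser` applies the first matching rule in file order, which can differ from Google's longest-match behaviour when Allow and Disallow rules are mixed.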
thanks for the patch, and for testing with other analysers.
let's leave it .. the torrent of requests has stopped.

if it ramps up again, we should try to find the source IP and determine if these are legit googlebot requests.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Attachment #714118 - Flags: review?(glob)
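For the follow-up suggested above (checking whether a source IP really belongs to Googlebot), Google's documented approach is a reverse DNS lookup followed by a forward lookup that must resolve back to the same IP. The sketch below assumes that procedure. The function names are hypothetical, the accepted hostname suffixes are an assumption worth checking against Google's current documentation, and the resolvers are injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Googlebot reverse DNS records.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """A lookup: hostname to IPv4 addresses (gethostbyname_ex is IPv4-only)."""
    return socket.gethostbyname_ex(host)[2]


def is_real_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Reverse-resolve ip, check the domain, then confirm with a forward lookup."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

A request whose User-Agent claims to be Googlebot but whose IP fails this check is most likely a spoofed crawler, in which case robots.txt changes would not be expected to help.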