Closed Bug 261901 Opened 20 years ago Closed 18 years ago

https redirect is applied to robots.txt and index.cgi - what robot uses SSL?

Categories

(bugzilla.mozilla.org :: General, defect)

x86
Windows 2000
defect
Not set
minor

Tracking

VERIFIED WONTFIX

People

(Reporter: bugzilla-mozilla, Assigned: myk)

Details

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461; Q312461)
Build Identifier: 

When bug 58300 (SSL for bugzilla) was introduced, a redirect was applied 
globally irrespective of whether pages contain content that should be secured 
or not.

Besides being unnecessary for most human users, the redirect means that 
well-behaved robots such as GoogleBot receive responses they do not expect. 
Because many external pages link to bugzilla.mozilla.org, robots will come to 
the site expecting to find the linked content.

Since there is no valid response to the request for robots.txt, the crawler 
could legitimately attempt to retrieve many pages from the site (any bug that 
is mentioned on any external page), receiving a "301 Moved Permanently" for 
each one. This could waste bandwidth and resources of both the robot and 
mozilla.org.

Reproducible: Always
Steps to Reproduce:
> telnet bugzilla.mozilla.org 80
GET /robots.txt HTTP/1.0
Host: bugzilla.mozilla.org
User-Agent: slee-telnet-test
Actual Results:  
HTTP/1.1 301 Moved Permanently
Date: Tue, 28 Sep 2004 08:41:56 GMT
Server: Apache/1.3.27 (Unix)  (Red-Hat/Linux) mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26
Location: https://bugzilla.mozilla.org/robots.txt
[...]

Expected Results:  
HTTP/1.1 200 OK
[...]
User-agent: *
Allow: /index.cgi
Disallow: /
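
The telnet steps above could also be scripted. The following is a minimal sketch, not part of the original report: the helper names `probe` and `is_https_redirect` are made up for illustration, and Python's `http.client` sends an HTTP/1.1 request rather than the HTTP/1.0 request shown in the steps. `http.client` does not follow redirects, so the 301 remains visible to the caller.

```python
import http.client


def probe(host, path, user_agent="slee-telnet-test", port=80, timeout=10):
    """Send one plain-HTTP GET and return (status, Location header or None).

    Hypothetical helper: it mirrors the telnet session in the report but is
    not taken from it.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # http.client adds the Host header itself and never follows redirects.
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_https_redirect(status, location):
    """Return True when a response redirects the client to an https:// URL."""
    return (
        status in (301, 302, 307, 308)
        and bool(location)
        and location.lower().startswith("https://")
    )


if __name__ == "__main__":
    # Network access is needed only for this part.
    status, location = probe("bugzilla.mozilla.org", "/robots.txt")
    print(status, location, is_https_redirect(status, location))
```

Run against the server as it behaved in 2004, this would presumably have reported the 301 and the https Location shown under Actual Results; what it reports today depends on the current server configuration.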

If the robots.txt remains as above, index.cgi should also be accessible 
via HTTP. This page could also serve to explain to human users why they 
must now use https to access bug reports.
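
To illustrate what the robots.txt rules under Expected Results would permit, here is a small sketch using Python's standard `urllib.robotparser`. It is an illustration only: the `build_parser` helper and the `show_bug.cgi` example path are assumptions, and Python's parser applies the first matching rule, which may differ from how a particular crawler resolves Allow/Disallow conflicts.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted under "Expected Results" in the report.
ROBOTS_TXT = """\
User-agent: *
Allow: /index.cgi
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text directly, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The second path is an illustrative bug URL, not quoted from the report.
for path in ("/index.cgi", "/show_bug.cgi?id=261901"):
    print(path, parser.can_fetch("slee-telnet-test", path))
```

Under these rules only index.cgi is open to robots, which is why the report asks for that page to be reachable over plain HTTP as well.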
Oops - wrong product  :-[
Assignee: justdave → myk
Component: bugzilla.org → Bugzilla: Other b.m.o Issues
Product: Bugzilla → mozilla.org
QA Contact: mattyt-bugzilla → justdave
Version: 2.17.6 → other
I would think that a well-behaved robot would follow the redirect.  Google's most certainly does, and they index https sites just fine if they're public.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → WONTFIX
(In reply to comment #2)
> I would think that a well-behaved robot would follow the redirect.  Google's
> most certainly does, and they index https sites just fine if they're public.

Indeed.

I'd forgotten about this bug - clearly I was suffering from "failing to think for 5 minutes about a problem before reporting" syndrome... looks like I elevated it from a speculative aside comment in bug 58300 Comment #34 to a full bug report without giving it any further thought.
Status: RESOLVED → VERIFIED
Component: Bugzilla: Other b.m.o Issues → General
Product: mozilla.org → bugzilla.mozilla.org