Closed Bug 457549 Opened 11 years ago Closed 11 years ago

Create robots.txt to prevent user info pages from showing up on Google

Categories

(addons.mozilla.org Graveyard :: Public Pages, defect)

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: wenzel, Assigned: wenzel)

Attachments

(2 files)

People have been complaining that they do not want their user info page on AMO to be found when googling their name. I agree: user info pages are of little use as a search engine "entrance point" to AMO.

We should make appropriate changes to robots.txt and/or meta tags.

This is not directly related but should probably be tackled together with bug 442498.
Depends on: 371779
Duplicate of this bug: 461361
I can fix this as part of TM 4.0.4.
Assignee: nobody → fwenzel
Summary: User info pages show up on Google → Create robots.txt to prevent user info pages from showing up on Google
Target Milestone: --- → 4.0.4
I am excluding robots from the user info pages and, while I am at it, from the search results pages as well.

There's also an .htaccess rule to forward /robots.txt to the right page. I am quite sure that'll work (it works well here), but we should double-check when this hits preview.
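For illustration only, here is a minimal Python sketch of generating this kind of robots.txt. It is not the code from the attached patch. The locale and application lists are hypothetical, the "user/" segment follows the URL shape quoted later in this bug (/en-US/firefox/user/9873), and "search" is an assumed name for the search results path.

```python
from itertools import product

# Hypothetical lists; the real site serves many more locales and applications.
LOCALES = ["en-US", "de"]
APPS = ["firefox", "thunderbird"]

# Path segments to hide from crawlers. "user/" follows the URL shape quoted
# in this bug; "search" is an assumed name for the search results path.
BLOCKED_SECTIONS = ["user/", "search"]


def build_robots_txt(locales=LOCALES, apps=APPS, sections=BLOCKED_SECTIONS):
    """Return robots.txt text blocking each section under every locale/app prefix."""
    lines = ["User-agent: *"]
    for locale, app, section in product(locales, apps, sections):
        # Disallow rules are prefix matches, so "/en-US/firefox/user/"
        # covers every user info page below it.
        lines.append(f"Disallow: /{locale}/{app}/{section}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(), end="")
```

Enumerating the prefixes avoids relying on wildcard support, which not every crawler implements.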
Attachment #349664 - Flags: review?(morgamic)
Status: NEW → ASSIGNED
Attachment #349664 - Flags: review?(morgamic) → review+
Thanks, committed: r20224.

I'll keep this open just a little bit longer until I have made sure that the rewrite rule works as expected on preview.
Keywords: push-needed
Ah, there was a mistake in the .htaccess file: the rewrite was missing a target application. Fixed in r20235.

I'll double check that it works fine on Preview but I am confident.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
FYI, it works flawlessly on preview. However, I noticed that we had a Disallow: / rule in place for preview before (so preview won't show up on Google), so I probably need to add that to the code. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Fixed in r20244: It is now "Allow: *" in production, "Disallow: *" otherwise.
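A rough Python sketch of the production/non-production switch described above, assuming a hypothetical environment name and helper function (this is not the r20244 code). It uses the conventional "Disallow: /" form mentioned earlier for preview rather than a "*" pattern.

```python
def robots_txt_for(environment, blocked_paths=()):
    """Return robots.txt text for the given (hypothetical) environment name."""
    lines = ["User-agent: *"]
    if environment == "production":
        if blocked_paths:
            # Crawling is allowed by default; list only the exceptions.
            lines.extend(f"Disallow: {path}" for path in blocked_paths)
        else:
            # An empty Disallow value means nothing is blocked.
            lines.append("Disallow:")
    else:
        # Keep preview and other non-production hosts out of search engines.
        lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(robots_txt_for("production", ["/en-US/firefox/user/"]), end="")
    print(robots_txt_for("preview"), end="")
```

Defaulting to the restrictive branch means an unrecognized environment name blocks crawling instead of exposing the site.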
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
On https://preview.addons.mozilla.org/en-US/firefox/user/9873, for example, viewing source still reveals:

<meta name="ROBOTS" content="ALL"/>

Is this the right way to test?  Normally .htaccess changes need aravind to run a script to update them, right?
(In reply to comment #8)
> On https://preview.addons.mozilla.org/en-US/firefox/user/9873, for example,
> viewing source still reveals:
> 
> <meta name="ROBOTS" content="ALL"/>
> 
> Is this the right way to test?

No, the file in question is https://preview.addons.mozilla.org/robots.txt which will be read by search engine crawlers.
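One way to check the crawler's view without waiting for a search engine is Python's standard urllib.robotparser. The sketch below parses robots.txt text directly so it runs offline. The sample rules are an assumption about what the deployed file might contain, not its actual contents.

```python
from urllib.robotparser import RobotFileParser

# Assumed sample rules; the deployed robots.txt may differ.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /en-US/firefox/user/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://preview.addons.mozilla.org"
    print(is_crawlable(SAMPLE_ROBOTS_TXT, base + "/en-US/firefox/user/9873"))
    print(is_crawlable(SAMPLE_ROBOTS_TXT, base + "/en-US/firefox/"))
```

To test a live host instead, RobotFileParser also offers set_url() and read(), which fetch the file over the network.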

> Normally .htaccess changes need aravind to run
> a script to update them, right?

That, I don't know.

Nonetheless, I do think I need to remove that meta tag, so it doesn't override the definitions in robots.txt. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This simple patch removes the robots meta tag, as the "allow: all" part (along with all necessary restrictions) is now handled by the robots.txt file.

Note that directives like "noarchive" can't be defined in robots.txt, so they'd need the meta tag again, but since we don't use them at the moment, this patch should be fine for solving this bug.
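To confirm that a page no longer carries a robots meta tag, something like the following Python sketch could be run against the page source. The class and function names are made up for this example; it uses only the standard html.parser module.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # The name is matched case-insensitively ("ROBOTS" vs "robots").
        if (attr_map.get("name") or "").lower() != "robots":
            return
        content = attr_map.get("content") or ""
        self.directives.extend(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_meta_directives(html):
    """Return the list of robots meta directives found in html."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.directives


if __name__ == "__main__":
    print(robots_meta_directives('<meta name="ROBOTS" content="ALL"/>'))
```

An empty result means the page leaves crawler policy entirely to robots.txt. A result containing a directive such as "noarchive" would indicate that a meta tag is in use again.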
Attachment #351003 - Flags: review?(morgamic)
Attachment #351003 - Flags: review?(morgamic) → review+
Comment on attachment 351003 [details] [diff] [review]
Removing robots meta tag

r=morgamic
r20597, thanks.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Keywords: push-needed
Product: addons.mozilla.org → addons.mozilla.org Graveyard