Closed
Bug 1144956
Opened 9 years ago
Closed 9 years ago
Fix Google search result for 'bugzil.la' to list primary URL 'bugzilla.mozilla.org'
Categories
(bugzilla.mozilla.org :: Infrastructure, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: Atoll, Assigned: reed)
Details
Attachments
(1 file)
48.91 KB,
image/png
Right now, searching Google for "bugzil.la" returns the BMO home page and Google's discovered sitemap under the domain "bugzil.la", which is incorrect. Google should be informed somehow (meta tags, webmaster tools, or whatever) that the canonical URL for that result page is 'https://bugzilla.mozilla.org/'.

Unsure where to file, so starting out in BMO :: General, but assigning directly to :cmore as he requested.
Comment 1•9 years ago
If the network monitor is not broken, bugzil.la responds with status code 200 and Location: header field.
* The response status should be 301 rather than 200.
* Is it expected that Location: works even with 200? (Maybe Core/Networking bug)
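The distinction raised here can be sketched in a few lines of Python. This is only an illustration: `classify_response` is a hypothetical helper, not part of any tool mentioned in this bug, and it assumes the usual HTTP reading in which a Location header names a redirect target only on a 3xx status.

```python
# Status codes that clients treat as redirects when a Location header is present.
REDIRECT_CODES = {301, 302, 303, 307, 308}
PERMANENT_CODES = {301, 308}


def classify_response(status, headers):
    """Hypothetical helper: describe how a status/Location pair is usually read."""
    has_location = any(name.lower() == "location" for name in headers)
    if status in REDIRECT_CODES and has_location:
        if status in PERMANENT_CODES:
            return "permanent redirect"
        return "temporary redirect"
    if has_location:
        # A Location header alone does not make the response a redirect.
        return "not a redirect (Location header on a non-3xx status)"
    return "not a redirect"


if __name__ == "__main__":
    target = {"Location": "https://bugzilla.mozilla.org/"}
    print(classify_response(200, target))
    print(classify_response(301, target))
```

Under that reading, a crawler that receives 200 plus Location has no reason to treat bugzil.la as an alias of the target, which is consistent with the request for a 301.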
Comment 2•9 years ago
(In reply to Masatoshi Kimura [:emk] from comment #1)
> * Is it expected that Location: works even with 200? (Maybe Core/Networking
> bug)

Hm, at least IE 8 and Chrome also respond with 200 with Location:.
Comment 3•9 years ago
Note: bugzil.la isn't owned by Mozilla; :reed owns and maintains that domain.
Component: General → Infrastructure
QA Contact: mcote
Comment 4•9 years ago
Do you want Google to just completely ignore bugzil.la URLs complete for bugzilla.mozilla.org pages in the results page?
Comment 5•9 years ago
I need to verify the website with Google's webmaster tools. What would be easier: adding a meta page to the base html template in bugzilla or hosting a one-off single html page at the root of the domain?
Comment 6•9 years ago
Do you have an example that Google returns a bugzil.la result with a search other than "bugzil.la" ?
Comment 7•9 years ago
(In reply to Chris More [:cmore] from comment #4)
> Do you want Google to just completely ignore bugzil.la URLs complete for
> bugzilla.mozilla.org pages in the results page?

bugzilla.mozilla.org doesn't include canonical URLs on its pages (<link rel="canonical">):

https://support.google.com/webmasters/answer/139066?hl=en

Which means that Google thinks that https://bugzil.la/ is a distinct site from https://bugzilla.mozilla.org/.

If we add canonical URLs, then that solves the problem without any changes on the bugzil.la side, since Google will detect that the bugzil.la result is a duplicate for another canonical URL and purge it.

(In reply to Chris More [:cmore] from comment #5)
> I need to verify the website with Google's webmaster tools. What would be
> easier: adding a meta page to the base html template in bugzilla or hosting
> a one-off single html page at the root of the domain?

Both of these options would require cooperation from :reed, so I can't offer any guidance there. But if we add canonical tags, this becomes somewhat unnecessary.

(In reply to Chris More [:cmore] from comment #6)
> Do you have an example that Google returns a bugzil.la result with a search
> other than "bugzil.la" ?

Nope. I did notice that bugzil.la has a robots.txt that bans indexing, so Google *knows* about a lot of bugzil.la bug ID URLs, but doesn't index them.
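For anyone who wants to check whether a page declares a canonical URL, a minimal sketch using only the Python standard library follows. `CanonicalFinder`, `find_canonical`, and `SAMPLE_PAGE` are made-up names for illustration; the sample markup is an assumption about what such a tag could look like, not what BMO actually serves.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<link rel="canonical" href="https://bugzilla.mozilla.org/">
</head><body></body></html>"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

A page without the tag yields `None`, which is the situation described above for bugzilla.mozilla.org at the time of this comment.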
Comment 8•9 years ago
(In reply to Richard Soderberg [:atoll] from comment #7)
> (In reply to Chris More [:cmore] from comment #4)
> > Do you want Google to just completely ignore bugzil.la URLs complete for
> > bugzilla.mozilla.org pages in the results page?
>
> bugzilla.mozilla.org doesn't include canonical URLs on its pages (<link
> rel="canonical">):
>
> https://support.google.com/webmasters/answer/139066?hl=en
>
> Which means that Google thinks that https://bugzil.la/ is a distinct site
> from https://bugzilla.mozilla.org/.
>
> If we add canonical URLs, then that solves the problem without any changes
> on the bugzil.la side, since Google will detect that the bugzil.la result is
> a duplicate for another canonical URL and purge it.

I think canonical URLs on bugzilla.mozilla.org has no effect to make it canonical itself. Otherwise any spam site can claim "I'm canonical." The canonical URLs must be added on bugzil.la.
Comment 9•9 years ago
By the way, did you read comment #1?
> * The response status should be 301 rather than 200.
Reporter
Comment 10•9 years ago
(In reply to Masatoshi Kimura [:emk] from comment #8)
> I think canonical URLs on bugzilla.mozilla.org has no effect to make it
> canonical itself. Otherwise any spam site can claim "I'm canonical." The
> canonical URLs must be added on bugzil.la.

When you publish the canonical meta tag at site A and site B, with the tag pointing to site A, Google removes site B from the results and stops penalizing you for duplication of content. So any spam site can claim "I'm canonical", and they will promptly be delisted from the search index, because they're site B delisting themselves in favor of the canonical site A. This would be an extremely ineffective form of spam, but, yes, it's technically possible today for *any* site to do that.

Practically, this would result in all content served through BMO being treated as canonically from 'bugzilla.mozilla.org', regardless of shorteners or whatever.

(In reply to Masatoshi Kimura [:emk] from comment #9)
> By the way, did you read comment #1?
> > * The response status should be 301 rather than 200.

I did. I'm not authorized to alter bugzil.la to fix this, nor do I have history to confirm that this is accidental and can be corrected easily. I can only request that we fix this *somehow*, propose solutions, and let the BMO admins and the bugzil.la admin decide what to do.
Comment 11•9 years ago
from reading https://support.google.com/webmasters/answer/139066?hl=en it seems that changing bugzil.la to return 301 redirects instead of 200 should be sufficient. reed, what do you think about making this change?
Flags: needinfo?(reed)
Assignee
Comment 12•9 years ago
(In reply to Byron Jones ‹:glob› from comment #11)
> from reading https://support.google.com/webmasters/answer/139066?hl=en it
> seems that changing bugzil.la to return 301 redirects instead of 200 should
> be sufficient.
>
> reed, what do you think about making this change?

No idea why it's sending 200 instead of 301. Looks set up correctly to me. I should just move this from Apache to nginx. Will do that either today or Sunday to see if that fixes it.
Flags: needinfo?(reed)
Assignee
Comment 13•9 years ago
Moved bugzil.la from Apache to nginx, which let me fix a few other things as well.

$ curl -I https://bugzil.la
HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Mon, 23 Mar 2015 08:47:50 GMT
Content-Type: text/html
Content-Length: 178
Connection: keep-alive
Location: https://bugzilla.mozilla.org/
Strict-Transport-Security: max-age=63113852; includeSubDomains; preload

Can folks confirm this solves the problem?
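One way to confirm the fix without reading the headers by eye is to parse the `curl -I` output and assert on it. The sketch below is an assumption-laden illustration: `parse_head_response` and `redirects_permanently_to` are hypothetical helpers, and `CURL_OUTPUT` is simply the response quoted in comment 13, pasted in as a string rather than fetched live.

```python
# Response headers as quoted in comment 13 (not fetched live).
CURL_OUTPUT = """HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Mon, 23 Mar 2015 08:47:50 GMT
Content-Type: text/html
Content-Length: 178
Connection: keep-alive
Location: https://bugzilla.mozilla.org/
Strict-Transport-Security: max-age=63113852; includeSubDomains; preload
"""


def parse_head_response(raw):
    """Split raw `curl -I` output into (status code, lowercase-keyed headers)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    status = int(lines[0].split()[1])  # "HTTP/1.1 301 Moved Permanently" -> 301
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so URLs and dates in values survive.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def redirects_permanently_to(raw, expected_url):
    """True if the response is a 301 whose Location is exactly expected_url."""
    status, headers = parse_head_response(raw)
    return status == 301 and headers.get("location") == expected_url


if __name__ == "__main__":
    print(redirects_permanently_to(CURL_OUTPUT, "https://bugzilla.mozilla.org/"))
```

The same check could be pointed at fresh `curl -I` output later to verify that the redirect has not regressed.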
Assignee: chrismore.bugzilla → reed
Status: NEW → RESOLVED
Closed: 9 years ago
OS: Mac OS X → All
Hardware: x86 → All
Resolution: --- → FIXED
Comment 14•9 years ago
(In reply to Reed Loden [:reed] (use needinfo?) from comment #13)
> Moved bugzil.la from Apache to nginx, which let me fix a few other things as
> well.
>
> $ curl -I https://bugzil.la
> HTTP/1.1 301 Moved Permanently
> Server: nginx
> Date: Mon, 23 Mar 2015 08:47:50 GMT
> Content-Type: text/html
> Content-Length: 178
> Connection: keep-alive
> Location: https://bugzilla.mozilla.org/
> Strict-Transport-Security: max-age=63113852; includeSubDomains; preload
>
> Can folks confirm this solves the problem?

It will probably take a few days for the search index to update and eliminate 301 redirects from the results.
Reporter
Comment 15•9 years ago
Verified "bugzilla.mozilla.org" is now the top search result for "bugzil.la", with the second result being a "denied by robots.txt" link that can't be purged while robots.txt is in place. Thank you!
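The leftover "denied by robots.txt" entry makes sense if the crawler is not allowed to fetch the short URLs at all, since it then never sees the 301. A rough sketch of that check with Python's standard `urllib.robotparser` follows. The contents of `ASSUMED_ROBOTS_TXT` are an assumption (this bug never quotes the actual bugzil.la robots.txt), and `is_crawl_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: a blanket disallow rule; the real bugzil.la file is not shown in this bug.
ASSUMED_ROBOTS_TXT = """User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A short URL for this bug, used purely as an example path.
    print(is_crawl_allowed(ASSUMED_ROBOTS_TXT, "https://bugzil.la/1144956"))
```

If the real file looks anything like the assumed one, loosening it would presumably let the crawler follow the redirects and drop the remaining entry, but that is a guess rather than something verified in this bug.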