Clean out old and unused feeds to prevent errors

Status: NEW · Unassigned
Product: developer.mozilla.org :: Code Cleanup
Reported: 3 years ago · Last modified: 3 years ago
Reporter: groovecoder

The only feed items we display on the site are from Hacks. We can remove all the other bundles and feeds, especially now that many of them are causing errors:

Old and unused:
Unable to fetch https://developer.mozilla.org/@api/deki/site/feed


Removed/missing blog.mozilla.org feeds (failing since the move of blog.mozilla.org to WPEngine):
Unable to fetch http://planet.firefox.com/mobile/rss20.xml
Unable to fetch https://blog.mozilla.org/addons/feed/
Unable to fetch http://blog.mozilla.org/about_mozilla/feed/atom/
Hmm ... All of these feeds work except for the MDN one, which I deleted from the stage and prod servers.

:cyliang - can you see any reason why the stage & prod cron hosts wouldn't be able to reach those planet.firefox.com and blog.mozilla.org URLs? Maybe a netflow needs to be opened or something? I'm still fine with removing them; I'd just like to know why they're failing.
Flags: needinfo?(cliang)

Comment 2 · 3 years ago
I’m not sure why the feed links are failing. A quick test shows that they should, for the most part, work when invoked from the developer webheads. [1]

Some questions:

  * How are the feed links being parsed?  Is there some regular job that grabs new feeds?  (I’m trying to see if I should be testing feed access from the celery nodes or some other part of the MDN infra.)
   
  * Do you know what user agent string you are emitting?  (I can try running a wget that spoofs that agent.)  In https://bugzilla.mozilla.org/show_bug.cgi?id=1160609#c16, we discovered that wp-engine apparently filters out UA strings it considers “harmful”.  >_<  planet.mozilla.org updated their code to emit a different user agent string.


[1] When I attempt to grab the feeds via wget from the developer webheads, I succeed for the http feeds but not the https ones.  That, I think, has to do with a bug in the version of wget running on those servers.  (I can successfully grab the HTTPS feed from developeradm, which is running a newer version of wget.)
Flags: needinfo?(cliang)
* There's an 'update_feeds' cron job, which I think runs on the admin node?

* Our feedparser package uses urllib2, which uses "Python-urllib/2.6" by default [1] as the user agent.
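Since WP Engine reportedly filters "harmful" user agent strings (comment 2 above), the default "Python-urllib" UA is a plausible culprit. For illustration, here is how a custom User-Agent can be attached when fetching a feed by hand. This is a Python 3 sketch, not kuma's actual feeder code; the `FEED_UA` value and `build_feed_request` helper are made up for the example. (feedparser itself also accepts an `agent` argument to `feedparser.parse()` for the same purpose.)

```python
import urllib.request

# Assumed replacement UA: WP Engine reportedly filters the default
# "Python-urllib/x.y" string, so present something browser-like instead.
FEED_UA = "Mozilla/5.0 (compatible; kuma-feeder)"


def build_feed_request(url, agent=FEED_UA):
    """Build a urllib Request that sends a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": agent})


req = build_feed_request("https://blog.mozilla.org/addons/feed/")
print(req.get_header("User-agent"))  # -> Mozilla/5.0 (compatible; kuma-feeder)
```

Passing the resulting request to `urllib.request.urlopen()` then fetches the feed with the spoofed UA, which is essentially what planet.mozilla.org did to work around the same filtering.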

I updated all the stage feeds [2] to https if possible. planet.firefox.com/mobile/rss20.xml does not respond on https. I deleted all the twitter feeds, as they're not available anymore.
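Migrating the stored feed URLs from http to https is a mechanical rewrite of the scheme; a trivial sketch of the rule applied (the `to_https` helper is mine, not part of kuma):

```python
def to_https(url):
    """Rewrite an http:// feed URL to https://; leave other URLs alone."""
    prefix = "http://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url


print(to_https("http://blog.mozilla.org/about_mozilla/feed/atom/"))
# -> https://blog.mozilla.org/about_mozilla/feed/atom/
```

Feeds that don't answer on https (like planet.firefox.com/mobile/rss20.xml above) have to be special-cased or dropped, which is what happened here.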

I'll keep an eye on the update_feeds emails to see if these changes fix the errors.


[1] https://docs.python.org/2/library/urllib2.html#urllib2.Request
[2] https://developer.allizom.org/admin/feeder/feed/
The errors were down to only the http planet.firefox.com feed, so I removed that from stage too.
Since I removed the planet feed, we're not getting so many errors. Still getting this one intermittently:

http_app_kuma: 2015-05-14 08:30:15,858 kuma.feeder:ERROR Unable to fetch https://forums.addons.mozilla.org/feed.php Exception: <urlopen error The read operation timed out>: /data/developer/www/developer.mozilla.org/kuma/kuma/feeder/management/commands/update_feeds.py:176