Hacks blog feed on MDN homepage does not currently update (stuck on 12/17)

RESOLVED FIXED

Status

P1
normal
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: havi, Unassigned)

Tracking

Details

(Whiteboard: [specification][type:bug], URL)

(Reporter)

Description

5 years ago
What did you do?
================
1. Went to the MDN homepage: https://developer.mozilla.org/en-US/
2. Looked below the fold at the Hacks blog feed

What happened?
==============
No current Hacks blog posts appear; the feed is stuck on 12/17.

What should have happened?
==========================
The feed should be auto-updating, but it appears to have broken.

Is there anything else we should know?
======================================
hmmm, you got me
Please CC me on this bug :)

Updated

5 years ago
Severity: normal → major
Priority: -- → P1

Comment 1

5 years ago
Based on discussion on the mdn-drivers list, change the feed to show the most recent Hacks posts (as opposed to only the ones marked as "featured").
Severity: major → normal
Duplicate of this bug: 962450

Comment 3

5 years ago
Not sure if this is a cause, but I tried running the feed update in my dev VM and got this error:

vagrant@developer-local:~/src$ /usr/bin/python ./manage.py update_feeds
General Error starting loop: 'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)
'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)
Traceback (most recent call last):
  File "/home/vagrant/src/apps/feeder/management/commands/update_feeds.py", line 63, in update_feed
    stream = self.fetch_feed(feed)
  File "/home/vagrant/src/apps/feeder/management/commands/update_feeds.py", line 120, in fetch_feed
    modified=feed.last_modified.timetuple())
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 2623, in parse
    feedparser.feed(data)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.7/sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "/usr/lib/python2.7/sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 476, in unknown_endtag
    method()
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 1318, in _end_content
    value = self.popContent('content')
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 700, in popContent
    value = self.pop(tag)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 641, in pop
    output = _resolveRelativeURIs(output, self.baseuri, self.encoding)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 1594, in _resolveRelativeURIs
    p.feed(htmlSource)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 1441, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/sgmllib.py", line 138, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.7/sgmllib.py", line 296, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "/usr/lib/python2.7/sgmllib.py", line 338, in finish_starttag
    self.unknown_starttag(tag, attrs)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 1588, in unknown_starttag
    attrs = [(key, ((tag, key) in self.relative_uris) and self.resolveURI(value) or value) for key, value in attrs]
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 1584, in resolveURI
    return _urljoin(self.baseuri, uri)
  File "/home/vagrant/src/vendor/packages/feedparser/feedparser.py", line 286, in _urljoin
    return urlparse.urljoin(base, uri)
  File "/usr/lib/python2.7/urlparse.py", line 270, in urljoin
    params, query, fragment))
  File "/usr/lib/python2.7/urlparse.py", line 223, in urlunparse
    return urlunsplit((scheme, netloc, url, query, fragment))
  File "/usr/lib/python2.7/urlparse.py", line 240, in urlunsplit
    url = url + '#' + fragment
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)
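The failure above can be reproduced in isolation (a minimal sketch using a hypothetical sample string; the actual feed content is not shown in this bug). Byte 0xc3 is the lead byte of a two-byte UTF-8 sequence, so any UTF-8-encoded accented character will fail an implicit ASCII decode like the one Python 2's urlparse performs when byte strings and unicode strings get mixed:

```python
# Minimal sketch of the decode failure (hypothetical sample text).
# "é" encodes to b"\xc3\xa9" in UTF-8, and 0xc3 is outside ASCII's
# 0-127 range, hence the error in the traceback above.
sample = "Qué onda".encode("utf-8")

try:
    sample.decode("ascii")
except UnicodeDecodeError as exc:
    print(exc.reason)          # "ordinal not in range(128)"

# Decoding with the encoding the feed actually declares succeeds:
print(sample.decode("utf-8"))  # "Qué onda"
```

In Python 2, the same decode happens implicitly (with the default 'ascii' codec) whenever a byte string containing such a character is concatenated with a unicode string, which matches the `url = url + '#' + fragment` frame at the bottom of the traceback.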

Comment 4

5 years ago
No idea if that's the thing, Les. But I think one major thing is that it only gets the Featured articles when it's better to get all of them. Then if there are formatting errors, I guess that could be something else that needs to be looked into.

Comment 5

5 years ago
(In reply to Robert Nyman from comment #4)
> No idea if that's the thing, Les. But I think one major thing is that it
> only gets the Featured articles when it's better to get all of them. Then if
> there are formatting errors, I guess that could be something else that needs
> to be looked into.

The reason I bring up the error is that I think the featured articles thing is a red herring: There was a featured article on 1/22 that hasn't shown up on the MDN front page. 

I haven't had a chance to dig into it yet, but I suspect there are feed parsing issues. Posted that dev-side error to the bug as more info to whomever might dig into this (even if it ends up being me)

Comment 6

5 years ago
(In reply to Les Orchard [:lorchard] from comment #5)

> The reason I bring up the error is that I think the featured articles thing
> is a red herring: There was a featured article on 1/22 that hasn't shown up
> on the MDN front page. 
> 
> I haven't had a chance to dig into it yet, but I suspect there are feed
> parsing issues. Posted that dev-side error to the bug as more info to
> whomever might dig into this (even if it ends up being me)

Ah, I see! Thanks for the info then, let me know if there's anything I can change on the Hacks side to help things.
(Reporter)

Comment 7

5 years ago
Les, correct! 'Featured' is a red herring. The feed is not successfully pulling/updating featured articles - there's a 'Featured' post from 12/18 that also doesn't appear. 

Thanks!
Comment 8

5 years ago
I think Les is on the right track with comment 3. The very next post to be published [1] contains many non-English characters, and the entire text body is published in the RSS feed. This might explain why the feed update logic is running into encoding issues.

[1] https://hacks.mozilla.org/2013/12/write-elsewhere-run-on-firefox/
Comment 9

5 years ago
I went ahead and updated the post to use "equivalent" ASCII characters, so the homepage may be up to date again after the feed is pulled. Even if that works, we should open a follow-up bug to fix this properly in the long term.

Havi, I will keep a copy of the original post (umlauts and all) if you ever need it.
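The "equivalent ASCII characters" stopgap described here can be sketched with the standard library (`asciify` is a hypothetical helper name, not code from the MDN codebase): NFKD normalization decomposes an accented letter into its base letter plus a combining mark, and an ASCII encode with `errors="ignore"` then drops the mark.

```python
import unicodedata

def asciify(text):
    """Replace accented characters with their closest ASCII equivalents.

    NFKD decomposition splits e.g. "ü" into "u" plus a combining
    diaeresis; encoding to ASCII with errors="ignore" then drops the
    combining mark (and any character with no ASCII decomposition).
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(asciify("Über café"))  # "Uber cafe"
```

Note this is lossy by design (characters with no ASCII decomposition vanish entirely), which is why a proper long-term fix in the feed updater is still needed.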
Comment 10

5 years ago
Sure enough, the homepage is back up to date! The long-term fix will happen in bug 966146.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Comment 11

5 years ago
Thanks John!