Closed Bug 1338588 Opened 7 years ago Closed 7 years ago

Planet is incorrectly reporting errors on certain feeds.

Categories

(Websites :: planet.mozilla.org, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mhoye, Unassigned)

References

Details

Several otherwise accessible and correct feeds are being incorrectly handled by Planet, reporting 403 errors (that are not reflected on the server side) and "internal server errors", whose meaning is unclear.

jdm, ehsan and several other people are affected.
Depends on: 1334949
Preliminary investigation suggests that an older version of Python - 2.6.6 - doesn't like a new version of TLS. Via Ericz:

INFO:planet.runner:Socket timeout set to 20 seconds
INFO:planet.runner:Fetching http://www.joshmatthews.net/blog/feed/ via 0
INFO:planet.runner:Fetching https://ehsanakhgari.org/blog/tag/planet/feed via 1
ERROR:planet.runner:HTTP Error: [Errno 1] _ssl.c:492: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error in thread-1
INFO:planet.runner:Finished threaded part of processing.
ERROR:planet.runner:Error 500 while updating feed https://ehsanakhgari.org/blog/tag/planet/feed
INFO:planet.runner:no activity in 90 days
ERROR:planet.runner:Error 403 while updating feed http://www.joshmatthews.net/blog/feed/
INFO:planet.runner:no activity in 90 days

I understand that the box the staging script is running on has Python 2.7.9 on it as well, and propose that we can test my theory by dropping the current automatic updates temporarily, and calling the update script explicitly via Python 2.7.9 and seeing what happens.

red rover, red rover, I call ericz over.
Flags: needinfo?(eziegenhorn)
It actually has Python 2.7.11 available, but it encounters the same errors:

+ /usr/bin/python2.7 ../../../planet-source/trunk/planet.py config.ini
INFO:planet.runner:Socket timeout set to 20 seconds
INFO:planet.runner:Fetching http://www.joshmatthews.net/blog/feed/ via 0
INFO:planet.runner:Fetching https://ehsanakhgari.org/blog/tag/planet/feed via 1
ERROR:planet.runner:HTTP Error: [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:590) in thread-1
INFO:planet.runner:Finished threaded part of processing.
ERROR:planet.runner:Error 500 while updating feed https://ehsanakhgari.org/blog/tag/planet/feed
INFO:planet.runner:no activity in 90 days
ERROR:planet.runner:Error 403 while updating feed http://www.joshmatthews.net/blog/feed/
INFO:planet.runner:no activity in 90 days
INFO:planet.runner:Loading cached data
Flags: needinfo?(eziegenhorn)
This is apparently caused by servers whose certs require SNI. See <http://stackoverflow.com/questions/34522211/error-urllib2-python-ssl-tlsv1-alert-internal-error-ssl-c590>, although it seems the solution there is to ignore the exception...
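For anyone checking whether a given interpreter can send SNI at all, here is a minimal sketch. It uses only the standard `ssl` module; the helper name `tls_capabilities` is hypothetical and not part of Planet. `ssl.HAS_SNI` only exists from Python 2.7.9 and 3.2 onward, so the `getattr` fallback treats a missing attribute as no SNI support.

```python
import ssl
import sys


def tls_capabilities():
    """Report whether this interpreter's ssl module can send SNI.

    Hypothetical helper for illustration; not part of Planet.
    """
    return {
        "python": "%d.%d.%d" % tuple(sys.version_info[:3]),
        # OPENSSL_VERSION is missing on very old interpreters.
        "openssl": getattr(ssl, "OPENSSL_VERSION", "unknown"),
        # HAS_SNI is missing before 2.7.9 / 3.2, so default to False.
        "has_sni": bool(getattr(ssl, "HAS_SNI", False)),
    }


if __name__ == "__main__":
    for name, value in sorted(tls_capabilities().items()):
        print("%s: %s" % (name, value))
```

Even when `has_sni` is true, the HTTP client still has to pass the server hostname when it wraps the socket, so a newer interpreter alone may not be enough.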
Hah! You beat me back to the bug.

Yes, the problem Planet has is that the httplib2 that Planet relies on explicitly wontfixed SNI per here:

https://github.com/jcgregorio/httplib2/issues/233

Moving that codebase to requests looks like the right thing, but I expect that will be a nontrivial operation.

Working on it, sorry...
How about invoking wget or some such from python directly?  That's another option.
Whatever I do is going to involve rewriting the spidering part of the code somehow, but I'm not opposed to using curl or wget if that ends up being simpler.
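A rough sketch of the shell-out option, assuming `curl` is installed and on the PATH of the box running the update script. The function names are hypothetical, and the 20-second default only mirrors the socket timeout shown in the logs above.

```python
import subprocess


def build_curl_command(url, timeout=20):
    """Build the argument list for fetching one feed with curl."""
    return [
        "curl",
        "--silent",      # no progress meter
        "--show-error",  # but still print errors to stderr
        "--fail",        # non-zero exit status on HTTP 4xx/5xx
        "--location",    # follow redirects
        "--max-time", str(timeout),
        url,
    ]


def fetch_feed_with_curl(url, timeout=20):
    """Return the feed body as bytes.

    Raises subprocess.CalledProcessError if curl exits non-zero.
    """
    return subprocess.check_output(build_curl_command(url, timeout))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with feed URLs that contain `&` or `?`.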
hmm… which version of Planet do you use?

This one?
https://people.gnome.org/~jdub/bzr/planet/2.0/


or Venus
https://github.com/rubys/venus
If it's Venus, then having looked at the code, I'm happy to fork it and fix it.
https://github.com/rubys/venus/search?utf8=%E2%9C%93&q=httplib2

httplib2 could be easily replaced by requests.
I'm checking with miketaylr if we have the same issues on planet.webcompat.com
Flags: needinfo?(mhoye)
I'm reluctant to replace the original Planet with Venus, but httplib2 is vendored into the Planet code and its usage looks well-contained. I should be able to replace httplib2 with requests in short order.
Flags: needinfo?(mhoye)
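As an illustration of what a contained swap to requests might look like, here is a minimal sketch. It is not the Planet or Venus code: `fetch_feed` and its `get` parameter are hypothetical. `requests` is a third-party package, so it is imported lazily, which lets the function be exercised with a stand-in.

```python
def fetch_feed(url, timeout=20, get=None):
    """Fetch a feed and return (status_code, body_bytes).

    `get` is any callable with the same shape as requests.get; it is a
    hypothetical hook for testing, defaulting to requests.get itself.
    """
    if get is None:
        import requests  # third-party; assumed to be installed
        get = requests.get
    response = get(url, timeout=timeout)
    return response.status_code, response.content


if __name__ == "__main__":
    # Feed URL taken from the logs above.
    status, body = fetch_feed("https://ehsanakhgari.org/blog/tag/planet/feed")
    print(status, len(body))
```

Keeping the HTTP call behind one small function is what would make the swap well-contained: callers only see a status code and the raw bytes.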
(In reply to Mike Hoye [:mhoye] from comment #4)
> Yes, the problem Planet has is that the httplib2 that Planet relies on
> explicitly wontfixed SNI per here:

and

(In reply to Mike Hoye [:mhoye] from comment #10)
> I'm reluctant to replace the original Planet with Venus, but httplib2 is
> vendored into the Planet code and its usage looks well-contained. I should
> be able to replace httplib2 with requests in short order.


Hmm? Are you sure you are using Planet 2.0 and not Venus?
httplib2 is not in Planet 2.0, to the best of my knowledge.
This was my mistake - I'd thought Planet had drifted from the Venus codebase, but apparently not enough to warrant it.

Thanks to jbuck's sleuthing, we've figured out that the problem is with the version of urllib2 that runs on RHEL6. Seems to work right on CentOS7, so I'm going to loop back to ops and see if we can make that happen.
Mike, any progress on this? It's been a month since I noticed my blog wasn't being syndicated. I'd appreciate it if you could give an estimate of how long this will take to fix. Thanks!
Flags: needinfo?(mhoye)
Hoping to have it fixed this week, but will keep you updated.
Flags: needinfo?(mhoye)
We've identified a fix per bug 1342201 and with luck will roll it out during that team's next sprint, next week.
Mike: In bug 1346898, you mentioned you hoped to resolve it at the end of last week. Could you give an overview of what's left to close this bug?
Flags: needinfo?(mhoye)
Well, that took longer than I wanted it to. Details are here:

http://exple.tive.org/blarg/2017/04/07/planet-secure-for-now/
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(mhoye)
Resolution: --- → FIXED
Thanks for fixing this. It would be nice to have more timely communication around issues like this next time.