Closed
Bug 809602
Opened 13 years ago
Closed 13 years ago
Give ftpscraper.py sane timeout values
Categories
(Socorro :: Backend, task, P1)
Tracking
(Not tracked)
Status: RESOLVED FIXED
Target Milestone: 26
People
(Reporter: selenamarie, Assigned: selenamarie)
References
Details
(Whiteboard: [qa-])
ftpscraper.py is hanging, likely when connecting to the FTP server.
We talked on IRC about adding a socket timeout, or passing a timeout parameter to urllib2.urlopen(). The caveat with urlopen() is that, per the docs, its timeout applies to blocking operations such as the connection attempt, so it may only cover the initial connection rather than subsequent fetches of data. So the socket timeout is probably the right thing to do.
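A minimal sketch of the two options discussed above, written for Python 3 (where urllib2.urlopen() became urllib.request.urlopen()). The function names and the 30-second value are illustrative assumptions and are not taken from ftpscraper.py.

```python
import socket
import urllib.request

# Hypothetical value, for illustration only.
TIMEOUT_SECONDS = 30.0


def apply_global_socket_timeout(seconds=TIMEOUT_SECONDS):
    """Option 1: set a process-wide default for sockets created afterwards.

    Returns the previous default so a caller can restore it.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    return previous


def fetch_with_call_timeout(url, seconds=TIMEOUT_SECONDS):
    """Option 2: pass the timeout to a single urlopen() call."""
    with urllib.request.urlopen(url, timeout=seconds) as response:
        return response.read()
```

The global default is inherited by every socket created after the call, which is why it is attractive for a script that opens connections in several places; the trade-off is that it also affects unrelated code in the same process.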
Updated (Assignee) • 13 years ago
Priority: -- → P1
Updated (Assignee) • 13 years ago
Assignee: nobody → sdeckelmann
Comment 1 (Assignee) • 13 years ago
ftpscraper is now hanging unpleasantly soon after being invoked. Doing some immediate work to add debugging output to pinpoint exactly where this is failing.
Comment 2 (Assignee) • 13 years ago
We set up a temporary scraper job here:
*/15 * * * * cd socorro ; bash selena_temp_scraper_job.sh
There's a 120 second timeout on the connection to the FTP server currently. I'll monitor the connections and see if that's too short a time period.
Typical successful scraper runs take a full 2 minutes, including the connection to the database, so this seems like it should be more than sufficient.
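To illustrate why a socket-level timeout turns a hang into an error, here is a small self-contained sketch using only a local test server. It is not the ftpscraper code; the helper names and the short timeout are assumptions chosen so the demonstration finishes quickly.

```python
import socket
import threading


def start_silent_server():
    """Listen on a free localhost port, accept one connection, never reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    accepted = []

    def accept_and_hold():
        try:
            conn, _ = server.accept()
            accepted.append(conn)  # keep a reference so the connection stays open
        except OSError:
            pass  # server socket was closed before a client connected

    threading.Thread(target=accept_and_hold, daemon=True).start()
    return server, accepted


def read_timed_out(port, seconds):
    """Return True if reading from the silent server timed out."""
    with socket.create_connection(("127.0.0.1", port), timeout=seconds) as client:
        try:
            client.recv(1024)  # would block indefinitely without a timeout
        except socket.timeout:
            return True
    return False
```

With a timeout set, the blocked read raises an exception after the given interval, so a cron-driven job can fail and be retried on the next run instead of staying hung.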
Branch with running code is here: https://github.com/selenamarie/socorro/compare/mozilla:master...selenamarie:bug809602-ftpscraper-logging
Also looking into adding more debugging, a la: http://stackoverflow.com/questions/132058/showing-the-stack-trace-from-a-running-python-application
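One way to get a stack trace from a running Python process is sketched below. It assumes Python 3 on a Unix-like system and uses the standard library's faulthandler module, which did not exist in the Python 2 standard library this bug was written against; the function name is hypothetical.

```python
import faulthandler
import signal
import sys


def install_stack_dump_handler(signum=None, stream=None):
    """Dump the tracebacks of all threads to `stream` when `signum` arrives.

    Defaults to SIGUSR1 and sys.stderr. The stream must have a real
    file descriptor, because faulthandler writes to it directly.
    """
    if signum is None:
        signum = signal.SIGUSR1  # not available on Windows
    if stream is None:
        stream = sys.stderr
    faulthandler.register(signum, file=stream, all_threads=True)
    return signum
```

Once this is installed at startup, sending the signal to the hung process (for example with `kill -USR1 <pid>`) writes the current call stacks to the chosen stream without stopping the process.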
Comment 3 (Assignee) • 13 years ago
I had a look at the logs today and the production scraper job is hung, but selena_temp_scraper_job.sh runs without a problem. :/
Watched pot apparently does not boil.
The upside is that the temp job succeeded in populating all the nightly builds.
Comment 4 (Assignee) • 13 years ago
I had an idea this morning that perhaps the timing of the original job (5 after the hour) was part of the issue. I've rescheduled my cron to run at the same time.
I'm leaving the production cron job hanging for now (I'm not asking for it to be killed yet, but will eventually).
Other debugging ideas are welcome.
Comment 5 (Assignee) • 13 years ago
Found it! Running it at 5 after the hour *does* produce errors.
Filed https://bugzilla.mozilla.org/show_bug.cgi?id=811063 to ask IT to investigate further if they can.
Cleaned up the patch to include only the 2-minute timeout:
r? https://github.com/mozilla/socorro/pull/931
Comment 6 (Assignee) • 13 years ago
New PR to address comments from peterbe: https://github.com/mozilla/socorro/pull/932
Updated (Assignee) • 13 years ago
Target Milestone: --- → 27
Comment 7 • 13 years ago
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/3b9d58d7364810241b9d5715de01f8fa97753a1d
bug 809602 Adding a socket timeout to ftpscraper: 120 seconds
Added comments and updated the configmanized job
https://github.com/mozilla/socorro/commit/28ccf33cab9596b1c307ee5856a2f689f89e9c5f
Merge pull request #932 from selenamarie/bug809602-ftpscraper-logging
bug 809602 Adding a socket timeout to ftpscraper: 120 seconds
Updated (Assignee) • 13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated (Assignee) • 13 years ago
Whiteboard: [qa-]
Updated • 13 years ago
Target Milestone: 27 → 26