rewrite archivescraper as django command
Categories: Socorro :: General, task, P2
Tracking: Not tracked
People: Reporter: willkg, Assigned: willkg
Attachments: 2 files
archivescraper is a crontabber job. Code is here:
https://github.com/mozilla-services/socorro/blob/master/socorro/cron/jobs/archivescraper.py
This bug covers rewriting archivescraper as a Django command that runs at a scheduled time using the Django cronrun command. Also, rewrite the tests.
Comment 4 • Assignee • 5 years ago
This deployed to stage yesterday. I looked at the records and it's running. It looks like it takes about half as long to run as it did before. I'm not sure why that would be.
Because we're not running it in verbose mode, it's not clear exactly what it's doing. I think I want to adjust the logging a bit.
Also, the subprocesses are printing to stdout, which I don't think is getting picked up by the main process and converted to mozlog format.
Three things to look into:
- adjust logging so we know what trees it's traversing
- log how many builds it found in addition to how many were successfully inserted
- see if we can get the subprocesses to send their output to the main process for mozlog formatting
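One possible approach to the second and third items is to have each worker return its messages along with its results instead of printing them, so that only the main process logs. The following is a minimal sketch under stated assumptions, not the actual Socorro code: `scrape_build_dir`, `scrape_all`, the `(path, filenames)` input shape, the example paths, and the idea that one json file counts as one build are all hypothetical.

```python
import logging

# Logger name taken from the log output in this bug; everything else is hypothetical.
logger = logging.getLogger("crashstats.cron")


def scrape_build_dir(listing):
    """Worker: collect messages and return them instead of printing to stdout.

    `listing` is a hypothetical (path, filenames) pair standing in for a
    directory fetched from the archive.
    """
    path, filenames = listing
    json_files = [name for name in filenames if name.endswith(".json")]
    messages = []
    if not json_files:
        messages.append("worker: could not find json files in: %s" % path)
    # Assumption for illustration: one json file corresponds to one build.
    return {"builds": json_files, "messages": messages}


def scrape_all(listings, map_fn=map):
    """Main process: run the workers, then log their messages in one place."""
    found = 0
    for result in map_fn(scrape_build_dir, listings):
        for message in result["messages"]:
            # Logged by the main process, so the main process's formatter applies.
            logger.info("archivescraper: %s", message)
        found += len(result["builds"])
    logger.info("archivescraper: found %s builds", found)
    return found


# Hypothetical example data, not real archive paths.
example_listings = [
    ("/pub/example/candidates/1.0-candidates/build1/", ["a.json", "a.txt"]),
    ("/pub/example/candidates/1.1-candidates/build1/", []),
]
```

`map_fn` defaults to the builtin `map` so the sketch runs in a single process; the `map` method of a `multiprocessing.Pool` could be passed instead, since the worker only returns plain dicts and strings.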
Comment 5 • Assignee • 5 years ago
Also, I broke the local dev environment when removing the crontabber job. Need to fix that, too.
Comment 7 • Assignee • 5 years ago
willkg merged PR #4915: "bug 1542388: archivescraper fixes" in a8eb3c8.
This fixes all of the issues above. I'll wait for this to deploy to stage and run, and then I'll see how I feel about everything.
Comment 8 • Assignee • 5 years ago
Works perfectly on stage now.
Mozlog-formatted logs, demozlogged:
2019-04-29T19:23:04.682184 INFO crashstats.cron: about to run archivescraper
2019-04-29T19:23:04.764408 INFO crashstats.cron: archivescraper: scrape_candidates working on /pub/firefox/candidates/
2019-04-29T19:23:05.311038 INFO crashstats.cron: archivescraper: skipping anything before Firefox and not esr (63)
2019-04-29T19:23:31.480972 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/52.0.1esr-candidates/build1/
2019-04-29T19:23:31.481236 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/60.0esr-candidates/build3/
2019-04-29T19:23:31.481343 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/60.0esr-candidates/build4/
2019-04-29T19:23:31.481448 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/60.2.2esr-candidates/build1/
2019-04-29T19:23:31.481676 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/60.3.0esr-candidates/build2/
2019-04-29T19:23:31.481804 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/63.0.1-candidates/build3/
2019-04-29T19:23:31.481905 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/65.0b3-candidates/build1/
2019-04-29T19:23:31.481997 INFO crashstats.cron: archivescraper: worker: could not find json files in: /pub/firefox/candidates/67.0b15-candidates/build1/
2019-04-29T19:23:32.680190 INFO crashstats.cron: archivescraper: found 1308 builds; inserted 0 builds
2019-04-29T19:23:32.683002 INFO crashstats.cron: archivescraper: scrape_candidates working on /pub/devedition/candidates/
2019-04-29T19:23:33.517644 INFO crashstats.cron: archivescraper: skipping anything before DevEdition and not esr (63)
2019-04-29T19:23:48.162532 INFO crashstats.cron: archivescraper: found 734 builds; inserted 1 builds
2019-04-29T19:23:48.165235 INFO crashstats.cron: archivescraper: scrape_candidates working on /pub/mobile/candidates/
2019-04-29T19:23:48.912092 INFO crashstats.cron: archivescraper: skipping anything before Fennec and not esr (63)
2019-04-29T19:23:57.337756 INFO crashstats.cron: archivescraper: found 393 builds; inserted 1 builds
2019-04-29T19:23:57.338001 INFO crashstats.cron: archivescraper: Done!
2019-04-29T19:23:57.338125 INFO crashstats.cron: successfully ran archivescraper on 2019-04-29 19:23:04.685174+00:00
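The per-product summary lines in the log above can be totaled with a short script. This is only a sketch for reading these logs: `SUMMARY_RE` and `tally` are hypothetical names, and the pattern is inferred from the three summary lines shown here rather than from the archivescraper code.

```python
import re

# Pattern inferred from the summary lines in the log above (an assumption).
SUMMARY_RE = re.compile(r"archivescraper: found (\d+) builds; inserted (\d+) builds")


def tally(log_lines):
    """Return (total found, total inserted) across all summary lines."""
    found = 0
    inserted = 0
    for line in log_lines:
        match = SUMMARY_RE.search(line)
        if match:
            found += int(match.group(1))
            inserted += int(match.group(2))
    return found, inserted


# The three summary lines copied from the log above.
summary_lines = [
    "2019-04-29T19:23:32.680190 INFO crashstats.cron: archivescraper: found 1308 builds; inserted 0 builds",
    "2019-04-29T19:23:48.162532 INFO crashstats.cron: archivescraper: found 734 builds; inserted 1 builds",
    "2019-04-29T19:23:57.337756 INFO crashstats.cron: archivescraper: found 393 builds; inserted 1 builds",
]

print(tally(summary_lines))  # (2435, 2)
```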
Comment 9 • Assignee • 5 years ago
This just went to prod.