Closed Bug 1506907 Opened 3 years ago Closed 3 years ago

rewrite ftpscraper

(Socorro :: Processor, task, P2)
(Not tracked)
(Reporter: willkg, Assigned: willkg)
(4 files)

ftpscraper does a lot of work to populate the product_versions table. We want to get rid of that table and all the related Postgres bits, but we need to keep scraping because Buildhub isn't quite there yet.

The new scraper should:

1. run as a crontabber app

2. scrape

3. save data to a single table that is backed by a Django model with all the information required to do BetaVersionRule lookups

This bug covers that rewrite.
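
To make step 3 a bit more concrete, here is a rough sketch of what one row of such a table and the matching lookup could look like. The field names, the `ProductVersionRow` class, and the `lookup_version_string` function are all assumptions for illustration, not the actual Socorro schema, and a plain dataclass stands in for the Django model so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProductVersionRow:
    """Hypothetical row shape; the real Django model may use other fields."""

    product_name: str
    release_channel: str
    build_id: str
    release_version: str  # version as reported, e.g. "64.0" (illustrative)
    version_string: str   # version to show instead, e.g. "64.0b4" (illustrative)


def lookup_version_string(
    rows: Iterable[ProductVersionRow],
    product_name: str,
    release_channel: str,
    build_id: str,
    release_version: str,
) -> Optional[str]:
    """Return the version_string for the first matching row, or None."""
    key = (product_name, release_channel, build_id, release_version)
    for row in rows:
        row_key = (
            row.product_name,
            row.release_channel,
            row.build_id,
            row.release_version,
        )
        if row_key == key:
            return row.version_string
    return None


# Made-up sample data, only here to exercise the lookup.
rows = [
    ProductVersionRow("Firefox", "beta", "20181101000000", "64.0", "64.0b4"),
]
```

With this shape, `lookup_version_string(rows, "Firefox", "beta", "20181101000000", "64.0")` finds the sample row, and any other key returns `None`. What the caller does on `None` is left open here.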
I've got parts of this done already. I'll just finish it up.
Assignee: nobody → willkg
Blocks: 1361394
Priority: -- → P2
Commits pushed to master at
bug 1506907: implement Product model

This creates a Django model for products, creates a data migration to move
product data from the old table to the new one, and reimplements things to
use the new table.

We have to leave the old table in place because it's a foreign key of
the old product_versions table and some other tables. All that will go
away in a future commit.
Merge pull request #4712 from willkg/1506907-product

bug 1506907: implement Product model
Commits pushed to master at
bug 1506907: add ProductVersion table

This table has all the information for the processor's BetaVersionRule to
convert release_versions into version_strings.
fix bug 1506907: implement archivescraper

This implements ArchiveScraperCronApp which is a slimmed down minimal
version of ftpscraper. The goal is to populate the crashstats_productversion
table with the information the BetaVersionRule needs to do lookups
to convert release_version to version_string for Firefox and Fennec.
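
For illustration only, the scraping side could be sketched roughly like this: collect the links from a directory listing page and pull version strings out of the directory names. The `-candidates/` naming, the sample HTML, and the names `LinkCollector`, `extract_links`, and `candidate_versions` are assumptions made for this sketch, not the actual ArchiveScraperCronApp code. It parses a string and does no network access.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML directory listing."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Assumed directory naming: "<version>-candidates/".
SUFFIX = "-candidates/"


def candidate_versions(links):
    """Return the version part of every link that ends in "-candidates/"."""
    versions = []
    for link in links:
        if link.endswith(SUFFIX):
            # Keep only the last path segment, then drop the suffix.
            name = link.rstrip("/").rsplit("/", 1)[-1]
            versions.append(name[: -len("-candidates")])
    return versions


# Made-up listing, shaped like a simple directory index page.
SAMPLE_HTML = (
    '<a href="/pub/firefox/candidates/">..</a>'
    '<a href="/pub/firefox/candidates/64.0b4-candidates/">64.0b4-candidates/</a>'
    '<a href="/pub/firefox/candidates/64.0b5-candidates/">64.0b5-candidates/</a>'
)
```

Each version found this way would then be paired with its build information and written to the crashstats_productversion table; that part is not shown.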
Merge pull request #4714 from willkg/1506907-archivescraper

fix bug 1506907: implement archivescraper
Closed: 3 years ago
Resolution: --- → FIXED