Closed Bug 1166058 Opened 9 years ago Closed 3 years ago

Automatically upload PCI database

Categories

(Socorro :: General, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: peterbe, Assigned: willkg)

Attachments

(1 file)

At the moment, the process is as follows:

1. Go to http://www.pcidatabase.com/reports.php?type=tab-delimeted
2. Download that page as a .csv file
3. Go to https://crash-stats.mozilla.com/admin/graphics-devices/
4. Upload the downloaded CSV file

All of this ought to be automated with cron.
Turns out this site https://pci-ids.ucw.cz/ has a much larger CSV file available. 

https://pci-ids.ucw.cz/v2.2/pci.ids has 27,112 lines in it, whereas http://www.pcidatabase.com/reports.php?type=tab-delimeted has only 9,480 lines.

Perhaps we ought to look at what's in one and not in the other. 
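
One way to do that comparison, as a rough sketch: reduce each source to a set of (vendor hex, device hex) pairs and take set differences. The function name `compare_sources` and the sample pairs are made up for illustration; how each file gets parsed into pairs is not covered here.

```python
def compare_sources(source_a, source_b):
    """Compare two iterables of (vendor_hex, device_hex) pairs.

    Returns (only_in_a, only_in_b, in_both) as sets. Hex ids are
    lowercased first so "8086" and "8086".upper() variants match.
    """
    a_keys = {(vendor.lower(), device.lower()) for vendor, device in source_a}
    b_keys = {(vendor.lower(), device.lower()) for vendor, device in source_b}
    return a_keys - b_keys, b_keys - a_keys, a_keys & b_keys


# Illustrative data only, not taken from either real file.
pcidatabase_pairs = [("8086", "0007"), ("0E11", "0001")]
pci_ids_pairs = [("8086", "0007"), ("0e11", "0001"), ("10de", "0020")]

only_a, only_b, both = compare_sources(pcidatabase_pairs, pci_ids_pairs)
print(len(only_a), len(only_b), len(both))
```

If `only_a` comes back empty for the real data, the larger file is a strict superset and the smaller source can be dropped.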

Also, the .cz site versions their download URLs so we'd need to scrape the homepage for the link to the latest pci.ids URL.
See Also: → 1166063
Note! See the See Also bug.
(In reply to Peter Bengtsson [:peterbe] from comment #0)
> 1. Go to http://www.pcidatabase.com/reports.php?type=tab-delimeted

When using pcidatabase.com, you really should use http://www.pcidatabase.com/reports.php?type=csv instead.
Cool! I didn't know about that option. 
However, the advantage of having written the code for the tab-delimited format is that it should now (hopefully) also work for the larger pci-ids.ucw.cz format.
I've talked to one of the maintainers of https://pci-ids.ucw.cz/
The version number in https://pci-ids.ucw.cz/v2.2/pci.ids isn't a version number of the content. It's an indication of the version of the format of the file. 

Apparently they regenerate the file every night but only publish it if it differs from the previous one. For example, today is May 19 and the file was last updated on May 13.
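
Since the file often doesn't change between runs, an automated job could skip the import when the content is identical to last time. A minimal sketch, assuming we keep the digest of the previously processed file somewhere (where it is stored is left open; `content_changed` is a hypothetical helper, not existing Socorro code):

```python
import hashlib


def content_changed(data, last_digest):
    """Return (changed, digest) for the downloaded file contents.

    data is the raw bytes of the downloaded file; last_digest is the
    SHA-256 hex digest saved from the previous run, or None on first run.
    """
    digest = hashlib.sha256(data).hexdigest()
    return digest != last_digest, digest


# First run: nothing stored yet, so the file counts as changed.
changed_first, saved_digest = content_changed(b"example contents", None)
# Second run with identical bytes: no change, import can be skipped.
changed_second, _ = content_changed(b"example contents", saved_digest)
print(changed_first, changed_second)
```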

Note: http://www.pcidatabase.com disappeared sometime in 2017, but https://pci-ids.ucw.cz/ is still around and being updated, so we should use that.

See Also: → 1695576

I forgot about this bug. I think I looked into automating it and it was tough for some reason, but I don't remember what that was. I'll toss it in the queue.

Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P2

willkg merged PR #5704: "bug 1166058: implement update_graphics_pci" in bf376c1.

When that deploys to stage, the cron job will run. I'll check the output and then force it to run again.

Worked great on stage! I'll deploy it tomorrow.

This went out in bug #1695932. It ran and picked up a bunch of new devices:

Done. Created: 9630; Updated: 3699; Skipped: 17372

"Skipped" is what it already had in the table. "Updated" means the vendor or device information changed. The pci.ids file has some subdevices in two places, so that'll account for most of the updates. "Created" is all new things.
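
For anyone reading later, the counting logic can be sketched roughly as below. This is not the code from the PR: `parse_pci_ids` and `tally` are hypothetical names, the table is stood in for by a plain dict, and subsystem (double-tab) lines are ignored, which the real job evidently does not do. It assumes the usual pci.ids layout: `#` comments, vendor lines at column 0, device lines indented with one tab, and a device class section starting with `C ` lines.

```python
def parse_pci_ids(text):
    """Yield (vendor_hex, device_hex, vendor_name, device_name) per device line."""
    vendor_id = vendor_name = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("C "):
            break  # device class section; no more vendors after this
        if line.startswith("\t\t"):
            continue  # subsystem line, ignored in this sketch
        parts = line.strip().split(None, 1)
        name = parts[1] if len(parts) > 1 else ""
        if line.startswith("\t"):
            if vendor_id is not None:
                yield vendor_id, parts[0].lower(), vendor_name, name
        else:
            vendor_id, vendor_name = parts[0].lower(), name


def tally(existing, rows):
    """Apply rows to existing (a dict keyed by (vendor_hex, device_hex)).

    Mutates existing in place and returns created/updated/skipped counts.
    """
    counts = {"created": 0, "updated": 0, "skipped": 0}
    for vendor_id, device_id, vendor_name, device_name in rows:
        key, value = (vendor_id, device_id), (vendor_name, device_name)
        if key not in existing:
            counts["created"] += 1
        elif existing[key] != value:
            counts["updated"] += 1
        else:
            counts["skipped"] += 1
        existing[key] = value
    return counts


# Small illustrative sample in pci.ids layout; the subsystem line is made up.
SAMPLE = (
    "# sample\n"
    "0e11  Compaq Computer Corporation\n"
    "\t0001  PCI to EISA Bridge\n"
    "\t\t0e11 0001  Example subsystem\n"
    "8086  Intel Corporation\n"
    "\t0007  82379AB\n"
    "C 00  Unclassified device\n"
    "\t00  Non-VGA unclassified device\n"
)

table = {("0e11", "0001"): ("Compaq Computer Corporation", "Old name")}
first_run = tally(table, parse_pci_ids(SAMPLE))
second_run = tally(table, parse_pci_ids(SAMPLE))
print(first_run, second_run)
```

Running the same file twice in a row should report everything as skipped on the second pass, which is a handy sanity check for the weekly job.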

This should be good going forward. It'll run once a week.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED