Closed Bug 1668795 Opened 4 years ago Closed 4 years ago

upload compressed versions of dll and exe

Categories

(Socorro :: Symbols, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Attachments

(1 file)

In bug #1668317, we fixed the symbol uploader to run makecab on dll and exe files like it used to.

We need to compress the 2-3 months of dll and exe files that were uploaded without being compressed first.

I think there are a few parts here:

  1. make a list of the dll and exe files that need to be compressed and uploaded
  2. write a script that downloads a file, runs makecab on it, and uploads it (maybe do it in batches?)
  3. run the script on the list of affected files

I'll work on building a list of affected files now.

Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P2

I started writing a script, but the API endpoint I'm hitting is very slow and the script would take roughly 46 hours to complete. Instead, I got Brian to run some SQL in prod.

SELECT bucket_name, key, created_at
FROM upload_fileupload
WHERE key LIKE '%dll' OR key LIKE '%exe';

That picks up the bucket name, key, and created_at date for each file. I wanted to learn two things from the data:

  1. Which buckets are affected. It affects both the private and public buckets. I was mostly just curious: you can't download symbols from the private bucket through Tecken, so it doesn't affect work in this bug.
  2. The range of created_at dates. This tells us whether the theory holds that the change in make_symbols was the only issue, and going forward it'll tell us when we stop getting uncompressed symbol files.
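
A minimal sketch of that kind of summary, assuming the query results were exported as CSV with the same three column names. The bucket names, keys, and dates in the sample are made up for illustration and are not the real data.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def summarize(rows):
    """Return (files per bucket, earliest created_at, latest created_at)."""
    buckets = Counter()
    dates = []
    for row in rows:
        buckets[row["bucket_name"]] += 1
        dates.append(datetime.fromisoformat(row["created_at"]))
    return buckets, min(dates), max(dates)


# Hypothetical sample rows; the real export came from upload_fileupload.
SAMPLE = """bucket_name,key,created_at
example-public,v1/xul.dll/ABC123/xul.dll,2020-07-31T12:00:00
example-private,v1/firefox.exe/DEF456/firefox.exe,2020-08-15T09:30:00
example-public,v1/mozglue.dll/789ABC/mozglue.dll,2020-10-02T18:45:00
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
buckets, first, last = summarize(rows)
print(dict(buckets), first.date(), last.date())
```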

Next time we run this SQL, we should ignore try builds. The key for try builds starts with "try".

We have 48,585 files to fix, created between July 31st, 2020 and today. The upload_symbols script fix landed, but hasn't been uplifted yet, so I think this will continue to be an issue for a while. We'll have to do a pass now and then another pass later.

I'm working on a script now. I need to figure out how I can run makecab.
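
One way to run makecab from a script is to shell out to it. This is only a sketch: it assumes a makecab-compatible executable is on PATH and accepts the plain "makecab source destination" form, and the function names are hypothetical. It is not necessarily how the script in this bug did it.

```python
import subprocess


def makecab_command(src, dest, tool="makecab"):
    """Build the argument list for `makecab <source> <destination>`."""
    return [tool, str(src), str(dest)]


def run_makecab(src, dest, tool="makecab"):
    """Compress src into dest; raises CalledProcessError if the tool fails."""
    subprocess.run(makecab_command(src, dest, tool), check=True)
    return dest


# Show the command that would be run, without running it.
print(makecab_command("xul.dll", "xul.dl_"))
```

The tool name is a parameter so that a wrapper or an alternative cab tool can be swapped in on machines where makecab isn't available directly.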

I have a working script and I'm running it now. It fails periodically, partly because the machine I'm running it on is clunky. I'm guessing it'll take a couple of days to complete.

See Also: → 1668317
Blocks: 1667481

The script finished running. I'm pretty sure it downloaded the 48,585 files one at a time, ran makecab on each, and then uploaded the resulting dl_/ex_ file. I'll double-check and fix anything that got missed in this pass tomorrow.
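
The download, compress, upload loop can be sketched roughly like this. It is not the actual script: the download, compress, and upload steps are passed in as hypothetical callables, the usage example runs against in-memory dicts, and zlib stands in for makecab purely so the example is runnable. The dl_/ex_ key is assumed to be the original key with its last character replaced by an underscore.

```python
import zlib


def compressed_key(key):
    """foo.dll -> foo.dl_, foo.exe -> foo.ex_ (assumed naming)."""
    return key[:-1] + "_"


def process_keys(keys, download, compress, upload):
    """Handle keys one at a time; collect failures instead of stopping."""
    done, failed = [], []
    for key in keys:
        try:
            data = download(key)
            upload(compressed_key(key), compress(data))
            done.append(key)
        except Exception as exc:  # keep going; retry the failures later
            failed.append((key, exc))
    return done, failed


# In-memory stand-ins for the symbol store (illustrative only).
store = {
    "v1/xul.dll/ABC123/xul.dll": b"dll bytes",
    "v1/firefox.exe/DEF456/firefox.exe": b"exe bytes",
}
uploaded = {}

done, failed = process_keys(
    list(store) + ["v1/missing.dll/000000/missing.dll"],
    download=lambda key: store[key],
    compress=zlib.compress,  # placeholder for the makecab step
    upload=uploaded.__setitem__,
)
print(len(done), "done,", len(failed), "failed")
```

Collecting failures rather than aborting fits a run that fails periodically: the failed keys can be fed back in for another attempt.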

I'll keep this open and do another round next week to pick up new dll and exe files.

No longer blocks: 1667481

I think that fix was uplifted so it's in all the channels now. I'm going to do the second round of compress-and-upload this week.

New SQL:

SELECT bucket_name, key, created_at
FROM upload_fileupload
WHERE
   (key LIKE '%dll' OR key LIKE '%exe')
   AND NOT key LIKE 'try%'
   AND created_at > '2020-10-01';

This ignores the try symbols and also restricts to symbols uploaded after 2020-10-01. That'll pick up some that we've covered already, but my script will skip those.
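
A rough sketch of what skipping already-fixed files could look like, assuming we have the set of keys that already exist in the bucket and that the compressed key is the original key with its last character replaced by an underscore. How the real script detects this isn't described here, so treat the names and the approach as hypothetical.

```python
def needs_compression(key, existing_keys):
    """True if the dl_/ex_ counterpart of key has not been uploaded yet."""
    return key[:-1] + "_" not in existing_keys


def pending(keys, existing_keys):
    """Filter the candidate list down to files that still need fixing."""
    return [key for key in keys if needs_compression(key, existing_keys)]


# Illustrative keys only.
candidates = [
    "v1/xul.dll/ABC123/xul.dll",
    "v1/firefox.exe/DEF456/firefox.exe",
]
existing = {
    "v1/xul.dll/ABC123/xul.dll",
    "v1/xul.dll/ABC123/xul.dl_",  # fixed in the first pass
    "v1/firefox.exe/DEF456/firefox.exe",
}
print(pending(candidates, existing))
```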

Brian sent me a list of 1,523 files uploaded between 2020-10-01 and today. The last date of an uncompressed file being uploaded was 2020-10-13, so I don't think we're going to have to do another pass after this one.

I ran my script and it fixed around 800 files. Roughly half of the files in the list had been fixed already. I purposely backdated the start date so that we were guaranteed to pick up all of the files, so that was expected.

I think we're good here! Marking as FIXED.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED