Closed
Bug 1189062
Opened 9 years ago
Closed 9 years ago
Data on S3 has doubled in size
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Cloud Services Graveyard
Metrics: Pipeline
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rvitillo, Unassigned)
Details
On the 27th a new edge node was added to aggregate submissions into files to be stored on S3. Adding that node shrank the average size of a file by about 2x, which in turn is causing the v4 aggregator to take far too long to process a single day and ultimately fail. Can we do something about it? Maybe aggregate over longer periods?
Reporter
Updated•9 years ago
Priority: -- → P1
Reporter
Updated•9 years ago
Flags: needinfo?(whd)
Reporter
Updated•9 years ago
Flags: needinfo?(whd) → needinfo?(mreid)
Reporter
Comment 1•9 years ago
I will try to batch file reads on the analysis job side and see if things improve.
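Batching here means reading groups of the small S3 objects per analysis task instead of one object per task, so per-file overhead is amortized. A minimal sketch of the grouping step; the helper name and group size are illustrative, not taken from the actual job:

```python
def batch(paths, size):
    """Yield successive groups of at most `size` paths."""
    for i in range(0, len(paths), size):
        yield paths[i:i + size]

# Hypothetical list of small S3 object names to be read by one job.
files = ['file_%02d' % n for n in range(10)]
groups = list(batch(files, 4))
# Each task then reads one group instead of one file.
```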
Comment 2•9 years ago
On the 27th I updated the DWL configuration to use a c3.4xlarge (objects ending with ip-172-31-16-184), but the old DWL was re-enabled some time after that by cron and processed its entire backfill from kafka (objects ending with ip-172-31-14-40). I've disabled the old DWL, and we should remove all the ip-172-31-14-40 objects from after when the ip-172-31-16-184 objects started to appear. I'm guessing the behavior :rvitillo is seeing is due to the doubling of S3 data, which would also double the large number of smaller files, throwing off the average. The configuration of the old and new DWLs is exactly the same, and removing the redundant data should be sufficient to fix things up.
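The selection rule described here (keep everything from the new DWL, and treat old-DWL objects written after the first new-DWL object appeared as redundant) can be sketched against made-up key names. The `redundant` helper and the sample keys are hypothetical; the `<prefix>/<timestamp>_<hostname>` key layout is assumed from the naming used in this bug:

```python
# Hypothetical keys in <prefix>/<timestamp>_<hostname> form.
keys = [
    'telemetry-2/20150727/100_ip-172-31-14-40',
    'telemetry-2/20150727/200_ip-172-31-16-184',
    'telemetry-2/20150727/250_ip-172-31-14-40',
    'telemetry-2/20150727/300_ip-172-31-14-40',
]

def redundant(keys, old_host='ip-172-31-14-40', new_host='ip-172-31-16-184'):
    # First timestamp at which the new DWL started writing.
    first = min(float(k.rsplit('/', 1)[-1].split('_')[0])
                for k in keys if k.endswith(new_host))
    # Old-DWL objects written after that point duplicate the new DWL's data.
    return [k for k in keys
            if k.endswith(old_host)
            and float(k.rsplit('/', 1)[-1].split('_')[0]) > first]

to_delete = redundant(keys)
```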
Comment 3•9 years ago
Okay. It's going to take me a bit to find out exactly which files need deleting, but I'll update here when I've figured it out.
Comment 4•9 years ago
'a bit' being 1-4 hours of attention, on the morrow
Reporter
Comment 5•9 years ago
Yeah, the size of the data has more than doubled:

s3cmd du -H s3://net-mozaws-prod-us-west-2-pipeline-data/telemetry-2/20150721/
2T  s3://net-mozaws-prod-us-west-2-pipeline-data/telemetry-2/20150721/

s3cmd du -H s3://net-mozaws-prod-us-west-2-pipeline-data/telemetry-2/20150728/
5T  s3://net-mozaws-prod-us-west-2-pipeline-data/telemetry-2/20150728/
Reporter
Updated•9 years ago
Summary: Files on S3 are too small → Data on S3 has doubled in size
Updated•9 years ago
Flags: needinfo?(mreid)
Comment 6•9 years ago
I've determined the list of files to remove by piping the output of this python script to a file:
> from boto import connect_s3
>
> s3 = connect_s3()
> bucket = s3.get_bucket('net-mozaws-prod-us-west-2-pipeline-data')
>
> # Find the first timestamp at which the new DWL (ip-172-31-16-184)
> # started writing on the 27th.
> first = float('inf')
> for key in bucket.list('telemetry-2/20150727'):
>     ts, host = key.key.rsplit('/', 1).pop().split('_')
>     if host == 'ip-172-31-16-184':
>         first = min(first, float(ts))
>
> # Old-DWL objects from the 27th written after that point are redundant.
> for key in bucket.list('telemetry-2/20150727'):
>     ts, host = key.key.rsplit('/', 1).pop().split('_')
>     if host == 'ip-172-31-14-40' and float(ts) > first:
>         print('s3://%s/%s' % (bucket.name, key.key))
>
> # Everything the old DWL wrote on the 28th through the 31st is redundant.
> for day in ('20150728', '20150729', '20150730', '20150731'):
>     for key in bucket.list('telemetry-2/%s' % day):
>         _, host = key.key.rsplit('/', 1).pop().split('_')
>         if host == 'ip-172-31-14-40':
>             print('s3://%s/%s' % (bucket.name, key.key))
Comment 7•9 years ago
94959 files were found
Comment 8•9 years ago
rvitillo and mreid have approved the list of 94959 files for deletion. Starting the delete now.
Comment 9•9 years ago
Delete completed.
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard