Closed
Bug 1158175
Opened 9 years ago
Closed 9 years ago
Add build-id dimension to v4 filenames.
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
Cloud Services Graveyard
Metrics: Pipeline
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rvitillo, Assigned: whd)
References
Details
V2 telemetry files contained the build-id in their filenames, which allowed users to query the indexing service by build-id range. Since this is a common query for the performance team, and more generally for users trying to correlate a regression or performance improvement with a specific build-id, v4 filenames should contain the build-id. A query we might want to issue could look like this:

query(appName="Firefox", appUpdateChannel="nightly", appBuildID=("20150101000000", "20150110999999"))

Note that the submission date is not part of the query.
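To illustrate why embedding the build-id in the filename enables this query: a range filter can then be answered by string comparison on the key itself, with no extra lookups. The sketch below is hypothetical; the dimension order in the key (`submissionDate/sourceName/sourceVersion/docType/appName/appUpdateChannel/appVersion/appBuildID/...`) is an assumption for illustration, not the actual v4 layout.

```python
# Hypothetical sketch: answering a build-id range query by filtering
# v4-style object keys. The position of each dimension in the key is
# an assumption, not the real v4 schema.

def matches(key, app_name, channel, build_id_range):
    parts = key.split("/")
    # assumed positions: appName=4, appUpdateChannel=5, appBuildID=7
    name, chan, build_id = parts[4], parts[5], parts[7]
    lo, hi = build_id_range
    # equal-length YYYYMMDDhhmmss strings compare correctly lexically
    return name == app_name and chan == channel and lo <= build_id <= hi

keys = [
    "20150105/telemetry/4/main/Firefox/nightly/40.0a1/20150104030203/x.v4",
    "20150105/telemetry/4/main/Firefox/aurora/39.0a2/20150104004002/y.v4",
    "20150112/telemetry/4/main/Firefox/nightly/40.0a1/20150112030203/z.v4",
]
hits = [k for k in keys
        if matches(k, "Firefox", "nightly",
                   ("20150101000000", "20150110999999"))]
print(hits)
```

Only the first key falls inside the requested build-id range; note that the submission date (the first key component) plays no part in the filter.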
Updated•9 years ago
Priority: -- → P2
Updated•9 years ago
Assignee: nobody → whd
Comment 1•9 years ago
Roberto, how will this impact the S3 filter service? The plan is to backfill the majority of the data into a new bucket prefix using the new schema, then cut over the data loader to use the new schema, then backfill the small gap in the middle. We will then delete the data from the old prefix and use the new one.
Flags: needinfo?(rvitillo)
Reporter
Comment 2•9 years ago
The following is required to transition to the new bucket:
- Change the v4 bucket in the batch filter service
- Change the v4 bucket in the lambda function
- Backfill submissions in the SimpleDB index

Where is the new schema definition going to be stored? Does telemetry_schema.py support it? When will the transition to the new bucket happen?
Flags: needinfo?(mreid)
Reporter
Updated•9 years ago
Flags: needinfo?(rvitillo)
Comment 3•9 years ago
The new schema definition will be stored in the metadata bucket. It will be very similar to the current schema, with the addition of one more field for appBuildId, so it will definitely be supported by telemetry_schema.py (you will have to set "dirs_only=True" similar to the current v4 schema). Wes, do you have a particular timeline when you expect to be ready to transition?
Flags: needinfo?(mreid) → needinfo?(whd)
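For concreteness, the change to the schema amounts to appending one more dimension. The field names and structure below are a hypothetical illustration of a v4-style dimension list; the real definition lives in the metadata bucket and may differ.

```python
import json

# Hypothetical sketch only: a v4-style schema gaining an appBuildId
# dimension. Field names here are assumptions for illustration.
schema = {
    "version": 1,
    "dimensions": [
        {"field_name": "submissionDate", "allowed_values": "*"},
        {"field_name": "sourceName", "allowed_values": "*"},
        {"field_name": "sourceVersion", "allowed_values": "*"},
        {"field_name": "docType", "allowed_values": "*"},
        {"field_name": "appName", "allowed_values": "*"},
        {"field_name": "appUpdateChannel", "allowed_values": "*"},
        {"field_name": "appVersion", "allowed_values": "*"},
    ],
}
# The change under discussion in this bug: one more dimension.
schema["dimensions"].append({"field_name": "appBuildId", "allowed_values": "*"})
print(json.dumps([d["field_name"] for d in schema["dimensions"]]))
```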
Assignee
Comment 4•9 years ago
Probably Wednesday. As we surmised, the cardinality increase caused the backfill process to hit open file descriptor limits, but it's finally running and should complete in about a day. The production switch-over and the remaining single day of backfill will take another day.
Flags: needinfo?(whd)
Updated•9 years ago
Assignee
Comment 5•9 years ago
Update on this: we ran into another, possibly related, issue where the heka process was being killed by SIGPIPE mid-backfill. I've worked around it by processing the data in chunks of one month, and the cutover should happen this weekend.
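The chunking workaround can be sketched as follows: rather than feeding the entire landfill range into one heka run, iterate over month prefixes so a crash only loses one month of progress. This is an illustrative sketch; how the backfill is actually invoked per prefix is not shown here.

```python
# Sketch of the one-month-chunk workaround: generate YYYYMM prefixes
# and process each independently, so a mid-run SIGPIPE only affects a
# single month instead of the whole backfill.

def month_prefixes(start, end):
    """Yield YYYYMM prefixes from start to end, inclusive."""
    y, m = int(start[:4]), int(start[4:6])
    while (y, m) <= (int(end[:4]), int(end[4:6])):
        yield f"{y:04d}{m:02d}"
        m += 1
        if m > 12:
            y, m = y + 1, 1

chunks = list(month_prefixes("201504", "201506"))
print(chunks)  # ['201504', '201505', '201506']
```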
Comment 6•9 years ago
Uh oh, did the backfilled data get snappy-encoded?
Comment 7•9 years ago
There are still a few lingering snappy-encoded records on the following days: 20150514, 20150515, and 20150516. I'm checking prior history and will update here if there are any other affected days.
Assignee
Comment 8•9 years ago
2015051[456] have all been reprocessed.
Comment 9•9 years ago
I also notice that there is only one file for 20150430, no files for 20150429 and 20150428, and fewer than expected for 20150427 and 20150426.
Assignee
Comment 10•9 years ago
(In reply to Mark Reid [:mreid] from comment #9)
> I also notice that there is only one file for 20150430, no files for 20150429 and 20150428, and fewer than expected for 20150427 and 20150426

At least there was a simple explanation for this: the SIGPIPE of death affected the backfill for the "201504" prefix, and heka died while processing the final days of that month. I re-ran the backfill for the affected days.
Assignee
Comment 12•9 years ago
(In reply to Mark Reid [:mreid] from comment #11)
> Was it just the days I mentioned that you backfilled?

Yeah, it looked from the logs like 20150425 had been entirely processed and 201505 was processed in a different heka run. Only 201504 had the SIGPIPE issue because it's the only full month of files in landfill.
Flags: needinfo?(whd)
Assignee
Comment 13•9 years ago
Backfill is complete and metadata has been updated at s3://net-mozaws-prod-us-west-2-pipeline-metadata/sources.json, so we're done here.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 14•9 years ago
20150430 is still empty :(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 15•9 years ago
Really this time.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard