Open Bug 1649987 Opened 7 months ago Updated 27 days ago

Reduce the retention of TC artifacts depending on the branch - aka files generated during the builds

Categories

(Firefox Build System :: Task Configuration, task)

task

Tracking

(Not tracked)

People

(Reporter: Sylvestre, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files, 1 obsolete file)

We are already doing this for try. We store files for 28 days:
https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#1809

We should do something similar for autoland and m-c, and maybe for m-b, m-r and m-esr.

Context: we are storing a lot of data, and it has a significant cost.

This needs to be more granular than reducing the retention per branch. For one, the per-push shippable builds are useful for mozregression, so those should, in fact, be kept longer, short of uploading them somewhere else (which may itself be desirable). Other artifacts from builds are potentially less interesting. Artifacts from other builds are also less interesting.

BTW, I'm not sure taskcluster allows an artifact expiry shorter than the expiry of the task itself. So it might not actually be possible to do per-artifact expiration.

Or even the other way around (artifact expiry longer than the expiry of the task itself)

Ok, I just checked, it seems they both work.

Correction: artifact expiration needs to be shorter than or equal to the task expiration.
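
To illustrate the constraint, here is a minimal sketch of clamping an artifact's expiry so that it never outlives its task. The helper names (`parse_duration`, `clamp_artifact_expiry`) and the unit table are hypothetical and are not taken from the tree; months and years are approximated as 30 and 365 days.

```python
from datetime import datetime, timedelta

# Hypothetical unit table; "month" and "year" are rough approximations.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def parse_duration(text):
    """Turn a string such as '28 days' or '1 year' into a timedelta."""
    count, unit = text.split()
    return timedelta(days=int(count) * UNIT_DAYS[unit.rstrip("s")])


def clamp_artifact_expiry(created, task_expires_after, artifact_expires_after):
    """Return (task_expiry, artifact_expiry), with the artifact expiry
    capped at the task expiry."""
    task_expiry = created + parse_duration(task_expires_after)
    artifact_expiry = created + parse_duration(artifact_expires_after)
    return task_expiry, min(task_expiry, artifact_expiry)
```

With this sketch, asking for a 1 year artifact on a task that expires after 12 weeks yields an artifact expiry of 12 weeks, while a 28 days artifact on a 1 year task is left unchanged.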

I think a good starting point is to preserve the current policy and keep build artifacts for 1 year, and change the policy for non-build artifacts to 3 months. This gives us significant savings without reducing what binaries are available for mozregression.
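
A rough sketch of the proposed policy, for discussion only. The function name `expires_after` and the `project`/`kind` arguments are assumptions made for illustration; the durations are the ones mentioned in this bug (28 days for try, 1 year for build artifacts, 3 months for everything else).

```python
# Durations quoted in this bug; the constant names are hypothetical.
TRY_EXPIRY = "28 days"
BUILD_EXPIRY = "1 year"
DEFAULT_EXPIRY = "3 months"


def expires_after(project, kind):
    """Pick a retention period from the branch and the kind of task."""
    if project == "try":
        # Existing behaviour: try artifacts are kept for 28 days.
        return TRY_EXPIRY
    if kind == "build":
        # Keep build artifacts for a year so mozregression keeps working.
        return BUILD_EXPIRY
    # Everything else (tests, fetches, ...) gets the shorter default.
    return DEFAULT_EXPIRY
```

A more granular version could distinguish shippable builds from other builds, as suggested earlier in this bug.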

Assignee: nobody → catlee
Status: NEW → ASSIGNED
See Also: → 1651965
Attachment #9161331 - Attachment description: Bug 1649987: [WIP] Set default task expiry to 12 weeks r=bhearsum → Bug 1649987: [WIP] Set default task expiry to 12 weeks r=tomprince
Attachment #9161331 - Attachment description: Bug 1649987: [WIP] Set default task expiry to 12 weeks r=tomprince → Bug 1649987: Set default task expiry to 12 weeks r=jmaher
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/49da43027193
Set default task expiry to 12 weeks r=jmaher

Backed out changeset 49da43027193 (bug 1649987) for gecko decision bustage

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&fromchange=49da43027193fc09fc35c53ef8397c8b99c00974&searchStr=gecko%2Cdecision&tochange=774f97c76ce3b7c0fa416e725cad27fb480c67e0&selectedTaskRun=KpUtcdWoSp-axaTjQYeSoQ.0

Backout link: https://hg.mozilla.org/integration/autoland/rev/774f97c76ce3b7c0fa416e725cad27fb480c67e0

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=312837922&repo=autoland&lineNumber=317

...
[task 2020-08-12T15:14:20.537Z]   File "/builds/worker/checkouts/gecko/taskcluster/taskgraph/util/schema.py", line 34, in validate_schema
[task 2020-08-12T15:14:20.537Z]     raise Exception('\n'.join(msg) + '\n' + pprint.pformat(obj))
[task 2020-08-12T15:14:20.537Z] Exception: In task.run 'fetch-wasm-misc':
[task 2020-08-12T15:14:20.537Z] extra keys not allowed @ data['artifacts'][0]['expires-after']
[task 2020-08-12T15:14:20.537Z] {'allow-ptrace': True,
[task 2020-08-12T15:14:20.537Z]  'artifacts': [{'expires-after': '1000 years',
[task 2020-08-12T15:14:20.537Z]                 'name': 'public',
[task 2020-08-12T15:14:20.537Z]                 'path': '/builds/worker/artifacts',
[task 2020-08-12T15:14:20.538Z]                 'type': 'directory'}],
[task 2020-08-12T15:14:20.538Z]  'chain-of-trust': True,
[task 2020-08-12T15:14:20.538Z]  'command': ['/builds/worker/bin/run-task',
[task 2020-08-12T15:14:20.538Z]              '--fetch-hgfingerprint',
[task 2020-08-12T15:14:20.538Z]              '--',
[task 2020-08-12T15:14:20.538Z]              '/builds/worker/bin/fetch-content',
[task 2020-08-12T15:14:20.538Z]              'static-url',
[task 2020-08-12T15:14:20.538Z]              '--sha256',
[task 2020-08-12T15:14:20.538Z]              '0ba273b748b872117a4b230c776bbd73550398da164025a735c28a16c0224397',
[task 2020-08-12T15:14:20.538Z]              '--size',
[task 2020-08-12T15:14:20.538Z]              '4433793',
[task 2020-08-12T15:14:20.538Z]              'https://github.com/mozilla/perf-automation/releases/download/wasm-misc-v1/wasm-misc-c55c3c7690b2.zip',
[task 2020-08-12T15:14:20.538Z]              '/builds/worker/artifacts/wasm-misc.zip'],
[task 2020-08-12T15:14:20.538Z]  'docker-image': {'in-tree': 'fetch'},
[task 2020-08-12T15:14:20.538Z]  'docker-in-docker': False,
[task 2020-08-12T15:14:20.538Z]  'env': {'MOZ_SCM_LEVEL': '3', 'UPLOAD_DIR': '/builds/worker/artifacts'},
[task 2020-08-12T15:14:20.538Z]  'implementation': 'docker-worker',
[task 2020-08-12T15:14:20.538Z]  'loopback-audio': False,
[task 2020-08-12T15:14:20.538Z]  'loopback-video': False,
[task 2020-08-12T15:14:20.538Z]  'max-run-time': 900,
[task 2020-08-12T15:14:20.538Z]  'os': 'linux',
[task 2020-08-12T15:14:20.538Z]  'privileged': False,
[task 2020-08-12T15:14:20.538Z]  'taskcluster-proxy': False,
[task 2020-08-12T15:14:20.538Z]  'volumes': []}
[taskcluster 2020-08-12 15:14:20.971Z] === Task Finished ===
[taskcluster 2020-08-12 15:14:21.122Z] Artifact "public/docker-contexts" not found at "/builds/worker/checkouts/gecko/docker-contexts"
[taskcluster 2020-08-12 15:14:21.222Z] Unsuccessful task run with exit code: 1 completed in 20.15 seconds
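
For readers unfamiliar with this kind of failure: the payload schema rejects keys it does not declare, so adding `expires-after` to an artifact entry fails validation unless the schema is extended too. The sketch below is a stand-in for illustration only, not the real code in taskgraph/util/schema.py; the allowed key set is an assumption inferred from the logged payload (the keys present other than the rejected one), and `find_extra_keys` is a hypothetical helper.

```python
# Assumed set of keys accepted for an artifact entry, inferred from the log above.
ALLOWED_ARTIFACT_KEYS = {"name", "path", "type"}


def find_extra_keys(payload):
    """Return one error message per undeclared key found in payload['artifacts']."""
    errors = []
    for index, artifact in enumerate(payload.get("artifacts", [])):
        for key in sorted(set(artifact) - ALLOWED_ARTIFACT_KEYS):
            errors.append(
                f"extra keys not allowed @ data['artifacts'][{index}]['{key}']"
            )
    return errors


# Artifact entry copied from the failing 'fetch-wasm-misc' payload.
payload = {
    "artifacts": [
        {
            "expires-after": "1000 years",
            "name": "public",
            "path": "/builds/worker/artifacts",
            "type": "directory",
        }
    ]
}
print(find_extra_keys(payload))
```

Under these assumptions, the check reports the same key path that appears in the failure log.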
Flags: needinfo?(bhearsum)
Pushed by bhearsum@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d244e80dc826
Set default task expiry to 12 weeks r=jmaher
Blocks: 1658938

I think Joel is taking over this work.

Flags: needinfo?(bhearsum)
Attachment #9161331 - Attachment is obsolete: true
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3755692f8d5f
Set default task expiry to 12 weeks r=bhearsum
Backout by nbeleuzu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/157db696462d
Backed out changeset 3755692f8d5f as per glandium req

Backed out in:
https://hg.mozilla.org/integration/autoland/rev/157db696462d8a98905d0f8697088aa97cb6e08f

There are at least two problems it causes:

With this patch, will we still keep test logs for longer than 12 weeks?

Flags: needinfo?(jmaher)

We would keep test logs for 1 month. Is there a use case for keeping logs longer? IIRC ActiveData ingests logs within 2 weeks (typically 2 hours, but allowing for outages, restarts, etc.).

Flags: needinfo?(jmaher)

ActiveData only keeps a limited amount of data about tests (12 weeks); for test selection we need a larger timespan (at least for the errorsummary.log files).

Assignee: catlee → nobody
Status: ASSIGNED → NEW