Store job artifacts in S3 or Azure

RESOLVED WONTFIX

Status

Tree Management
Treeherder: Data Ingestion
P4
normal
3 years ago
2 years ago

People

(Reporter: jgriffin, Unassigned)

Tracking

(Blocks: 3 bugs)

Details

(Reporter)

Description

3 years ago
Our database is currently using ~250 GB of a maximum of 700 GB. To help prevent future scaling issues, we should put job artifacts in S3 or Azure rather than in the database.

Some caveats:
- we should leave all perf data in the db, since we'll want to run queries over large time ranges against it
- we should leave artifacts relevant to OrangeFactor (OF) in the db for the same reason (primarily bug associations, I think)
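
As a rough illustration of the split described above, here is a minimal sketch that routes artifacts by type. The type names in KEEP_IN_DB and both function names are hypothetical, chosen only for this example; they are not taken from the Treeherder code.

```python
# Hypothetical artifact type names; the real names used by Treeherder may differ.
KEEP_IN_DB = {"performance", "bug_association"}


def choose_backend(artifact_type):
    """Return "db" for artifacts queried over long time ranges, else "object_store"."""
    return "db" if artifact_type in KEEP_IN_DB else "object_store"


def partition_artifacts(artifacts):
    """Split (artifact_type, blob) pairs by destination, preserving input order."""
    routed = {"db": [], "object_store": []}
    for artifact_type, blob in artifacts:
        routed[choose_backend(artifact_type)].append((artifact_type, blob))
    return routed
```

Anything routed to "object_store" would be uploaded to S3 or Azure; the upload itself is left out here because the choice of provider is still open.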

Updated

3 years ago
Blocks: 1078392

Updated

3 years ago
Priority: -- → P4

Comment 1

3 years ago
I'm sure things will improve with Azure, but something to consider: it has been hit with another outage (not the first time this year).
http://www.neowin.net/news/microsoft-azure-hit-with-another-outage

Updated

3 years ago
No longer blocks: 1080757
Component: Treeherder → Treeherder: Data Ingestion

Comment 3

3 years ago
I was thinking about this. Once we're uploading things to S3, I guess we'll just be storing the URL to the resource. But if we're doing that, what's to stop that URL being one given to us by Taskcluster? I.e. we could skip processing and uploading the error summary artifact ourselves, and instead store a URL to it supplied by Taskcluster, which we then pass to the UI when it displays the failure summary tab. (Minus the part about OrangeFactor signatures, but that can be done more asynchronously.)

Would sure make a few things a bit simpler :-)
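
To make the idea concrete, here is a minimal sketch of storing only a URL per artifact rather than the blob itself. It uses an in-memory SQLite table purely for illustration; the table name, column names, and function names are all assumptions, not the Treeherder schema.

```python
import sqlite3


def make_store():
    """Create an in-memory table holding one URL per (job, artifact name)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE artifact_ref ("
        " job_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " url TEXT NOT NULL,"
        " PRIMARY KEY (job_id, name))"
    )
    return conn


def record_artifact_url(conn, job_id, name, url):
    """Store (or overwrite) the URL handed to us by the upstream producer."""
    conn.execute(
        "INSERT OR REPLACE INTO artifact_ref (job_id, name, url) VALUES (?, ?, ?)",
        (job_id, name, url),
    )


def artifact_url_for_ui(conn, job_id, name):
    """Return the stored URL for the UI to fetch, or None if we have none."""
    row = conn.execute(
        "SELECT url FROM artifact_ref WHERE job_id = ? AND name = ?",
        (job_id, name),
    ).fetchone()
    return row[0] if row else None
```

With this shape the database row stays small regardless of artifact size, and the UI fetches the content directly from wherever the URL points.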

Comment 4

3 years ago
(In reply to Ed Morley [:emorley] from comment #3)
> Would sure make a few things a bit simpler :-)

And punt the S3 bill upstream. Or, if they are already using S3 to store these artefacts, it saves us storing a second copy of them.

Comment 5

2 years ago
We're moving away from storing JSON blobs in the DB, and are instead storing e.g. each log line separately. Between this, our use of gzip compression for the remaining blobs, and disk space being easier to come by on RDS, I think this is wontfix.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX
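
For reference, the gzip compression of the remaining blobs mentioned in comment 5 could look roughly like the sketch below. It uses only the Python standard library; the function names are hypothetical and this is not the actual Treeherder implementation.

```python
import gzip
import json


def pack_blob(obj):
    """Serialize an object to JSON and gzip it for storage in a binary column."""
    return gzip.compress(json.dumps(obj).encode("utf-8"))


def unpack_blob(data):
    """Reverse pack_blob: decompress and parse the JSON."""
    return json.loads(gzip.decompress(data).decode("utf-8"))
```

Repetitive payloads such as log summaries tend to compress well, which is part of why the database size pressure eases without moving the data elsewhere.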