Closed
Bug 1151806
Opened 10 years ago
Closed 10 years ago
Add chunking to treeherder-client / ETL to keep POSTs under the 30s Heroku request time limit
Categories
(Tree Management :: Treeherder: Infrastructure, defect, P2)
Tracking
(Not tracked)
RESOLVED FIXED
People
(Reporter: emorley, Assigned: camd)
Attachments
(1 file, 1 obsolete file)
Breaking this out of the overall Heroku bug (bug 1145606), since it is camd's Treeherder deliverable for this quarter:
https://docs.google.com/a/mozilla.com/document/d/1U3VXk7K5iTmZvqqtX4sc-Znhch9ZAD98Vl1OyWRqZwo/edit
Heroku has a 30 second cutoff for requests to web nodes (https://devcenter.heroku.com/articles/request-timeout). Our current ETL process submits the data to the publicly accessible API on the web nodes, quite often in big chunks due to builds-4hr etc. We'll likely hit the 30s limit unless we do one of:
1) Chunk the submissions to the API.
2) Make the ETL layer's submissions bypass the web-accessible API (eg make the model/DB updates internally).
3) Be more intelligent about the amount of busywork we repeat (eg switch to builds-2hr or use memcached to keep track of ingested jobs, so we don't continually re-insert the builds-4hr jobs list, when only a small percentage of it is new each time).
Assignee
Comment 1•10 years ago
Use memcache to keep track of which jobs we've already ingested from pending.js, running.js or build4hr.js, so we can check that prior to ingestion rather than always relying on the ON DUPLICATE KEY fallback. This will reduce DB traffic and speed up ingestion. One memcache key per repo; add each job_guid to the list only on successful ingestion.
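For the record, a minimal sketch of what I mean (assuming the python-memcached client; the function and key names are hypothetical, not actual Treeherder code):

import memcache

mc = memcache.Client(["127.0.0.1:11211"])

def filter_unseen_jobs(repo, jobs):
    # Drop jobs whose job_guid we've already ingested for this repo, so we
    # don't keep re-inserting the mostly-unchanged builds-4hr job list.
    key = "ingested-job-guids-%s" % repo  # one memcache key per repo
    seen = mc.get(key) or set()
    return [job for job in jobs if job["job_guid"] not in seen]

def record_ingested(repo, jobs):
    # Add job_guids to the tracked set only after ingestion succeeds.
    key = "ingested-job-guids-%s" % repo
    seen = mc.get(key) or set()
    seen.update(job["job_guid"] for job in jobs)
    mc.set(key, seen)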
Assignee
Comment 2•10 years ago
Chunking can be added as a param in th_client, with the chunk size specified in our settings file; we could even use different chunk sizes for pending, running, build4hr and resultsets.
So the code change will be primarily in th_client, but OAuthLoaderMixin will then need to pass the param from the settings.
Mauro also suggested changing our client timeout to match the Heroku limit, so that when we deploy to the existing staging env we'll know we're good.
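Something like this sketch (the setting names and the submit helper are illustrative only, not the real th_client API):

# settings.py -- per-datatype chunk sizes (names hypothetical)
PENDING_CHUNK_SIZE = 500
RUNNING_CHUNK_SIZE = 500
BUILDS4HR_CHUNK_SIZE = 500

# client side: accept chunk_size as a param and split the POST accordingly
import requests

def submit(endpoint, jobs, chunk_size, timeout=30):
    # timeout=30 mirrors the Heroku request limit, so staging will surface
    # any request that would be cut off in production
    for i in range(0, len(jobs), chunk_size):
        resp = requests.post(endpoint, json=jobs[i:i + chunk_size],
                             timeout=timeout)
        resp.raise_for_status()

OAuthLoaderMixin (or whatever ends up doing the submitting) would read the appropriate *_CHUNK_SIZE setting and pass it through as chunk_size.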
Assignee
Comment 3•10 years ago
Above are just some notes from chatting with mdoglio about this task. Sorry they're a bit choppy. :)
Assignee
Updated•10 years ago
Status: NEW → ASSIGNED
Assignee
Comment 4•10 years ago
Attachment #8606000 - Flags: review?(mdoglio)
Assignee
Updated•10 years ago
Attachment #8606000 - Attachment description: PR → Ingestion Chunking PR
Reporter
Comment 5•10 years ago
When this lands, we'll want to double-check it wasn't the cause of the memory usage spikes seen in bug 1164888 comment 2 (the culprit was either that bug or this one). Perhaps Mauro's idea of using a generator (https://github.com/mozilla/treeherder/pull/533#discussion_r30598047) will help avoid this? :-)
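To illustrate why that should help (toy sketch, not the PR code): building a list of all chunks holds a second full copy of the collection in memory, whereas a generator materialises one chunk at a time, so each chunk can be POSTed and garbage-collected before the next is created.

def chunk_list(collection, chunk_size):
    # eager: a list-of-lists holding every chunk at once
    return [collection[i:i + chunk_size]
            for i in range(0, len(collection), chunk_size)]

def chunk_gen(collection, chunk_size):
    # lazy: yields one chunk at a time, keeping peak memory at one chunk
    for i in range(0, len(collection), chunk_size):
        yield collection[i:i + chunk_size]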
Updated•10 years ago
Attachment #8606000 - Flags: review?(mdoglio) → review+
Comment 6•10 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/e71e78156555baada7f6d60d291188387ce96b12
Bug 1151806 - Implement chunking for job ingestion
Reporter
Comment 7•10 years ago
Something that occurred to me: the "too many requests" errors are presumably us hitting the API rate-limit thresholds we put in place for submitters like taskcluster (though it does seem right that we hit them too; it makes sense for us to be subject to the same limits).
To decide what values we should set chunk size to, I think it would help to know what a typical batch size would be if we weren't using chunking at all (likely for builds-4hr, since that's the worst case file).
ie: if previously we'd been submitting up to 10,000 jobs at once, then perhaps the 150-job chunk size we have after the followup https://github.com/mozilla/treeherder/commit/e9d127f7eee4a3dbffee6b421e293adcf46fcc52 is still a case of "one extreme to the other"? (And so we could, say, set it to 500 or 1000 jobs and avoid the timeouts on Heroku, but also not increase the number of requests ten-fold.)
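Back-of-envelope, under that hypothetical 10,000-job assumption:

import math
batch = 10000  # assumed worst-case builds-4hr submission
for chunk_size in (150, 500, 1000):
    print(chunk_size, int(math.ceil(batch / float(chunk_size))))
# 150 -> 67 requests, 500 -> 20, 1000 -> 10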
Summary: Get ETL layer working with Heroku → Add chunking to treeherder-client / ETL to keep POSTs under the 30s Heroku request time limit
Comment 8•10 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/942e314361ed3fbe90e58780f7bc983cbd9edd4a
Revert "Bug 1151806 - Implement chunking for job ingestion"
This reverts commit e71e78156555baada7f6d60d291188387ce96b12.
This commit caused pending and running jobs to be put into the objectstore.
This causes their completed versions not to be ingested.
Assignee
Comment 9•10 years ago
Attachment #8609401 - Flags: review?(mdoglio)
Assignee
Updated•10 years ago
Attachment #8606000 - Attachment is obsolete: true
Assignee
Comment 10•10 years ago
mdoglio hasn't actually marked this r+, but he gave it a verbal r+ on a vidyo chat today.
Comment 11•10 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/f598198c72470209c698f4099ed748ac65c49857
Bug 1151806 - Implement chunking for job ingestion - fixed
Reporter
Updated•10 years ago
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
Attachment #8609401 - Flags: review?(mdoglio) → review+