Closed Bug 1286160 Opened 9 years ago Closed 6 years ago

autoland should use a queue instead of a database

Categories

(Conduit Graveyard :: Transplant, defect, P2)

Production
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: glob, Unassigned)

Details

(Keywords: conduit-triaged)

Attachments

(1 file, 1 obsolete file)

my concerns with current design:

- doesn't function as a true queue
  - patches are not always landed in the order they were submitted, which is unexpected
  - if there's a transient failure when trying to land a patch at the head of the queue, it'll be deferred and the next patch will be attempted
  - if the trees are closed for an extended period of time, this is likely to result in unexpected ordering of patches
- a single "queue" is used for all repositories
  - we shouldn't be restricted to one worker for all repos
- data stored unnecessarily on server
  - there's a pg database which holds the "queue"
  - once jobs are processed they are not removed from the database
- service outages (eg. review board upgrades) may result in lost notifications

proposed solution:

- use pulse for queuing commits
  - removes data store from autoland server, simplifying deployment and code
- report success and failures back to review board, also via pulse
  - review board should be the data store, not autoland
- report to treeherder (maybe?), to provide a high level view of autoland's activities, and as a means to view the detailed autoland logs for success and failures
- always process the commit at the head of the queue (ie. FIFO)
  - need to detect transient vs fatal failures
  - a transient failure should result in the job being retried
    - with a back-off
    - if retry attempts hit a max value, autoland should stop and alerts triggered for admins to deal with
  - admins need a mechanism to easily examine the job at the head of the queue and the failures encountered during processing
- use a separate queue/topic/key for each repository
  - required when switching to a true queuing system
  - allows us to land to different repos at the same time
  - spin up a process/container/instance for each queue
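To make the proposed head-of-queue behaviour concrete, here is a minimal in-memory sketch. It is not autoland code and does not use Pulse: the names `process_queue`, `land`, `report`, `TransientError`, `FatalError` and `QueueStalled` are all hypothetical, and a `collections.deque` stands in for a real message queue. It only illustrates strict FIFO processing, retry with back-off on transient failures, and stopping (rather than skipping ahead) once the retry limit is hit.

```python
import time
from collections import deque


class TransientError(Exception):
    """Hypothetical: a failure worth retrying (e.g. the tree is closed)."""


class FatalError(Exception):
    """Hypothetical: a failure retrying will not fix (e.g. a merge conflict)."""


class QueueStalled(Exception):
    """Hypothetical: the head job exhausted its retries; admins must step in."""


def process_queue(queue, land, report, max_attempts=5, base_delay=1.0,
                  sleep=time.sleep):
    """Land jobs strictly in FIFO order, never skipping past the head job."""
    while queue:
        job = queue[0]  # peek at the head; it stays queued until resolved
        for attempt in range(max_attempts):
            try:
                land(job)
            except TransientError as exc:
                if attempt == max_attempts - 1:
                    # Stop the whole queue so ordering is preserved.
                    raise QueueStalled(f"{job!r}: {exc}") from exc
                sleep(base_delay * 2 ** attempt)  # exponential back-off
            except FatalError as exc:
                report(job, f"failed: {exc}")
                break
            else:
                report(job, "landed")
                break
        queue.popleft()  # reached only after a success or a fatal failure
```

With one queue per repository, a separate worker could run `process_queue` over each entry of a mapping such as `{"repo-a": deque(...), "repo-b": deque(...)}`, so a stalled repository would not block landings elsewhere. When `QueueStalled` is raised the failing job is still at the head of its queue, which is where an admin would look for it.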
Attached image autoland - current
Attached image autoland - proposed (obsolete)
There will be some upcoming changes to autoland to support Servo that may require persistent state (read: a database). I'm all for using a proper queue, however. But don't get your heart set on killing the database :(
(In reply to Gregory Szorc [:gps] from comment #3)
> There will be some upcoming changes to autoland to support Servo that may
> require persistent state (read: a database).
>
> I'm all for using a proper queue, however. But don't get your heart set on
> killing the database :(

no worries -- if required we should use RDS (or S3?) for persistence, not a database running on the server.
Depends on: 1287537
Glob and I discussed this the other day and I will be working with him to use Pulse rather than the database as a message queue for the Servo autoland changes.
Product: MozReview → Conduit
Still something we very much want to do but needs prioritization versus other important Lando features.
Keywords: conduit-triaged
Whiteboard: [lando-backlog]
Blocks: 1266732
Most likely will use SQS not Pulse. Turning this into the overarching story.
Depends on: 1312140
Keywords: conduit-story
Summary: autoland should use a queue (amqp/pulse) instead of a database → autoland should use a queue instead of a database
Whiteboard: [lando-backlog]
Depends on: 1467694
Attachment #8770025 - Attachment is obsolete: true
Keywords: conduit-story
Priority: -- → P2

we've decided not to move to a queue-based system; instead we'll integrate transplant directly into lando.

see bug 1544346.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Product: Conduit → Conduit Graveyard