The v4 aggregation job uses multiple separate connections to speed up upserts. To avoid having to roll back to a backup when a run fails, though, we should parallelize on channels, i.e. use a single transaction for each channel, rather than on the individual upserts.
As the db is partitioned into subtables by (channel, build-id), I decided to use one transaction per subtable. There is a rather long tail when updating recent build-ids but, as upsert performance degraded by only about 2x, I consider it acceptable given the gains. Combining multiple channels should help alleviate the long-tail effect. I am also keeping track of the dates for which each subtable was updated, which effectively makes the aggregation updates idempotent.
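
To illustrate the idea, here is a minimal sketch of a per-subtable transaction with date tracking. It is not the job's actual code: it uses Python's built-in sqlite3 as a stand-in for the real database, and the table names (aggregates, processed), the column names, and the helpers make_db and update_subtable are hypothetical. The point is that the upserts and the "this date was applied" marker commit or roll back together, so re-running a date is a no-op and a failed run leaves nothing behind.

```python
import sqlite3


def make_db():
    # In-memory stand-in for the real database (assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aggregates (channel TEXT, build_id TEXT, metric TEXT, "
        "total INTEGER, PRIMARY KEY (channel, build_id, metric))"
    )
    # One row per (subtable, date) that has already been applied.
    conn.execute(
        "CREATE TABLE processed (channel TEXT, build_id TEXT, "
        "submission_date TEXT, "
        "PRIMARY KEY (channel, build_id, submission_date))"
    )
    conn.commit()
    return conn


def update_subtable(conn, channel, build_id, submission_date, rows):
    """Apply all (metric, count) rows for one subtable in one transaction.

    Returns True if the rows were applied, False if this date had already
    been applied to the subtable and the call was skipped.
    """
    key = (channel, build_id, submission_date)
    with conn:  # commits on success, rolls back if an exception escapes
        seen = conn.execute(
            "SELECT 1 FROM processed "
            "WHERE channel = ? AND build_id = ? AND submission_date = ?",
            key,
        ).fetchone()
        if seen:
            return False
        for metric, count in rows:
            # Upsert: add to the existing total, or insert a new row.
            cur = conn.execute(
                "UPDATE aggregates SET total = total + ? "
                "WHERE channel = ? AND build_id = ? AND metric = ?",
                (count, channel, build_id, metric),
            )
            if cur.rowcount == 0:
                conn.execute(
                    "INSERT INTO aggregates VALUES (?, ?, ?, ?)",
                    (channel, build_id, metric, count),
                )
        # The marker is written in the same transaction as the upserts.
        conn.execute("INSERT INTO processed VALUES (?, ?, ?)", key)
    return True
```

Under this scheme, parallelism would come from running update_subtable for different subtables on separate connections, while each individual subtable stays all-or-nothing.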