Closed Bug 1171046 Opened 9 years ago Closed 9 years ago

Use a single transaction per channel when aggregating.

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: rvitillo, Assigned: rvitillo)

Details

Roberto Agostino Vitillo (:rvitillo)

Assignee

Description

9 years ago

The v4 aggregation job uses multiple separate connections to speed up upserts. To avoid having to roll back to a backup when a run fails, though, we should parallelize over channels rather than over individual upserts, i.e. use a single transaction for each channel.
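A minimal sketch of the idea, not the job's actual code: all upserts for a channel go through one transaction, so a failure rolls back only that channel and leaves the others committed. The `aggregates` table, its columns, and the helper names are hypothetical, and the standard-library `sqlite3` module stands in for the production database. The channels are processed sequentially here; in the real job each channel could presumably be handed to its own connection and worker.

```python
import sqlite3

def make_db():
    # Hypothetical schema; the real tables are not described in this bug.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aggregates ("
        " channel TEXT, build_id TEXT, metric TEXT, count INTEGER NOT NULL,"
        " PRIMARY KEY (channel, build_id, metric))"
    )
    return conn

def apply_channel(conn, channel, upserts):
    """Apply every upsert for one channel inside a single transaction."""
    # `with conn` commits on success and rolls back if any statement raises.
    with conn:
        for build_id, metric, count in upserts:
            cur = conn.execute(
                "UPDATE aggregates SET count = count + ?"
                " WHERE channel = ? AND build_id = ? AND metric = ?",
                (count, channel, build_id, metric),
            )
            if cur.rowcount == 0:
                # No existing row for this key, so insert a new one.
                conn.execute(
                    "INSERT INTO aggregates (channel, build_id, metric, count)"
                    " VALUES (?, ?, ?, ?)",
                    (channel, build_id, metric, count),
                )

def aggregate(conn, upserts_by_channel):
    """Run one transaction per channel; return the channels that failed."""
    failed = []
    for channel, upserts in upserts_by_channel.items():
        try:
            apply_channel(conn, channel, upserts)
        except sqlite3.Error:
            failed.append(channel)
    return failed
```

With this shape, a failed channel can simply be retried on its own, since none of its partial upserts were committed.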

Mark Reid [:mreid]

Updated

9 years ago

Priority: -- → P2

Roberto Agostino Vitillo (:rvitillo)

Assignee

Comment 1

9 years ago

As the db is partitioned into subtables by (channel, build-id), I decided to use a transaction per subtable. There is a rather long tail when updating recent build-ids but, as upsert performance degraded by only about 2x, I still consider that acceptable given the gains. By combining multiple channels we should be able to alleviate the long-tail effect.

I am also keeping track of the dates on which each subtable was updated, which effectively makes the aggregation updates idempotent.
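One way the date tracking could work is sketched below, under assumptions: the `processed_dates` table, the column names, and `update_subtable` are hypothetical, and `sqlite3` again stands in for the production database. The date is recorded in the same transaction as the upserts for that (channel, build-id) subtable, so a rerun for an already recorded date is a no-op and a failed run records nothing.

```python
import sqlite3

def make_db():
    # Hypothetical schema: one table of aggregates plus a log of applied dates.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE aggregates (
            channel TEXT, build_id TEXT, metric TEXT, count INTEGER NOT NULL,
            PRIMARY KEY (channel, build_id, metric));
        CREATE TABLE processed_dates (
            channel TEXT, build_id TEXT, submission_date TEXT,
            PRIMARY KEY (channel, build_id, submission_date));
    """)
    return conn

def update_subtable(conn, channel, build_id, submission_date, counts):
    """Apply one day's counts to a subtable; return False if already applied."""
    # One transaction per (channel, build_id) subtable: commit or roll back as a unit.
    with conn:
        seen = conn.execute(
            "SELECT 1 FROM processed_dates"
            " WHERE channel = ? AND build_id = ? AND submission_date = ?",
            (channel, build_id, submission_date),
        ).fetchone()
        if seen is not None:
            return False  # this date was already aggregated; skip it
        conn.execute(
            "INSERT INTO processed_dates VALUES (?, ?, ?)",
            (channel, build_id, submission_date),
        )
        for metric, count in counts.items():
            cur = conn.execute(
                "UPDATE aggregates SET count = count + ?"
                " WHERE channel = ? AND build_id = ? AND metric = ?",
                (count, channel, build_id, metric),
            )
            if cur.rowcount == 0:
                conn.execute(
                    "INSERT INTO aggregates VALUES (?, ?, ?, ?)",
                    (channel, build_id, metric, count),
                )
    return True
```

Calling `update_subtable` twice with the same date leaves the counts unchanged the second time, which is what makes it safe to rerun a whole aggregation after a partial failure.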

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

BMO Automation

Updated

6 years ago

Product: Cloud Services → Cloud Services Graveyard


Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
