Closed Bug 1090284 Opened 10 years ago Closed 10 years ago

Investigate recent regression in pushlog ingestion performance

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: mdoglio)

References

Details

(Keywords: perf)

Attachments

(1 file)

Mauro mentioned in last week's meeting that there was a regression recently; we should figure out what caused it. I can't remember now whether it was on pushlog or job ingestion (or both?).
Keywords: perf
It's on pushlog ingestion; here is my theory. This is a chart of the SQL volume over the last four weeks: https://rpm.newrelic.com/public/charts/hsSFpmsqWtc I believe the huge increase there corresponds with the day bug 1083305 landed. Some pushes contain thousands of revisions, and the database takes a long time to ingest them. Three different insertions are required to store a push correctly:

1 insertion on the result_set table
N insertions on the revision table
N insertions on the revision_map table

where N is the number of revisions in that push. Before the changes requested in bug 1083305 landed, the ingestion of these kinds of pushes often failed on the second or third step. The push was then ingested again on the next round of data ingestion, but the corresponding result_set was already there, so the push was skipped. This is no longer the case: half-stored pushes keep being submitted, and the queries on revision and revision_map are executed again and again.
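The skip-on-existing-result_set behaviour described above can be sketched roughly as follows. This is an illustrative toy using sqlite3, not Treeherder's actual schema or code; the table and column names (result_set, revision, revision_map, push_id, sha) are assumptions based on the comment.

```python
import sqlite3

# Hypothetical sketch of the three-step push ingestion described above.
# Table/column names are illustrative, not Treeherder's real schema.
def store_push(conn, push_id, revisions):
    cur = conn.cursor()
    # Pre-bug-1083305 behaviour: if the result_set row already exists
    # (even from a half-stored push), skip the whole push, avoiding
    # the N revision/revision_map inserts on every ingestion round.
    cur.execute("SELECT 1 FROM result_set WHERE push_id = ?", (push_id,))
    if cur.fetchone():
        return False  # already (possibly partially) ingested; skip
    # Step 1: one insertion on result_set.
    cur.execute("INSERT INTO result_set (push_id) VALUES (?)", (push_id,))
    # Steps 2 and 3: N insertions each on revision and revision_map.
    for rev in revisions:
        cur.execute("INSERT INTO revision (sha) VALUES (?)", (rev,))
        cur.execute(
            "INSERT INTO revision_map (push_id, sha) VALUES (?, ?)",
            (push_id, rev),
        )
    conn.commit()
    return True
```

With the post-1083305 behaviour the early-return guard is effectively gone, so a push with thousands of revisions re-runs all 2N inserts on every round.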
I noticed yesterday that the production memcached instance doesn't contain a last_push key for all the repositories, and where the key is present, the value is empty. The result is that we keep ingesting data over and over, regardless of whether we already stored it. Still digging into what's causing this.
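The role of the last_push key can be sketched like this. A plain dict stands in for the production memcached instance, and the key format and function name are assumptions for illustration only, not Treeherder's actual code.

```python
# Illustrative sketch of the last_push guard: before ingesting, compare
# against the last push id cached per repository. The cache key format
# and pushes_to_ingest() are hypothetical names, not Treeherder's code.
cache = {}  # stand-in for the production memcached instance

def pushes_to_ingest(repo, available_push_ids):
    last = cache.get(f"{repo}:last_push")
    # A missing or empty key means "ingest everything" -- the symptom
    # observed on production when the key was absent or empty.
    new = [p for p in available_push_ids if last is None or p > last]
    if new:
        cache[f"{repo}:last_push"] = max(new)
    return new
```

When the key is stored correctly, a second ingestion round sees no new pushes; with the key missing on every round, the full list is re-ingested each time.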
Attachment 8514303 [details] [review] verifies that we are correctly storing the last_push key when we ingest a collection of result-sets.
Blocks: 1076750
Summary: Investigate recent regression in ingestion performance → Investigate recent regression in pushlog ingestion performance
Blocks: 1096919
No longer blocks: 1080757
Component: Treeherder → Treeherder: Data Ingestion
The root cause was a missing netflow (network flow rule) between the ETL nodes and the memcached instances. This is fixed now.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee: nobody → mdoglio
