Closed Bug 1490759 Opened 6 years ago Closed 5 years ago

Treeherder is slow while loading bug suggestions and logs

Categories

(Tree Management :: Treeherder: API, defect, P1)

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: dluca, Unassigned)

References

(Blocks 1 open bug)

Details

      No description provided.
Yeah we've been having a few response time alerts.

Load caused by the jobs API has jumped up, starting about 90 minutes ago:
https://screenshots.firefox.com/L9FKoImaIY0aH3EC/rpm.newrelic.com
https://screenshots.firefox.com/rVIPWEb6IvMmh8xd/rpm.newrelic.com

...and a corresponding load on the DB from selects on the jobs table:
https://screenshots.firefox.com/8JEezhMMQsDzTX2o/rpm.newrelic.com
https://screenshots.firefox.com/5PKbOMYCqZvpvrpY/rpm.newrelic.com

...and the prod RDS instance has much higher CPU load than normal:
https://screenshots.firefox.com/eXkEIfRULYnqw4j8/console.aws.amazon.com

...but none of the other stats have jumped up (e.g. throughput shown on New Relic, read/write I/O on the DB), which is strange. Perhaps an AWS noisy neighbour?

The DB CPU load appears to have improved slightly over the last 5-10 mins.
Component: Treeherder → Treeherder: Infrastructure
Priority: -- → P1
Summary: Treehderder is slow while loading bug suggestions and logs → Treeherder is slow while loading bug suggestions and logs
Things have returned to normal since:
https://screenshots.firefox.com/hXz3Lf2ZpGNlqmQi/rpm.newrelic.com

It's still not entirely clear what the root cause was.
Assignee: nobody → emorley
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Ryan and Andreea experienced the issue again. Does New Relic show the issue for users in general?
Status: RESOLVED → REOPENED
Flags: needinfo?(emorley)
Resolution: FIXED → ---
New Relic doesn't show any further spikes:
https://screenshots.firefox.com/dAaOlEF39hVkfxnj/rpm.newrelic.com

Testing prod Treeherder now, it responds fast for me.

It would be useful to know whether the issues they see relate to UI load times/responsiveness or to the duration of API XHR calls.
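
One way to separate the two cases is to time the API request outside the browser. The following is a minimal sketch, not something Treeherder ships: the helper names (`fetch_url`, `time_calls`, `summarise`) are made up for illustration, and the URL to measure is assumed to be one copied from the browser's network panel.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Download the full response body for `url` (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def time_calls(fetch: Callable[[], object], repeats: int = 5) -> List[float]:
    """Call `fetch` `repeats` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations


def summarise(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Example (the URL is whatever XHR request appears slow in the network panel):
#   durations = time_calls(lambda: fetch_url(slow_xhr_url), repeats=5)
#   print(summarise(durations))
```

If the durations measured this way are consistently low while the UI still feels slow, that would point at front-end rendering rather than the API; consistently high durations would point at the API or the DB behind it.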
Flags: needinfo?(emorley)
Hi.

The issues I'm having concern the duration of API XHR calls. The most frequent ones occur when selecting a failure and retrieving its data (summary/suggestions), and when saving a classification and switching to the next failure.
Blocks: 1504990
Assignee: emorley → nobody
Status: REOPENED → NEW
Component: Treeherder: Infrastructure → Treeherder: API

How bad is the impact of this?

IMHO this is a WORKSFORME; bug 1553199 is the issue observed these days.

Status: NEW → RESOLVED
Closed: 6 years ago → 5 years ago
Resolution: --- → WORKSFORME