Open Bug 1296077 Opened 9 years ago Updated 6 years ago

Jobs stuck in the "running" state

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P2)

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: ekyle, Assigned: ekyle)

This comment deals with pending jobs, so it may not apply here, but I wonder whether Buildbot fails to properly inform Treeherder when things go differently than usual: https://bugzilla.mozilla.org/show_bug.cgi?id=1296329#c16
Hi! The links in comment 0 no longer work - is this still happening?
Flags: needinfo?(klahnakoski)
Yes, this is still happening:

    SELECT *
    FROM job
    WHERE state='running'
      AND last_modified>date_add(now(), INTERVAL -30 day)
      AND last_modified<date_add(now(), INTERVAL -10 day)
    ORDER BY last_modified DESC
    LIMIT 100
Flags: needinfo?(klahnakoski)
Priority: -- → P1
    SELECT j.id, j.state, j.result, j.last_modified, j.start_time, j.submit_time, rds.build_system_type
    FROM job j
    LEFT JOIN reference_data_signatures rds ON j.signature_id = rds.id
    WHERE j.state='running'
      AND j.last_modified>date_add(now(), INTERVAL -30 day)
      AND j.last_modified<date_add(now(), INTERVAL -10 day)
    ORDER BY j.last_modified DESC
    LIMIT 100

Looks like these are all from Taskcluster when I ran it just now.

Running the query from comment 4 now, I still see recent instances, so the removal of Buildbot Bridge et al. hasn't helped.

I'm pretty sure bug 1470243 and bug 1532255 will help prevent at least a subset of these cases.

Once the PRs in those bugs have been merged and deployed and a few weeks have passed, it would be good to re-run the query in comment 4 to see whether there have been any further occurrences.
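
For the follow-up check, the comment 4 query could be wrapped in a small script so that every run uses the same 10-to-30-day window. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 with an in-memory database as a stand-in for the real Treeherder database, it keeps only the columns the query touches, and the names find_stuck_jobs, make_demo_db and the sample rows are made up for the example. Timestamps are stored as text here, which is not necessarily how the real schema stores them.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumed text timestamp format for this demo; sorts correctly as a string.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_demo_db():
    """Build an in-memory stand-in with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE reference_data_signatures (
            id INTEGER PRIMARY KEY, build_system_type TEXT);
        CREATE TABLE job (
            id INTEGER PRIMARY KEY, state TEXT, result TEXT,
            last_modified TEXT, signature_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO reference_data_signatures VALUES (?, ?)",
        [(1, "taskcluster"), (2, "buildbot")],
    )
    conn.executemany(
        "INSERT INTO job VALUES (?, ?, ?, ?, ?)",
        [
            (1, "running", "unknown", "2019-03-05 12:00:00", 1),    # inside window
            (2, "running", "unknown", "2019-03-18 08:00:00", 1),    # too recent
            (3, "completed", "success", "2019-03-01 09:30:00", 1),  # not running
            (4, "running", "unknown", "2019-01-01 00:00:00", 2),    # older than 30 days
            (5, "running", "unknown", "2019-02-25 00:00:00", 1),    # inside window
        ],
    )
    return conn


def find_stuck_jobs(conn, now, min_age_days=10, max_age_days=30, limit=100):
    """Return running jobs last modified between max_age_days and min_age_days ago."""
    oldest = (now - timedelta(days=max_age_days)).strftime(TS_FORMAT)
    newest = (now - timedelta(days=min_age_days)).strftime(TS_FORMAT)
    return conn.execute(
        """
        SELECT j.id, j.state, j.result, j.last_modified, rds.build_system_type
        FROM job j
        LEFT JOIN reference_data_signatures rds ON j.signature_id = rds.id
        WHERE j.state = 'running'
          AND j.last_modified > ?
          AND j.last_modified < ?
        ORDER BY j.last_modified DESC
        LIMIT ?
        """,
        (oldest, newest, limit),
    ).fetchall()


if __name__ == "__main__":
    for row in find_stuck_jobs(make_demo_db(), now=datetime(2019, 3, 20)):
        print(row)
```

Passing the reference time in as a parameter, rather than calling now() inside the SQL, makes it possible to compare runs taken before and after the fixes are deployed.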

Depends on: 1470243, 1532255

Kyle, I'm assigning this to you for a follow-up query between now and the end of March. If it persists, we'll put it on the work queue for Q2.

Assignee: nobody → klahnakoski
Priority: P1 → P2

I have confirmed that this persists.

Status: NEW → ASSIGNED