Open
Bug 1296077
Opened 9 years ago
Updated 6 years ago
Jobs stuck in the "running" state
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P2)
Tracking
(Not tracked)
ASSIGNED
People
(Reporter: ekyle, Assigned: ekyle)
A few old jobs are still marked "running":
http://activedata.allizom.org/tools/query.html#query_id=Q8PSy813
Here is a specific example:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=eb74a01c8dc0bc508ea8b492e6a4c179e1c2ff19&selectedJob=32358519&exclusion_profile=false
A subset of the same problem is where we can identify a specific task:
http://activedata.allizom.org/tools/query.html#query_id=hGBmQc2H
Again, an example:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=8dc198cd46fff3b1f6e39ea6e80bb4507bf2cdbe&selectedJob=32671050
Comment 1 (assignee) • 9 years ago
This comment deals with pending jobs, so it may not apply here. But I wonder whether Buildbot is failing to inform Treeherder properly when things go differently than usual.
https://bugzilla.mozilla.org/show_bug.cgi?id=1296329#c16
Comment 2•8 years ago
Hi!
The links in comment 0 no longer work - is this still happening?
Flags: needinfo?(klahnakoski)
Comment 3 (assignee) • 8 years ago
Yes, this is still happening:
SELECT
    *
FROM
    job
WHERE
    state = 'running' AND
    last_modified > date_add(now(), INTERVAL -30 DAY) AND
    last_modified < date_add(now(), INTERVAL -10 DAY)
ORDER BY
    last_modified DESC
LIMIT
    100
Flags: needinfo?(klahnakoski)
Updated•8 years ago
Priority: -- → P1
Comment 4•7 years ago
SELECT
    j.id, j.state, j.result, j.last_modified, j.start_time, j.submit_time, rds.build_system_type
FROM
    job j
    LEFT JOIN reference_data_signatures rds
        ON j.signature_id = rds.id
WHERE
    j.state = 'running' AND
    j.last_modified > date_add(now(), INTERVAL -30 DAY) AND
    j.last_modified < date_add(now(), INTERVAL -10 DAY)
ORDER BY
    j.last_modified DESC
LIMIT
    100
When I ran it just now, all of these appear to be from Taskcluster.
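For anyone who wants to reproduce that tally without eyeballing the result set, here is a minimal Python sketch of the same filter applied to rows already fetched from the job table. The function names, the sample rows, and the "taskcluster"/"buildbot" values for build_system_type are illustrative assumptions, not taken from Treeherder's code or data.

```python
from collections import Counter
from datetime import datetime, timedelta


def stuck_running_jobs(rows, now, min_age_days=10, max_age_days=30):
    """Keep rows still 'running' whose last_modified is 10 to 30 days old.

    Mirrors the WHERE and ORDER BY clauses of the query above, in memory.
    """
    newest = now - timedelta(days=min_age_days)
    oldest = now - timedelta(days=max_age_days)
    stuck = [
        row for row in rows
        if row["state"] == "running" and oldest < row["last_modified"] < newest
    ]
    # Same ordering as ORDER BY j.last_modified DESC.
    return sorted(stuck, key=lambda row: row["last_modified"], reverse=True)


def count_by_build_system(rows):
    """Tally rows per build_system_type (hypothetical helper)."""
    return Counter(row["build_system_type"] for row in rows)


# Made-up sample data; the build_system_type values are assumed.
now = datetime(2018, 6, 1)
sample_rows = [
    {"id": 1, "state": "running", "last_modified": now - timedelta(days=12),
     "build_system_type": "taskcluster"},
    {"id": 2, "state": "running", "last_modified": now - timedelta(days=2),
     "build_system_type": "taskcluster"},
    {"id": 3, "state": "completed", "last_modified": now - timedelta(days=15),
     "build_system_type": "buildbot"},
    {"id": 4, "state": "running", "last_modified": now - timedelta(days=25),
     "build_system_type": "taskcluster"},
]

stuck = stuck_running_jobs(sample_rows, now)
print([row["id"] for row in stuck])
print(count_by_build_system(stuck))
```

Job 2 is excluded because it was modified too recently to count as stuck, and job 3 because it is no longer running.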
Comment 5•7 years ago
Running the query from comment 4 now, I still see recent instances, so the removal of Buildbot bridge et al. hasn't helped.
Comment 6•6 years ago
I'm pretty sure bug 1470243 and bug 1532255 will help prevent at least a subset of these cases.
Once the PRs in those bugs have been merged/deployed and a few weeks have passed, it would be good to re-run the query in comment 4 to see if there have been any further occurrences.
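One possible way to read the re-run results: only jobs that started after the fixes went live say anything about whether the fixes worked. A rough Python sketch, assuming the rows come from the comment 4 query with start_time as a datetime; the deploy date, the sample rows, and the function name are placeholders, not real values.

```python
from datetime import datetime


def occurrences_since(stuck_rows, deployed_at):
    """Return stuck jobs whose start_time is after the (assumed) deploy time."""
    return [row for row in stuck_rows if row["start_time"] > deployed_at]


# Placeholder deploy date and made-up rows, for illustration only.
deployed_at = datetime(2019, 3, 1)
stuck_rows = [
    {"id": 10, "state": "running", "start_time": datetime(2019, 2, 20)},
    {"id": 11, "state": "running", "start_time": datetime(2019, 3, 10)},
]

new_cases = occurrences_since(stuck_rows, deployed_at)
print([row["id"] for row in new_cases])
```

An empty result would suggest the fixes covered the cases seen so far; any remaining rows are candidates for further investigation.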
Comment 7•6 years ago
Kyle, assigning this to you for a follow-up query between now and the end of March. If it persists, we'll put it on the work queue for Q2.
Assignee: nobody → klahnakoski
Priority: P1 → P2
Comment 8 (assignee) • 6 years ago
I have confirmed that this persists.
Updated•6 years ago
Status: NEW → ASSIGNED