Cancelling all Buildbot jobs in Buildapi only cancels some jobs

RESOLVED WONTFIX

Status

Release Engineering
General
RESOLVED WONTFIX
3 years ago
3 months ago

People

(Reporter: armenzg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Is there a way in which we can see the return call from Buildapi?

I cancelled two Linux Buildbot jobs within 2 minutes of their starting, and they still ran to completion.

This has happened to me many many times over the last two weeks.
Component: Treeherder → Tools
Product: Tree Management → Release Engineering
QA Contact: hwine
I tried cancelling Linux build jobs through Treeherder (TH), but they did not cancel and therefore triggered all of the test jobs.

I then tried cancelling all running test jobs with the "cancel all builds on this revision" button; however, only a subset of them were cancelled [1].

You will see a lot of test jobs which were cancelled around 2015-12-02 13:37; however, you will also see a lot of jobs running to completion (they're currently running).

For example, you will see "Ubuntu VM 12.04 x64 try debug test web-platform-tests-6" completing after the time I indicated above.

[1] https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/3d23be00dbee
Summary: Cancelling Buildbot jobs from Treeherder does not currently work → Cancelling all Buildbot jobs in Buildapi only cancels some jobs
I tried the "cancel all" several times.
Yeah, those are all still running.

	
cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'}  2015-12-02 13:44:44
cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'}  2015-12-02 13:43:39
cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'}  2015-12-02 13:38:49
cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'}  2015-12-02 13:36:44
cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'}  2015-12-02 13:33:51

cancel_request {'brid': 90092721}  2015-12-02 13:01:56  2015-12-02 13:01:59
  {'body': {'msg': 'Error cancelling build (request 90092721)', 'errors': True}, 'request_id': 1668442}

cancel_request {'brid': 90092720}  2015-12-02 13:01:52  2015-12-02 13:01:52
  {'body': {'msg': 'Error cancelling build (request 90092720)', 'errors': True}, 'request_id': 1668441}
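The two cancel_request entries above do carry a result payload, and both report 'errors': True. As a rough illustration of how such payloads could be checked in bulk, here is a minimal sketch. It assumes the result column is a Python dict repr exactly as pasted above; the names RAW_RESULTS and failed_cancellations are hypothetical and not part of Buildapi.

```python
import ast

# Result payloads copied verbatim from the two cancel_request entries above.
RAW_RESULTS = [
    "{'body': {'msg': 'Error cancelling build (request 90092721)', 'errors': True}, 'request_id': 1668442}",
    "{'body': {'msg': 'Error cancelling build (request 90092720)', 'errors': True}, 'request_id': 1668441}",
]


def failed_cancellations(raw_results):
    """Return (request_id, msg) for every result whose body reports errors.

    Hypothetical helper: assumes each payload is a Python dict repr
    (single quotes, True/False), so it is parsed with ast.literal_eval
    rather than a JSON parser.
    """
    failures = []
    for raw in raw_results:
        result = ast.literal_eval(raw)
        body = result.get("body", {})
        if body.get("errors"):
            failures.append((result.get("request_id"), body.get("msg")))
    return failures


if __name__ == "__main__":
    for request_id, msg in failed_cancellations(RAW_RESULTS):
        print(request_id, msg)
```

Note that the cancel_revision entries above show no completion time or result at all, so a check like this would only cover the individual cancel_request calls.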
Windows builds are particularly uninterested in being cancelled. That leads to unwanted tests on an AWS pool where jobs tend to run for 17 hours, on a vastly more expensive AWS pool, and on two horribly limited hardware pools.
Blocks: 1303152
e.g. https://treeherder.mozilla.org/#/jobs?repo=try&revision=85caac99b7cce725a87e67141ad1e6cf4fd759a0

Windows builds were both cancelled prior to calling sendchange, but they didn't actually stop, and so we ran full sets of tests on each.
This is resulting in approximately 10,000 hours per month of wasted work on Windows VMs.
Depends on: 1304075
(Assignee)

Updated

a year ago
Component: Tools → General
Product: Release Engineering → Release Engineering

Updated

3 months ago
Status: NEW → RESOLVED
Last Resolved: 3 months ago
Resolution: --- → WONTFIX