Closed Bug 1229921 Opened 9 years ago Closed 6 years ago

Cancelling all Buildbot jobs in Buildapi only cancels some jobs

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: armenzg, Unassigned)

References

Details

Is there a way in which we can see the return call from Buildapi?

I cancelled two Linux Buildbot jobs within 2 minutes of their starting, and they still ran to completion.

This has happened to me many, many times over the last two weeks.
Component: Treeherder → Tools
Product: Tree Management → Release Engineering
QA Contact: hwine
I tried cancelling Linux build jobs through TH and they did not cancel, so all of the test jobs were triggered.

I then tried cancelling all running test jobs with the "cancel all builds on this revision" button; however, only a subset of them were cancelled [1].

You will see a lot of test jobs which were cancelled around 2015-12-02 13:37; however, you will also see a lot of jobs running to completion (they're currently running).

For example, you will see "Ubuntu VM 12.04 x64 try debug test web-platform-tests-6" completing after the time I indicated above.

[1] https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/3d23be00dbee
Summary: Cancelling Buildbot jobs from Treeherder does not currently work → Cancelling all Buildbot jobs in Buildapi only cancels some jobs
I tried "cancel all" several times.
Yeah, those are all still running.

	
cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'} 2015-12-02 13:44:44 		

cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'} 2015-12-02 13:43:39 		

cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'} 2015-12-02 13:38:49 		

cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'} 2015-12-02 13:36:44 		

cancel_revision {'branch': 'try', 'revision': '3d23be00dbee'} 2015-12-02 13:33:51 		

cancel_request {'brid': 90092721} 2015-12-02 13:01:56 	2015-12-02 13:01:59 	
  {'body': {'msg': 'Error cancelling build (request 90092721)', 'errors': True}, 'request_id': 1668442}

cancel_request {'brid': 90092720} 2015-12-02 13:01:52 	2015-12-02 13:01:52 	
  {'body': {'msg': 'Error cancelling build (request 90092720)', 'errors': True}, 'request_id': 1668441}
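
On the question of checking what Buildapi returns: going only by the result payloads in the log above, a failed cancel can be recognised by `errors` being true inside `body`. A minimal Python sketch, assuming the result is available as the Python-literal text shown there; `parse_result` and `cancel_failed` are hypothetical helper names, not part of Buildapi:

```python
import ast


def parse_result(line: str) -> dict:
    """Parse one result line, written as a Python literal as in the log above."""
    return ast.literal_eval(line.strip())


def cancel_failed(result: dict) -> bool:
    """Return True when the result body flags an error (assumed payload shape)."""
    body = result.get("body", {})
    return bool(body.get("errors", False))


# The two result payloads copied verbatim from the log above.
LOG_RESULTS = [
    "{'body': {'msg': 'Error cancelling build (request 90092721)', 'errors': True}, 'request_id': 1668442}",
    "{'body': {'msg': 'Error cancelling build (request 90092720)', 'errors': True}, 'request_id': 1668441}",
]

# Collect the request_id of every cancel that reported an error.
failed = [r["request_id"] for r in map(parse_result, LOG_RESULTS) if cancel_failed(r)]
print(failed)  # [1668442, 1668441]
```

Note that the cancel_revision entries above show no result payload at all, so a check like this only covers the cancel_request entries.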
Windows builds are particularly uninterested in being cancelled, leading to unwanted tests on an AWS pool with a tendency to have jobs run for 17 hours, a vastly more expensive AWS pool, and two horribly limited hardware pools.
Blocks: 1303152
e.g. https://treeherder.mozilla.org/#/jobs?repo=try&revision=85caac99b7cce725a87e67141ad1e6cf4fd759a0

Both Windows builds were cancelled prior to calling sendchange, but they didn't actually stop, and so we ran full sets of tests on each.
This is resulting in approximately 10,000 hours per month of wasted work on Windows VMs.
Depends on: 1304075
Component: Tools → General
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX