self-serve can get bogged down if masters are slow

Status: NEW
Assignee: Unassigned
Product: Release Engineering
Component: General
Priority: P2
Severity: normal
Opened: 5 years ago
Updated: 7 months ago
Reporter: Gavin
Firefox Tracking Flags: Not tracked

I can't trigger new builds via self-serve. philor points to https://secure.pub.build.mozilla.org/buildapi/self-serve/jobs, which seems to show that it scheduled a job at 2013-01-06 11:38:20, but then failed to schedule one at 2013-01-06 11:40:54 (same for all jobs since).
Assignee: nobody → nthomas
Priority: -- → P1
Looks like this has fixed itself up after taking 30 minutes to cancel a try run. Leaving open for debugging.
Looks like killing off running jobs is what's to blame, e.g.:

2013-01-06 11:41:46,441 cancelling request by gszorc@mozilla.com of 19257479
2013-01-06 11:41:46,444 request is running, going to cancel it!
2013-01-06 11:41:46,446 Cancelling at http://buildbot-master18.build.scl1.mozilla.com:8201/builders/Rev3%20Fedora%2012x64%20try%20talos%20svgr/builds/624/stop
2013-01-06 11:44:24,411 cancelling request by gszorc@mozilla.com of 19257481

2013-01-06 12:00:06,148 cancelling request by gszorc@mozilla.com of 19258445
2013-01-06 12:03:46,088 request is running, going to cancel it!
2013-01-06 12:03:46,096 Cancelling at http://buildbot-master16.build.scl1.mozilla.com:8201/builders/Rev3%20WINNT%206.1%20try%20talos%20svgr/builds/468/stop
2013-01-06 12:07:18,324 cancelling request by gszorc@mozilla.com of 19258446

There were delays on other masters too.
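
To put numbers on the stalls, here is a rough sketch that parses the log lines quoted above and reports every gap longer than a minute between consecutive entries. The helper names (`parse`, `slow_gaps`) and the 60-second threshold are made up for illustration and are not part of buildapi. The two excerpts are kept separate because lines were omitted between them.

```python
from datetime import datetime

# Log lines copied verbatim from the comment above; the blank line marks
# where the original excerpt skips ahead.
LOG = """\
2013-01-06 11:41:46,441 cancelling request by gszorc@mozilla.com of 19257479
2013-01-06 11:41:46,444 request is running, going to cancel it!
2013-01-06 11:41:46,446 Cancelling at http://buildbot-master18.build.scl1.mozilla.com:8201/builders/Rev3%20Fedora%2012x64%20try%20talos%20svgr/builds/624/stop
2013-01-06 11:44:24,411 cancelling request by gszorc@mozilla.com of 19257481

2013-01-06 12:00:06,148 cancelling request by gszorc@mozilla.com of 19258445
2013-01-06 12:03:46,088 request is running, going to cancel it!
2013-01-06 12:03:46,096 Cancelling at http://buildbot-master16.build.scl1.mozilla.com:8201/builders/Rev3%20WINNT%206.1%20try%20talos%20svgr/builds/468/stop
2013-01-06 12:07:18,324 cancelling request by gszorc@mozilla.com of 19258446
"""


def parse(line):
    """Split a log line into (timestamp, message)."""
    # The timestamp "YYYY-MM-DD HH:MM:SS,mmm" is always 23 characters long.
    stamp, message = line[:23], line[24:]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f"), message


def slow_gaps(log, threshold_s=60.0):
    """Return (seconds, preceding_message) for each gap above threshold_s."""
    gaps = []
    # Handle each excerpt on its own so the elided span is not counted.
    for excerpt in log.strip().split("\n\n"):
        entries = [parse(line) for line in excerpt.splitlines()]
        for (t0, msg0), (t1, _) in zip(entries, entries[1:]):
            seconds = (t1 - t0).total_seconds()
            if seconds > threshold_s:
                gaps.append((seconds, msg0))
    return gaps


for seconds, message in slow_gaps(LOG):
    print(f"{seconds:8.3f}s after: {message[:60]}")
```

On this excerpt it reports three gaps of about 158 s, 220 s and 212 s. Two of them follow a "Cancelling at .../stop" line, and one falls between "cancelling request" and "request is running"; every other step takes a few milliseconds.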

catlee, are we processing messages single-threaded, and wouldn't have picked up Gavin's requests until we were done with gszorc's ?
Summary: self-serve is busted → self-serve can get bogged down if masters are slow
Assignee: nthomas → nobody
Severity: critical → normal
Priority: P1 → P2
> catlee, are we processing messages single-threaded, and wouldn't have picked
> up Gavin's requests until we were done with gszorc's ?

Yes, we have one agent, and it processes requests one at a time.
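
For anyone following along, a toy model of what "one agent, one request at a time" means for someone queued behind slow cancels. This is not buildapi code: the function name, the request labels and the durations are all hypothetical (the 210 s figure is only loosely based on the gaps in the log above), and it assumes every request is already queued at time zero.

```python
def serial_wait_times(requests):
    """Given (label, seconds_to_process) pairs in arrival order, return how
    long each one waits before a single sequential agent starts on it."""
    waited = {}
    clock = 0.0
    for label, duration in requests:
        waited[label] = clock   # the agent only gets here after all earlier work
        clock += duration
    return waited


# Hypothetical queue: three cancels against a slow master, then a new build.
queue = [
    ("cancel-1", 210.0),
    ("cancel-2", 210.0),
    ("cancel-3", 210.0),
    ("new-build", 1.0),
]

for label, seconds in serial_wait_times(queue).items():
    print(f"{label}: waits {seconds:.0f}s")
```

In this model the one-second build request sits for 630 seconds before the agent looks at it, which matches the shape of what Gavin saw.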
This is happening again today - cancelling a try revision is being processed very slowly because bm24 is not working properly. How hard would it be to use threads/workers/hamsters to process messages separately?
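
A minimal sketch of the threads/workers idea, using Python's standard `concurrent.futures.ThreadPoolExecutor`. Everything here is an assumption for illustration: `process_all`, `demo_handler` and the request dictionaries are invented, and the sleep stands in for a stop request to a slow master. Whether the real agent's message handling and database access are safe to run from several threads is not something this thread establishes.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def process_all(requests, handler, max_workers=4):
    """Handle each request on a worker thread and return the request ids
    in the order they finished."""
    finished = []
    finished_lock = threading.Lock()

    def run(request):
        handler(request)
        with finished_lock:
            finished.append(request["id"])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run, request) for request in requests]
        for future in futures:
            future.result()  # re-raise any exception from a handler
    return finished


def demo_handler(request):
    """Hypothetical handler: cancels hit a slow master, builds are quick."""
    if request["action"] == "cancel":
        time.sleep(0.5)  # stand-in for a slow master answering a stop request


requests = [
    {"id": "cancel-1", "action": "cancel"},
    {"id": "build-2", "action": "build"},
]
order = process_all(requests, demo_handler)
print(order)
```

Here the build request finishes first even though it was queued second, because the slow cancel only ties up one worker. A fixed pool can still fill up if enough requests hit a broken master at once, so a timeout on the call to the master would probably be needed as well.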
Updated 4 years ago
Product: mozilla.org → Release Engineering
Updated 7 months ago
Component: Tools → General
Product: Release Engineering → Release Engineering