Closed
Bug 677004
Opened 13 years ago
Closed 11 years ago
usebuildbot=1 has too much lag between when a job finishes and when it is displayed
Categories
(Tree Management Graveyard :: TBPL, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: philor, Unassigned)
References
Details
(Keywords: regression, sheriffing-untriaged)
Compared with tinderbox-based tbpl when tinderbox is four hours behind on reading mail, usebuildbot=1 looks good, but head to head, when tinderbox isn't lagged, it looks terrible.
The job I was just watching actually finished at 19:56, and tinderbox claims it finished at 19:57. It was visible on tinderbox-based tbpl by my 19:59 refresh (I starred it at 20:00), but usebuildbot didn't show it until my 20:05 refresh (though it was quick to remove the running grey letter, making it look like the job just disappeared for 8 minutes).
We have to dump tinderbox, so we have to do what we can do, but zomg, if you proposed doing something which would do what this essentially does (add 5-10 minutes onto the time it takes to run every single build and test job), it had better come with Shetland unicorn rides for everyone to make up for it.
Comment 1•13 years ago
There are a few steps in getting data from buildbot to users:
1. Insert finished jobs into the status db
A cron job on each buildbot master looks for newly finished builds and inserts them into the db. It runs every 10 minutes and mostly takes 30s to insert (sometimes 90s).
2. Recreate builds-4hr.js.gz
A cron job on cruncher. It runs every minute and takes 10-30s to generate the file.
3. Copy builds-4hr.js.gz to build.m.o
A cron job on cruncher. It runs every minute and is very quick to transfer a few hundred KB.
4. Expires header
The Apache config on build.m.o sets an Expires header of 'access plus 1 minute' when the gz file is requested. This is a non-issue if the tbpl refresh interval is that or longer.
So that's roughly 15 minutes worst case.
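As a rough check on that figure, here is a minimal sketch that adds up the worst case for each step, using only the numbers quoted in this comment. The stage labels and the function name are illustrative, and the transfer time in step 3 is assumed to be negligible.

```python
# Worst-case delay contributed by each stage, in seconds.
# Figures come from the steps listed above; labels are illustrative only.
STAGES = [
    ("status db cron interval", 10 * 60),
    ("status db insert (slow case)", 90),
    ("builds-4hr.js.gz cron interval", 60),
    ("builds-4hr.js.gz generation (slow case)", 30),
    ("copy-to-build.m.o cron interval", 60),
    ("Expires header lifetime", 60),
]


def worst_case_seconds(stages):
    """Sum the worst-case delay across all stages."""
    return sum(seconds for _, seconds in stages)


total = worst_case_seconds(STAGES)
print(f"{total / 60:.1f} minutes")  # prints "15.0 minutes"
```

The 10-minute cron interval on the masters accounts for two thirds of the total, which is why it comes first among the improvements.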
Possible improvements
* run the cron job on the masters more frequently; bug 681834 is filed to look into this
* refactor step 1 so that builds are individually added to the db on completion. Bug 662885 will provide a backend to do this
* push the file over after step 2, instead of waiting for the pull to happen
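To illustrate why the polling interval matters, here is a small sketch under a simplifying assumption that is not from this bug: a job finishes at a uniformly random moment between two cron runs. The mean wait is then half the interval and the worst case is the full interval. The function name is hypothetical.

```python
def polling_wait_minutes(interval_minutes):
    """Return (mean, worst) wait before a periodic poll notices a finished job.

    Assumes the job finishes at a uniformly random point between two polls.
    """
    return interval_minutes / 2, interval_minutes


for interval in (10, 5, 1):
    mean, worst = polling_wait_minutes(interval)
    print(f"poll every {interval} min: mean wait {mean:.1f} min, worst {worst} min")
```

Adding builds to the db on completion, as in the second bullet, removes this polling wait for step 1 entirely.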
Reporter
Updated•13 years ago
Keywords: regression
Reporter
Comment 2•13 years ago
* move steps 2 and 3 off cruncher, which maybe isn't a good place for tier 1 jobs to run
My lag last night was around 50 minutes, I think because of a helping hand from bug 714406: I had jobs finishing while builds-2011-12-31.js.gz was being created and the load on cruncher was alerting at 10 to 12.
Updated•12 years ago
Whiteboard: [sheriff-want]
Comment 3•12 years ago
Here are the current steps when a job finishes:
* immediately on job finish we start uploading the log, insert the job into statusdb, and send the pulse message. This takes a few seconds.
* we generate builds-running.js, builds-pending.js, and builds-4hr.js every minute, taking a few seconds. There's no rsync to move the file any more.
Are you seeing delays longer than a minute or two these days? How often does tbpl import data?
Comment 4•12 years ago
(In reply to Nick Thomas [:nthomas] from comment #3)
> Are you seeing delays longer than a minute or two these days ? How often
> does tbpl import data ?
I believe the tbpl cron job for running import-buildbot-data.py is set to run every 5 mins. TBPL's client side then refreshes every 2 mins.
The situation is a bit better than it was, but there still seems to be a bit of a lag at times (though it may just be when you hit the worst case of 60-90 secs for the steps in comment 3 + 5 min tbpl cron + 2 min tbpl client side; total 8-9 mins).
I'll try to keep an eye out to see what the delays are in reality.
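The worst-case estimate above can be sketched as a simple sum of the waits at each stage. The function and parameter names are illustrative, and the defaults are the 5 min cron and 2 min client refresh mentioned in this comment.

```python
def worst_case_minutes(buildbot_secs, cron_mins=5.0, client_refresh_mins=2.0):
    """Sum the worst-case wait at each stage, in minutes."""
    return buildbot_secs / 60 + cron_mins + client_refresh_mins


low = worst_case_minutes(60)   # buildbot steps take 60 secs
high = worst_case_minutes(90)  # buildbot steps take 90 secs
print(f"{low:.1f} to {high:.1f} minutes")  # prints "8.0 to 8.5 minutes"
```

With these inputs the tbpl cron interval is the largest single term, so it is the first candidate for shortening.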
Comment 5•12 years ago
Maybe we should be looking at shortening the tbpl cron interval, or thinking about pushing data from buildbot directly into tbpl via an API.
Updated•12 years ago
Keywords: sheriffing-untriaged
Whiteboard: [sheriff-want]
Reporter
Comment 6•11 years ago
Or just get used to the way things are.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Assignee
Updated•10 years ago
Product: Webtools → Tree Management
Assignee
Updated•10 years ago
Product: Tree Management → Tree Management Graveyard