Closed Bug 677004 Opened 13 years ago Closed 11 years ago

usebuildbot=1 has too much lag between when a job finishes and when it is displayed

Tracking

(Not tracked)

Status:

RESOLVED WORKSFORME

People

(Reporter: philor, Unassigned)

References

Details

(Keywords: regression, sheriffing-untriaged)

Phil Ringnalda (:philor)

Reporter

Description

•

13 years ago

By comparison with tinderbox-based when tinderbox is four hours lagged reading mail, usebuildbot=1 looks good, but head to head, when tinderbox isn't lagged, it looks terrible. The job I was just watching actually finished at 19:56, tinderbox claims it finished at 19:57, I starred it at 20:00 so it was visible on tinderbox-based tbpl by my 19:59 refresh, but usebuildbot didn't show it until my 20:05 refresh (though it was quick to remove the running grey-letter, making it look like the job just disappeared for 8 minutes). We have to dump tinderbox, so we have to do what we can do, but zomg, if you proposed doing something which would do what this essentially does, add 5-10 minutes onto the time it takes to run every single build and test job, it better come with Shetland unicorn rides for everyone to make up for it.

Nick Thomas [:nthomas] (UTC+12)

Updated

•

13 years ago

Depends on: 681834

Nick Thomas [:nthomas] (UTC+12)

Comment 1

•

13 years ago

There's a few steps in getting data from buildbot to users:

1, insert finished jobs into status db
cron job on each buildbot master that looks for newly finished builds, and inserts them in the db. Runs every 10 minutes, mostly takes 30s to insert (sometimes 90)

2, recreate builds-4hr.js.gz
cron job on cruncher. Runs every minute, 10-30s to generate

3, copy builds-4hr.js.gz to build.m.o
cron job on cruncher. Runs every minute, very quick to transfer a few hundred KB

4, expiry header
The Apache config on build.m.o sets an Expiry header of 'access plus 1 minute' when the gz file is requested. Non-issue if the tbpl refresh interval is that or longer.

So that's roughly 15 minutes worst case.

Possible improvements
* running the cron job on the masters more frequently, bug 681834 to look into this
* refactor 1 so that builds are individually added to the db on completion. Bug 662885 will provide a backend to do this
* push the file over after 2, instead of waiting for the pull to happen
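The "roughly 15 minutes worst case" figure can be reproduced with a little arithmetic. The sketch below is illustrative only: the `Stage` class and `worst_case_lag_s` function are hypothetical names, not part of buildbot or tbpl. It assumes the worst case for each periodic stage is a full interval of waiting plus the stage's longest run time, and it treats the "very quick" copy and the expiry header as taking 0 seconds to apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    interval_s: int  # how often the stage runs, or how long a cached copy lives
    duration_s: int  # longest time the stage takes once it starts


def worst_case_lag_s(stages):
    """Sum, per stage, a full interval of waiting plus the longest run time."""
    return sum(s.interval_s + s.duration_s for s in stages)


# Figures taken from comment 1; the 0s durations are assumptions.
COMMENT_1_STAGES = [
    Stage("insert finished jobs into status db", 600, 90),
    Stage("recreate builds-4hr.js.gz", 60, 30),
    Stage("copy builds-4hr.js.gz to build.m.o", 60, 0),
    Stage("expiry header on build.m.o", 60, 0),
]

if __name__ == "__main__":
    total = worst_case_lag_s(COMMENT_1_STAGES)
    print(f"worst case: {total} s ({total / 60:.1f} min)")
```

With these inputs the 10-minute cron job on the masters accounts for 690 of the 900 seconds, which is why the first two proposed improvements target step 1. The tbpl client refresh interval is not included in this total.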

Phil Ringnalda (:philor)

Reporter

Updated

•

13 years ago

Keywords: regression

Phil Ringnalda (:philor)

Reporter

Comment 2

•

13 years ago

* move 2 and 3 off cruncher, which maybe isn't a good place for tier 1 jobs to run

I think because of a helping hand from bug 714406, my lag last night (while I had jobs finishing during the time when builds-2011-12-31.js.gz was being created and the load on cruncher was being alerted as 10 to 12) was around 50 minutes.

Ed Morley [:emorley]

Updated

•

12 years ago

Whiteboard: [sheriff-want]

Nick Thomas [:nthomas] (UTC+12)

Comment 3

•

12 years ago

Here are the current steps when a job finishes:
* immediately on job finish we start uploading the log, insert the job into statusdb, and send the pulse message. This takes a few seconds
* we generate build-running.js, builds-pending.js, and builds-4hr.js every minute, taking a few seconds. There's no rsync to move the file any more

Are you seeing delays longer than a minute or two these days? How often does tbpl import data?

Ed Morley [:emorley]

Comment 4

•

12 years ago

(In reply to Nick Thomas [:nthomas] from comment #3)
> Are you seeing delays longer than a minute or two these days ? How often
> does tbpl import data ?

I believe the tbpl cron job for running import-buildbot-data.py is set to 5 mins. TBPL's client side then refreshes every 2 mins.

The situation is a bit better than it was, but there still seems to be a bit of a lag at times (though it may just be when you hit the worst case of 60-90 secs for the steps in comment 3 + 5 min tbpl cron + 2 min tbpl client side; total 8-9 mins). I'll try to keep an eye out to see what the delays are in reality.
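As a rough check on the 8-9 minute total, the sketch below adds up the figures quoted in this comment. The function name `lag_bounds_s` is hypothetical. The "typical" value rests on an assumption that is not stated in the bug: that job finish times are spread evenly relative to each poll, so the average wait at a poller is half its interval. The run time of import-buildbot-data.py itself is ignored.

```python
def lag_bounds_s(upstream_s, poll_intervals_s):
    """Return (typical, worst) lag in seconds for a chain of pollers.

    worst:   a full interval of waiting at every poller.
    typical: half an interval at every poller (assumes finish times are
             uniformly spread relative to each poll).
    """
    total_polling = sum(poll_intervals_s)
    return upstream_s + total_polling / 2, upstream_s + total_polling


TBPL_CRON_S = 5 * 60            # import-buildbot-data.py cron, per comment 4
TBPL_CLIENT_REFRESH_S = 2 * 60  # tbpl client-side refresh, per comment 4

if __name__ == "__main__":
    # 60-90 s is the quoted worst case for the buildbot-side steps in comment 3.
    for upstream_s in (60, 90):
        typical, worst = lag_bounds_s(
            upstream_s, [TBPL_CRON_S, TBPL_CLIENT_REFRESH_S]
        )
        print(f"upstream {upstream_s} s: typical {typical / 60:.1f} min, "
              f"worst {worst / 60:.1f} min")
```

Under these assumptions the two tbpl polling steps contribute 7 of the worst-case minutes, so shortening the tbpl cron interval would have the largest effect on the total.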

Nick Thomas [:nthomas] (UTC+12)

Comment 5

•

12 years ago

Maybe we should be looking at shortening up the tbpl cron, or thinking about pushing data from buildbot directly into tbpl via API.

Ed Morley [:emorley]

Updated

•

12 years ago

Keywords: sheriffing-untriaged

Whiteboard: [sheriff-want]

Phil Ringnalda (:philor)

Reporter

Comment 6

•

11 years ago

Or just get used to the way things are.

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → WORKSFORME

Nobody; OK to take it and work on it

Assignee

Updated

•

10 years ago

Product: Webtools → Tree Management

Nobody; OK to take it and work on it

Assignee

Updated

•

10 years ago

Product: Tree Management → Tree Management Graveyard

Categories

(Tree Management Graveyard :: TBPL, defect)
