Closed Bug 930945 Opened 12 years ago Closed 11 years ago

Investigate why TBPL's dataimport cron had hung in bug 930383

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: emorley, Unassigned)

(In reply to Shyam Mani [:fox2mike] from comment #4)
> And tbpl's import crons were hosed, had to kill the existing ones and fire
> off new ones on both dev and prod.

(In reply to Ed Morley [:edmorley UTC+1] from comment #6)
> I'm not sure why bzapi being down meant the TBPL crons got stuck. The
> backend bzapi calls (as opposed to the UI tooltip meta calls) are only done
> after data import, by
> https://hg.mozilla.org/webtools/tbpl/file/16e6a0cc29c0/php/inc/AnnotatedSummaryGenerator.php#l200 ,
> via
> https://hg.mozilla.org/webtools/tbpl/file/16e6a0cc29c0/php/getLogExcerpt.php .
> These are done by workers spawned from
> https://hg.mozilla.org/webtools/tbpl/file/16e6a0cc29c0/dataimport/import-buildbot-data.py#l363
>
> However after 60s we should hit the timeout at
> https://hg.mozilla.org/webtools/tbpl/file/16e6a0cc29c0/php/inc/ParallelLogGenerating.php#l37
> aborting that getLogExcerpt.php call, so shouldn't end up with a backlog of
> workers. (And even if we did, the new jobs are inserted before spawning the
> workers, so we'd at least see the new jobs on TBPL, unless we starved the
> webhead of resources).

Shyam, do you have any more debug info as to the state of the hung processes before they were killed?
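
For illustration, here is a minimal Python sketch of the hard-deadline behaviour described above. It is not taken from import-buildbot-data.py or ParallelLogGenerating.php; `run_worker`, its arguments, and the 60-second default are hypothetical names chosen for the example. It only shows one way a parent process can make sure a spawned worker is killed once its deadline passes, so stuck workers cannot pile up.

```python
import subprocess
import sys


def run_worker(cmd, timeout_s=60):
    """Run one worker command (hypothetical helper, not TBPL code).

    Returns the worker's stdout as text, or None if the worker did not
    finish within timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before raising,
        # so no stuck worker is left behind.
        return None
    return result.stdout
```

Calling `run_worker([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5)` returns `None` after roughly half a second instead of blocking for the full five seconds.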
(In reply to Ed Morley [:edmorley UTC+1] from comment #0)
> Shyam, do you have any more debug info as to the state of the hung processes
> before they were killed?
Flags: needinfo?(shyam)
(or the resource load on that box etc)
(In reply to Ed Morley [:edmorley UTC+1] from comment #1)
> (In reply to Ed Morley [:edmorley UTC+1] from comment #0)
> > Shyam, do you have any more debug info as to the state of the hung processes
> > before they were killed?

They were waiting on a read and were stuck for a while. I don't think the box was constrained in any way.
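
A process "waiting on a read" for a long time is what a blocking read with no timeout looks like when the peer stops responding. The sketch below is a hedged illustration only. It assumes the read was a network socket read, which this bug never establishes (it could have been the DB, bzapi, or something else). `read_with_timeout` is a hypothetical helper, not TBPL code.

```python
import socket


def read_with_timeout(host, port, timeout_s=60, nbytes=4096):
    """Read up to nbytes from host:port (hypothetical helper).

    Returns the bytes received, or None if the peer sends nothing
    within timeout_s seconds.
    """
    # The timeout passed to create_connection() also applies to later
    # recv() calls on the returned socket.
    with socket.create_connection((host, port), timeout=timeout_s) as conn:
        try:
            return conn.recv(nbytes)
        except socket.timeout:
            # Without a timeout, recv() would block here indefinitely.
            return None
```

With a timeout set, a silent peer causes the call to return after `timeout_s` seconds rather than leaving the process hung until someone kills it.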
Flags: needinfo?(shyam)
(In reply to Shyam Mani [:fox2mike] from comment #3)
> They were waiting on a read and were stuck for a while. I don't think the
> box was constrained in any way.

Of the DB, or ...?
Assignee: nobody → server-ops-webops
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard