Closed
Bug 772458
Opened 12 years ago
Closed 11 years ago
Try is extremely backed up
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: khuey, Unassigned)
References
Details
(Keywords: sheriffing-P1, Whiteboard: [tryserver][capacity])
Wait times for tests are closing in on 24 hrs. Seems that Android and Windows are more backed up than the other platforms, but that's just anecdotal.
Updated•12 years ago
Severity: normal → major
Comment 1•12 years ago
Do we have any idea of what is happening? Do people push more to try these days than before? Do we simply need more slaves?
It seems to be a critical issue for engineering: pushing to try takes such a ridiculous amount of time that it will either reduce productivity or people will just push to m-i without waiting for full results.
Comment 2•12 years ago
(In reply to Mounir Lamouri (:mounir) from comment #1)
> or people
> will just push to m-i without waiting for full results.
Which has already happened several times this week, with the ensuing layers of bustage made worse by high infra load on non-try trees and the coalescing that brings :-(
Comment 3•12 years ago
(In reply to Mounir Lamouri (:mounir) from comment #1)
> Do we have any idea of what is happening? Do people push more to try these
> days than before? Do we simply need more slaves?
>
> It seems to be a critical issue for engineering: pushing to try takes such a
> ridiculous amount of time that it will either reduce productivity or people
> will just push to m-i without waiting for full results.
One of the issues with test pool capacity is that *all* current tests run on Mac minis of various vintages. This is due to a historical notion that we wanted to be able to compare test results between different platforms/OSes on the same hardware. Apple's hardware rev cycle is aggressive, so we simply can't buy any more of these older-rev minis. The existing pool capacity is static, modulo attrition via hardware failure.
We can create some extra capacity on one platform only at the expense of another, e.g. stopping tests on 10.5 (bug 773120). We no longer think these inter-platform comparisons are meaningful.
Releng is extremely resource-constrained at present for setting up new hardware. We have an effort underway to refresh our test pool to newer non-Mac hardware (for non-Mac OSes), but this is blocked by getting test coverage set up for Win8 (bug 731280) and Mountain Lion (10.8) (bug 731278), platforms where we currently have no coverage at all.
Comment 4•12 years ago
We've just had 38 consecutive pushes (81 changesets) of bustage on inbound, because Try results are taking so long to come back that people are pushing regardless. Are there any other quick wins that we can do here, e.g. disabling platforms/tests on twigs that could do without them, or rebalancing the try vs non-try build pool?
Looking at http://build.mozilla.org/builds/pending/pending.html shows that the Try linux compile pending count is always an order of magnitude higher than the others. Can we spare some more non-try linux builds for Try? The graph for non-try suggests there is capacity going unused that could perhaps be switched over.
Updated•12 years ago
Component: Release Engineering → Release Engineering: Developer Tools
QA Contact: lsblakk
Whiteboard: [tryserver][buildduty][capacity]
Comment 5•12 years ago
bug#750285 and bug#777037 track disabling a bunch of Android builds/unittest/talos jobs, which will help reduce Android load. This is an interim solution while we wait for additional tegras to come online.
Comment 6•12 years ago
If this bug and its dependents were resolved fixed, what would the expected turnaround time for try be? Try used to be a tremendously useful development tool. Now it's pretty much just a pain.
Updated•12 years ago
Whiteboard: [tryserver][buildduty][capacity] → [tryserver][buildduty][capacity][sheriff-want]
Comment 7•12 years ago
We can turn off tests for the UX branch (https://tbpl.mozilla.org/?tree=UX). I've been maintaining the branch for the past N months (doing daily merges between m-c and ux). Devs sending their patches to the UX branch can just run them through the try server first, and in total that will save some build resources, since we have vastly more merges from m-c to ux than we have checkins to ux. It should also be noted that ux is a dead-end branch, which doesn't feed anywhere but is used for functional testing of new ux features.
Updated•12 years ago
QA Contact: lsblakk → hwine
Comment 8•12 years ago
Now that we're running android/b2g/nativefennec builds over on AWS, we're freeing up cycles on our linux32/linux64 machines. bug#784891 tracks converting a bunch of existing linux32/linux64/win32 physical ix builders into win64 builders. This will improve wait times for windows builds in both the production build pool and try build pool.
Depends on: 784891
Comment 9•12 years ago
(In reply to Jared Wein [:jaws] from comment #7)
> We can turn off tests for the UX branch (https://tbpl.mozilla.org/?tree=UX).
Done in bug 779419.
Comment 10•12 years ago
This isn't an acute issue, and thus not a buildduty concern.
Whiteboard: [tryserver][buildduty][capacity][sheriff-want] → [tryserver][capacity][sheriff-want]
Updated•12 years ago
Keywords: sheriffing-P1
Whiteboard: [tryserver][capacity][sheriff-want] → [tryserver][capacity]
Updated•12 years ago
Depends on: toodamnhigh!
Comment 11•11 years ago
How are try turnaround times these days?
Comment 12•11 years ago
:khuey: per dev.tree-management, we've been hitting consistently great wait times on builds and tests across the board, including Try. This is thanks to moving more jobs to AWS, reshuffling existing hardware in-house, and turning off broken builds/tests. Any objections to closing this as FIXED?
Flags: needinfo?(khuey)
Reporter | ||
Comment 13•11 years ago
Yeah I think we're doing pretty well these days. Someone else can file a new bug if they have current issues. Good job folks.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(khuey)
Resolution: --- → FIXED
Assignee | ||
Updated•11 years ago
Product: mozilla.org → Release Engineering
Assignee | ||
Updated•7 years ago
Component: Tools → General