At some point in the past two months, our (SUMO) test suite switched to using a separate ES cluster. This improved our build times a bunch, and our test suite was reliably taking around 5-10 minutes. In the past month, something happened and now our test times are all over the place (depending on load? or something else?). I've seen them take up to about an hour. The last 5 runs were 42, 24, 16, 23, and 22 minutes. :groovecoder told me that they are seeing the same thing for mdn. Can we figure out what happened and fix this? It really affects our dev process since we do continuous deployment.
You can see all our build times here: https://ci.mozilla.org/job/sumo-master/buildTimeTrend
For a better graph, I grabbed the last 800 builds from the Jenkins API and graphed them here: http://bl.ocks.org/mythmon/6072959. That shows each build (hover over a point to get the date), along with a 30-build moving average, which shows that average build times have been going up lately.
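For reference, the kind of smoothing described above can be sketched like this. The `curl` URL pattern and JSON field names in the comments follow the standard Jenkins remote API and are my assumptions, not something taken from this bug:

```python
# Jenkins exposes build data via its JSON API (URL pattern assumed here, based
# on the standard Jenkins remote API; durations come back in milliseconds):
#   curl 'https://ci.mozilla.org/job/sumo-master/api/json?tree=builds[number,duration]'
# which returns something like {"builds": [{"number": 812, "duration": 2520000}, ...]}

def moving_average(durations, window=30):
    """Return the trailing moving average of build durations.

    Entries before a full window is available use the average of
    however many builds have been seen, so the output has the same
    length as the input.
    """
    averages = []
    total = 0.0
    for i, d in enumerate(durations):
        total += d
        if i >= window:
            total -= durations[i - window]  # drop the build that fell out of the window
        averages.append(total / min(i + 1, window))
    return averages

# Example using the five durations (in minutes) from comment 0:
print(moving_average([42, 24, 16, 23, 22], window=3))
```

A trailing average like this is enough to make the upward trend visible even when individual build times bounce around.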
More graphs: http://bl.ocks.org/mythmon/raw/6073408/ This shows 5 projects, and the trend of increasing build times is evident in all of them except socorro-crashstats, which does not (I am told) use a database. So my theory is that the database is what is causing the slowness here.
Assignee: server-ops-webops → bburton
Component: Server Operations: Web Operations → WebOps: Socorro
Product: mozilla.org → Infrastructure & Operations
We're seeing odd slowness from other jobs, such as socorro-*, as well. I am going to restart Jenkins at 9PM PDT tonight to see if that helps.
Status: NEW → ASSIGNED
Jenkins and data services restarted at 9:01PM. Jenkins happy at 9:40PM. Let's see how your builds go over the next couple of days.
2 builds on mdn were 7m then 12m; 8 builds on mdn-github were about the same.
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard