Closed Bug 810540 Opened 12 years ago Closed 12 years ago

Diagnose ElasticSearch timeouts on Mozillians jenkins builds.

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

x86_64
Windows 7
task
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 803599

People

(Reporter: sancus, Assigned: dmaher)

We're getting ES timeouts on Jenkins builds: https://ci.mozilla.org/job/mozillians/296/console

I've tried reverting the code all the way back to the previous build, which was green, but it still failed: https://ci.mozilla.org/job/mozillians/306/console

Thus, I'm at least fairly sure that we didn't cause this string of failures directly with a code change, and I'd like some help trying to figure out what's going on here so we can get it corrected and get our builds back to green.

Thanks!
I increased the ES timeout in the Jenkins settings and builds returned to green.

Related commit: https://github.com/mozilla/mozillians/commit/2c654dd60d18856c5a660ef3c998853faa1d0564
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
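For illustration only, here is a minimal sketch of how a CI-specific settings file could pick up a longer ES timeout. It is not the content of the commit linked above: the ES_TIMEOUT environment variable and the es_timeout_from_env helper are hypothetical names, and the 5-second default is simply the limit reported in this bug's failing builds.

```python
import os

# Default request timeout in seconds. 5.0 matches the limit reported in
# this bug's failing builds; the variable name is an assumption.
DEFAULT_ES_TIMEOUT = 5.0


def es_timeout_from_env(environ=None, default=DEFAULT_ES_TIMEOUT):
    """Return the ES timeout in seconds, read from ES_TIMEOUT if it is set.

    Falls back to `default` when the variable is missing, empty,
    not a number, or not strictly positive.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("ES_TIMEOUT")  # hypothetical variable name
    if raw is None or raw.strip() == "":
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    return value if value > 0 else default


# In a (hypothetical) Jenkins-only settings module:
ES_TIMEOUT = es_timeout_from_env()
```

A Jenkins job could then export ES_TIMEOUT=30 before running the tests, leaving the default untouched for local development.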
It seems that the problem is back.

E.g. https://ci.mozilla.org/job/mozillians/333/console

"""
...

TimeoutError: Request timed out after 5.000000 seconds
...
"""
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This is due to a general resource problem on the Jenkins box.  Basically, Elasticsearch and Jenkins are constantly competing for RAM and CPU, which results in one or the other failing in unpredictable ways.  The plan is to build a new services node / cluster to support Jenkins (bug 811380).

That said, I'm not sure that there's really a good short-term solution for the behaviour you're currently experiencing. :/
Depends on: 811380
Thanks for the update, Daniel!

As you said, Jenkins / ES are failing randomly, but with some luck and multiple tries we still get to run our tests. I guess this is the short-term "solution" for this problem :)
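The "multiple tries" workaround could in principle be automated rather than done by re-triggering builds by hand. The sketch below is an assumption-laden illustration, not something used in this bug: retry_on_timeout is a hypothetical helper, and Python's built-in TimeoutError stands in for whatever exception the ES client actually raises.

```python
import time


def retry_on_timeout(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on TimeoutError.

    The built-in TimeoutError is a stand-in for the ES client's own
    timeout exception (an assumption). The last failure is re-raised
    once `attempts` calls have all timed out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
```

Retrying only hides the symptom; it does not address the RAM and CPU contention described in the earlier comment.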
As per comment #3 and bug 803599, closing as dupe.
Assignee: server-ops-webops → dmaher
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard