Java is choking on leap second.

RESOLVED FIXED in Unreviewed

Status

Mozilla Metrics
Metrics Operations
--
blocker
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: ericz, Assigned: ericz)

Tracking

unspecified
Unreviewed

Details

(Assignee)

Description

5 years ago
Java doesn't appear to be working on servers running Java apps such as Hadoop and ElasticSearch.  We believe this is related to the leap second happening tonight, because the problem started at midnight GMT.
(Assignee)

Comment 1

5 years ago
Elevating to blocker.  I believe we need to restart Java everywhere, and possibly reboot servers, but I need some feedback from Hadoop owners, etc.
Severity: critical → blocker
opening this bug up
Group: metrics-private

Comment 3

5 years ago
Still needs to be confirmed, but I was able to fix one of the issues on an elasticsearch server that I have installed by manually setting the date with the date command (see "date --help"). There was a service restart involved, but no reboots.
See Also: → bug 769973, bug 769971
We are updating kernels and rebooting HBase clusters right now.

Comment 5

5 years ago
For those machines that shouldn't be rebooted:

/etc/init.d/ntp stop; date; date `date +"%m%d%H%M%C%y.%S"`; date;

Then restart affected Java applications.
The command above stops ntpd, manually sets the date to the current date, and then confirms it.
You may or may not get the bug back after restarting ntpd.
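
As a rough sketch only, the one-liner above could be wrapped in a small script with a dry-run mode so it can be checked before touching the clock. The names DRY_RUN, leap_fix_stamp and apply_leap_fix are made up for illustration and are not part of the fix being deployed. The init script path is the one quoted above and may differ between distributions.

```shell
#!/bin/sh
# Sketch of the workaround in this comment; all names here are hypothetical.
# DRY_RUN=1 (the default) only prints what would be run.
DRY_RUN="${DRY_RUN:-1}"

# Print the current time in the MMDDhhmmCCYY.ss form that `date` accepts
# when setting the clock (same format string as the one-liner above).
leap_fix_stamp() {
    date +"%m%d%H%M%C%y.%S"
}

# Stop ntpd, set the clock to the time it already shows, then print the date.
# Setting the clock needs root, so the real branch only runs when DRY_RUN=0.
apply_leap_fix() {
    stamp="$(leap_fix_stamp)"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: /etc/init.d/ntp stop"
        echo "would run: date $stamp"
    else
        /etc/init.d/ntp stop
        date "$stamp"
    fi
    date
}
```

Calling apply_leap_fix with the default DRY_RUN=1 prints the two commands without running them. Running with DRY_RUN=0 as root performs the same steps as the one-liner, after which the affected Java applications would still need to be restarted as described above.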
(Assignee)

Updated

5 years ago
Assignee: nobody → eziegenhorn
(In reply to Ricardo Pardini from comment #5)
> For those machines that shouldn't be rebooted:
> 
> /etc/init.d/ntp stop; date; date `date +"%m%d%H%M%C%y.%S"`; date;

We are injecting this fix into all systems through our base Puppet module as we speak.

Comment 7

5 years ago
For what it's worth, we've stabilized our Java apps across our servers simply via:

date; date `date +"%m%d%H%M%C%y.%S"`; date;

The CPU of the JVMs drops instantly when that is run. There was no need to stop/restart ntpd nor the JVMs themselves.
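
For anyone wanting to spot affected JVMs before and after running that command, here is a minimal sketch, not something used in this bug. The function name filter_hot_java and the threshold of 90 are assumptions for illustration. It filters "pid pcpu comm" lines such as those printed by ps -eo pid,pcpu,comm. Note that on Linux, ps reports CPU averaged over the lifetime of the process, so top is likely a better way to watch the instant drop described above.

```shell
#!/bin/sh
# Hypothetical helper: read "pid pcpu comm" lines on stdin and print the
# pid and CPU percentage of java processes above the given threshold.
filter_hot_java() {
    awk -v t="$1" '$3 == "java" && $2 + 0 > t + 0 { print $1, $2 }'
}
```

A possible invocation would be ps -eo pid,pcpu,comm | filter_hot_java 90, which lists any java process whose reported CPU usage is above 90%.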

Comment 8

5 years ago
(In reply to Mina Naguib from comment #7)

> The CPU of the JVMs drops instantly when that is run. There was no need to
> stop/restart ntpd nor the JVMs themselves.

I've got mixed results: some machines go back to 100% CPU when ntpd is restarted, some don't. Out of uncertainty, I'm keeping ntpd stopped for now. I will bring some back online later and report.

Comment 9

5 years ago
> I'm keeping ntpd stopped for now.
> I will bring some back online later and report.

I've brought ntpd back online now on all my servers, and it seems stable.
Restarting it definitely caused the CPU issue to reappear earlier, but it no longer does.
(Assignee)

Comment 10

5 years ago
Socorro and most Hadoop stuff are back up.  Everything else Hadoop-related can wait until Monday.
For reference, this is the fix that is getting pushed out:  http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED