Closed Bug 773981 Opened 12 years ago Closed 12 years ago

BIND crashes with assertion failure

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: dumitru, Assigned: dumitru)

Details

(Whiteboard: [buildduty][outage])

Dumitru Gherman [:dumitru]

Assignee

Description

•

12 years ago

named on ns2.private.scl3 hit a bug last night:

14-Jul-2012 02:02:25.131 general: critical: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
14-Jul-2012 02:02:25.131 general: critical: exiting (due to assertion failure)
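For anyone wanting to spot this crash in named's log automatically, here is a minimal sketch that pulls the timestamp, source file, and line number out of an assertion line like the one above. The function name `parse_assertion` and the regular expression are illustrative assumptions based only on the two log lines quoted here, not an official description of BIND's log format.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines quoted in this bug (an assumption,
# not a documented BIND format): timestamp, category, severity,
# source file:line, assertion macro name, and the failed expression.
ASSERTION_RE = re.compile(
    r"^(?P<ts>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"general: critical: (?P<file>[\w.]+):(?P<line>\d+): "
    r"(?P<kind>[A-Z]+)\((?P<expr>.*)\) failed$"
)


def parse_assertion(log_line):
    """Return a dict describing an assertion failure, or None if no match."""
    m = ASSERTION_RE.match(log_line.strip())
    if m is None:
        return None
    return {
        "when": datetime.strptime(m.group("ts"), "%d-%b-%Y %H:%M:%S.%f"),
        "file": m.group("file"),
        "line": int(m.group("line")),
        "kind": m.group("kind"),
        "expr": m.group("expr"),
    }
```

Feeding it the first log line yields the file `rbtdb.c` and line `1619`; the "exiting" line does not match and returns `None`.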

bind version is 9.8.2, release 0.10.rc1.el6, OS RHEL 6.3.

First reported on an ISC mailing list:
https://lists.isc.org/pipermail/bind-users/2012-February/086793.html

ISC patched it in bind 9.8.2rc2, per https://lists.isc.org/pipermail/bind-announce/2012-March/000766.html [RT #27738]

Red Hat hasn't shipped an update yet, although it's been 4 months since ISC patched this:

https://bugzilla.redhat.com/show_bug.cgi?id=837165

Adrian J Fernandez [:Aj]

Comment 1

•

12 years ago

named crashed again and was restarted.

Just for clarification, running OS version is RHEL 6.2 (x86_64)

Dumitru Gherman [:dumitru]

Assignee

Comment 2

•

12 years ago

(In reply to Adrian Fernandez [:Aj] from comment #1)

> Just for clarification, running OS version is RHEL 6.2 (x86_64)

Yeah, OS doesn't matter too much, all RHEL 6 flavors that use that named package are affected.

Mike Taylor [:bear]

Comment 3

•

12 years ago

this has happened again in scl1 and has caused some build jobs to fail

Whiteboard: [buildduty][outage]

Dustin J. Mitchell [:dustin] (he/him)

Comment 4

•

12 years ago

Just happened on admin1a.infra.scl1 too:

17-Jul-2012 15:25:46.262 general: critical: rbtdb.c:1619: INSIST(!((void *)((node)->deadlink.prev) != (void *)(-1))) failed
17-Jul-2012 15:25:46.262 general: critical: exiting (due to assertion failure)

Is this worth building a patched RPM?

Mike Taylor [:bear]

Comment 5

•

12 years ago

for buildduty's benefit:

first nagios alerts happened at 1528:

[15:28]  <nagios-releng-scl1> [09] buildbot-master06.build.scl1:MySQL connectivity is WARNING: Unknown MySQL server host buildbot-rw-vip.db.scl3.mozilla.com (1)
[15:30]  <nagios-releng-scl1> [10] buildbot-master15.build.scl1:MySQL connectivity is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.

and continued until 1546:

[15:46]  <nagios-releng-scl1> buildbot-master21.build.scl1 is UP: PING OK - Packet loss = 0%, RTA = 0.64 ms

Adrian J Fernandez [:Aj]

Comment 6

•

12 years ago

named restarted on ns2.private.scl3 (again).

However, besides the known bug, it seems odd that this is only occurring on ns2 and not on ns1 as well.

Dustin J. Mitchell [:dustin] (he/him)

Comment 7

•

12 years ago

What do you think about either adding monitoring for named processes to nagios, or (better) monitoring the named daemon in keepalived so that the VIP fails over when this occurs?  Apologies if I'm not being helpful..
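A rough sketch of what such a check might look like: it sends one DNS query over UDP and reports whether named answered. The names `build_query`, `named_answers`, and `main`, the default server `127.0.0.1`, and the default query name `localhost` are all assumptions for illustration. The exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL), and keepalived's `vrrp_script` likewise treats a non-zero exit as a failure, so the same script could in principle serve both.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = A (1), QCLASS = IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def named_answers(server, name, port=53, timeout=2.0):
    """Return True if `server` sends back any DNS response to our query."""
    query = build_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
    except OSError:
        # Covers timeouts and connection errors: treat as "named is down".
        return False
    # A response has at least a 12-byte header, our ID, and the QR bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


def main(argv):
    """Nagios-style entry point: returns 0 (OK) or 2 (CRITICAL)."""
    server = argv[1] if len(argv) > 1 else "127.0.0.1"
    name = argv[2] if len(argv) > 2 else "localhost"
    ok = named_answers(server, name)
    print("OK: named answered" if ok else "CRITICAL: no answer from named")
    return 0 if ok else 2

# As a script, one would finish with: raise SystemExit(main(sys.argv))
```

Checking for an actual answer rather than just the presence of a named process also catches the case where the daemon is running but not responding.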

Dumitru Gherman [:dumitru]

Assignee

Comment 8

•

12 years ago

I filed a case with Red Hat to address this.

Dumitru Gherman [:dumitru]

Assignee

Comment 9

•

12 years ago

http://rhn.redhat.com/errata/RHBA-2012-1107.html

Assignee: server-ops-infra → dgherman

Dumitru Gherman [:dumitru]

Assignee

Comment 10

•

12 years ago

Seems that puppet upgraded this across our infra.
Verified a couple of hosts and they have the new package.
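For checking more than a couple of hosts, something along these lines could work. It is only a sketch: it assumes passwordless `ssh` access to each host, and the helper names `bind_package` and `hosts_not_on` are made up for illustration. The expected package string has to be supplied by the caller, since the exact fixed version is not recorded in this bug.

```python
import subprocess


def bind_package(host, run=subprocess.run):
    """Return the output of `rpm -q bind` on `host`, or None on failure."""
    result = run(
        ["ssh", host, "rpm", "-q", "bind"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def hosts_not_on(expected, hosts, run=subprocess.run):
    """List hosts whose bind package differs from `expected`.

    Hosts that could not be queried are included as well, so they get
    a manual look rather than being silently treated as upgraded.
    """
    return [h for h in hosts if bind_package(h, run) != expected]
```

The `run` parameter exists only so the ssh call can be swapped out when trying the logic without real hosts.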

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

11 years ago

Component: Server Operations: Infrastructure → Infrastructure: Other

Product: mozilla.org → Infrastructure & Operations
