Closed
Bug 833883
Opened 11 years ago
Closed 11 years ago
svn1.dmz.phx1 issues
Categories
(Developer Services :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dumitru, Assigned: bkero)
References
Details
After svn1 crashed (bug 831642), we received two reports that svn commits sometimes fail (see bug 833830 and bug 833001). Looking at svn1:

[root@svn1.dmz.phx1 ~]# ps aux | grep svnserve
verbatim 658 0.0 0.0 93756 4652 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 4040 0.0 0.1 93840 6892 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 4428 0.0 0.1 93828 6724 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
553 5630 0.0 0.0 189592 5416 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 6776 0.0 0.1 93788 8712 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 7605 0.0 0.1 93764 6688 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
550 7679 0.0 0.1 189992 6068 ? Ss 10:19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 9995 0.0 0.1 93600 6680 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 10104 0.0 0.0 92800 5612 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 11226 0.0 0.0 92832 3380 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
root 12291 0.0 0.0 103248 856 pts/0 S+ 10:30 0:00 grep svnserve
verbatim 13688 0.0 0.1 93828 6956 ? Ss 02:48 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 15035 0.0 0.1 93788 6904 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
1127 16985 0.0 0.0 189936 5768 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 19097 0.0 0.1 93832 6860 ? Ss 02:59 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 19197 0.0 0.0 92564 5384 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 20378 0.0 0.1 93620 6824 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 20930 0.0 0.1 93648 6772 ? Ss Jan20 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 22306 0.0 0.1 93788 6720 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
1727 23461 0.0 0.0 92604 5528 ? Ss Jan20 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
1727 24045 0.0 0.0 92632 5632 ? Ss Jan20 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 24326 0.0 0.0 92476 5356 ? Ss Jan20 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 24993 0.0 0.1 93848 6828 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 25541 0.0 0.1 93808 6920 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 25571 0.0 0.1 94044 6956 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 26041 0.0 0.1 93504 8448 ? Ss Jan20 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
553 26686 0.0 0.1 190292 6192 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 30397 0.0 0.1 93508 6540 ? Ss Jan21 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 31393 0.0 0.1 93756 6832 ? Ss Jan19 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
verbatim 32037 0.0 0.1 93744 6908 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log
2139 32685 0.0 0.0 92324 5180 ? Ss Jan22 0:00 svnserve -t -r /repo/svn/mozilla --log-file /var/log/svn.log

Running lsof on their PIDs shows they all have this in common:

svnserve 22306 verbatim 15u REG 0,19 51657728 18693 /repo/svn/mozilla/db/rep-cache.db (10.8.74.10:/vol/svn)

I've confirmed that svn1 is the only one causing the issues (I drained svn2 and svn3 in Zeus and asked reed to commit; it hung). Removing svn1 from the pool fixed this.
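To see at a glance how long the lingering processes have been around, the ps listing can be grouped by start time. This is only a rough sketch: the function name count_svnserve_by_start is hypothetical, and it assumes the usual `ps aux` column layout (START in column 9, command name in column 11) with the command shown as a bare `svnserve`, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: count svnserve processes per START value.
# Reads `ps aux`-style lines on stdin.
# Assumes column 9 is START and column 11 is the command name.
count_svnserve_by_start() {
    # Matching on the command column skips the "grep svnserve" line itself.
    awk '$11 == "svnserve" { n[$9]++ } END { for (s in n) print s, n[s] }' | sort
}
```

It would be used as `ps aux | count_svnserve_by_start`, which prints one line per start value followed by the number of svnserve processes started then.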
Comment 1 (Assignee) • 11 years ago
This was a very hairy yak. I traced one of the svnserve processes, which yielded:

[root@svn1.dmz.phx1 ~]# strace -f -p 32037
Process 32037 attached - interrupt to quit
read(8,

From here I saw it was stuck on file descriptor 8, which happened to be:

[root@svn1.dmz.phx1 ~]# lsof -p 32037 | grep 8r
svnserve 32037 verbatim 8r FIFO 0,8 0t0 99760462 pipe

Stuck in a pipe. Pipe # 99760462, no less. Sniffing through the /proc tree to find out what else was using that pipe, I came across:

[root@svn1.dmz.phx1 fd]# (find /proc -type l | xargs ls -l | fgrep 'pipe:[99760462]') 2>/dev/null
lr-x------ 1 verbatim users 64 Jan 23 10:03 /proc/32037/fd/8 -> pipe:[99760462]
lr-x------ 1 verbatim users 64 Jan 23 11:09 /proc/32037/task/32037/fd/8 -> pipe:[99760462]
l-wx------ 1 verbatim users 64 Jan 23 10:50 /proc/32046/fd/2 -> pipe:[99760462]
l-wx------ 1 verbatim users 64 Jan 23 11:09 /proc/32046/task/32046/fd/2 -> pipe:[99760462]
l-wx------ 1 verbatim users 64 Jan 23 10:50 /proc/32058/fd/2 -> pipe:[99760462]
l-wx------ 1 verbatim users 64 Jan 23 11:09 /proc/32058/task/32058/fd/2 -> pipe:[99760462]

Snooping on a few of these procs, I found:

[root@svn1.dmz.phx1 fd]# strace -f -p 32058
Process 32058 attached - interrupt to quit
connect(4, {sa_family=AF_FILE, path="/var/run/abrt/abrt.socket"}, 27

It was hanging on abrt.socket. So I kicked the abrtd service and everything unfucked itself.

[root@svn1.dmz.phx1 abrt]# /etc/init.d/abrtd restart
Stopping abrt daemon: [ OK ]
Starting abrt daemon: [ OK ]
[root@svn1.dmz.phx1 abrt]# ps aux|grep svn
root 7749 0.0 0.0 103244 840 pts/2 S+ 11:17 0:00 grep svn
[root@svn1.dmz.phx1 abrt]#
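The /proc search for the other ends of the pipe can be narrowed to the per-process fd directories, which avoids the duplicate task/ entries. A minimal sketch, assuming a Linux /proc layout; the function name pipe_users and its optional second argument (an alternate proc root, handy for testing) are hypothetical and not part of the original investigation.

```shell
#!/bin/sh
# Hypothetical helper: list fd symlinks that point at a given pipe inode.
# Usage: pipe_users INODE [PROC_ROOT]   (PROC_ROOT defaults to /proc)
pipe_users() {
    pu_inode=$1
    pu_root=${2:-/proc}
    # Only scan <pid>/fd/*; the task/<tid>/fd entries repeat the same fds.
    for pu_link in "$pu_root"/[0-9]*/fd/*; do
        # readlink fails for fds that vanished or are unreadable; skip those.
        if [ "$(readlink "$pu_link" 2>/dev/null)" = "pipe:[$pu_inode]" ]; then
            printf '%s\n' "$pu_link"
        fi
    done
}
```

Run as root, `pipe_users 99760462` would be the equivalent of the find pipeline for the pipe traced here. The PIDs in the resulting paths are the ones worth attaching strace to next.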
Assignee: server-ops-devservices → bkero
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
Component: Server Operations: Developer Services → General
Product: mozilla.org → Developer Services