Closed Bug 833590 Opened 13 years ago Closed 13 years ago

Need automated workaround to socket hang issue

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: hwine, Assigned: hwine)

References

Details

Attachments

(1 file, 1 obsolete file)

attempt hup of hg in socket wait (obsolete): patch, 2.93 KB, hwine, 13 years ago; flags: nthomas feedback+
automatically try to fix a hung socket: patch, 5.49 KB, hwine, 13 years ago; flags: nthomas review+, hwine checked-in+

hwine

Assignee

Description

13 years ago

Our current alerting script should be modified to auto-apply the workaround (HUP the hg process) and complain noisily if it cannot. Ideally, it would also update bug 829025 automatically with the details of the hang, but that may remain manual for a while.
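A minimal sketch of the "HUP, and complain noisily on failure" behaviour requested above. The function name `hup_process` and the message wording are assumptions for illustration; they are not taken from the attached patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the code from the attachment.

# Send SIGHUP to a single pid.
# Returns 0 if the signal was delivered, 1 otherwise.
hup_process() {
    local pid="$1"
    if kill -HUP "$pid" 2>/dev/null; then
        echo "sent HUP to pid $pid"
        return 0
    fi
    # Complain on stderr so the failure shows up in whatever
    # collects the script's output (for example, cron mail).
    echo "ERROR: could not HUP pid $pid, manual intervention needed" >&2
    return 1
}
```

A caller would pass the pid of the hung hg process and escalate (for example, by email) when the function returns non-zero.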

hwine

Assignee

Comment 1

13 years ago

Attached patch: attempt hup of hg in socket wait (obsolete)

The only easy way to test this is to run it live in production, so I want a sanity check first. The script would be started from cron with the --fix option. It should attempt one HUP of the hg process stuck in socket wait (existing messages have shown solid detection of exactly that case), then report again. Ideally, I'll get to run it manually on a live hang before deploying via cron.
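The "attempt one HUP, then report again" flow described above could look roughly like the sketch below. It assumes a detection function named `find_hung_pids` (a hypothetical stand-in for the script's existing detection, printing one pid per line) and a hypothetical `RECHECK_DELAY` setting; neither name comes from the patch.

```shell
#!/usr/bin/env bash
# Sketch only. find_hung_pids and RECHECK_DELAY are hypothetical names.

# check_and_fix <true|false>
# Returns 0 if nothing is hung (or the hang cleared), 1 if a hang remains.
check_and_fix() {
    local attempt_fix="$1" pids p
    pids=$(find_hung_pids)
    if [ -z "$pids" ]; then
        return 0                      # nothing in socket wait
    fi
    if [ "$attempt_fix" = true ]; then
        for p in $pids; do
            # Exactly one HUP per hung process, no retry loop.
            kill -HUP "$p" 2>/dev/null || echo "could not HUP pid $p" >&2
        done
        sleep "${RECHECK_DELAY:-5}"   # give the process time to react
        pids=$(find_hung_pids)        # report again after the attempt
        if [ -z "$pids" ]; then
            echo "socket hang cleared after HUP"
            return 0
        fi
    fi
    echo "socket hang persists on pid(s):" $pids >&2
    return 1
}
```

Limiting the fix to a single attempt keeps a cron-driven run from repeatedly signalling a process that does not respond; the second detection pass decides whether a human still needs to be notified.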

Assignee: nobody → hwine

Status: NEW → ASSIGNED

Attachment #705130 - Flags: feedback?(nthomas)

Nick Thomas [:nthomas] (UTC+12)

Comment 2

13 years ago

Comment on attachment 705130: attempt hup of hg in socket wait

Re-entrant bash doesn't make me nervous, no, not at all.

>diff --git a/check_process_delay b/check_process_delay
>+email_subject="[vcs2vcs] process delays"

You could use this at the end of the on_exit() function that follows, yes?

>+ log "socket hang on pid $p"

Nit, trailing whitespace.

>+# process command line args
>+attempt_fix=false
>+while test $# -gt 0; do
>+ case "$1" in
>+ --fix) attempt_fix=true ;;
>+ -h | --help) usage ;;
>+ -*) usage "unknown option '$1'" ;;
>+ *) break ;;
>+ esac
>+ shift
>+done

If memory serves, the case options can be indented for greater readability.
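For illustration, the quoted option loop with the case patterns indented as the review suggests might read as follows. Wrapping the loop in a `parse_args` function and the body of `usage` are assumptions made so the sketch is self-contained; the patch itself only shows the loop.

```shell
#!/usr/bin/env bash
# Sketch only. parse_args and the usage text are hypothetical.

usage() {
    # Print an optional error message, then the usage line, and exit.
    [ $# -gt 0 ] && echo "error: $*" >&2
    echo "usage: check_process_delay [--fix] [-h|--help]" >&2
    exit 1
}

attempt_fix=false

parse_args() {
    while test $# -gt 0; do
        case "$1" in
            --fix)       attempt_fix=true ;;
            -h | --help) usage ;;
            -*)          usage "unknown option '$1'" ;;
            *)           break ;;   # first non-option ends parsing
        esac
        shift
    done
}
```

Indenting the patterns one level inside `case` is purely cosmetic; the shell accepts both layouts.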

Attachment #705130 - Flags: feedback?(nthomas) → feedback+

hwine

Assignee

Comment 3

13 years ago

Attached patch: automatically try to fix a hung socket

This has been running successfully on gd3 for a while, and incorporates :nthomas's previous feedback.

Attachment #705130 - Attachment is obsolete: true

Attachment #709202 - Flags: review?(nthomas)

Nick Thomas [:nthomas] (UTC+12)

Comment 4

13 years ago

Comment on attachment 709202: automatically try to fix a hung socket

> # likely i/o to NFS slowing things down. Notify, but may not
>- # be error (unsubscripted array access is element 0)
>+ # be error

nit, be an error

Attachment #709202 - Flags: review?(nthomas) → review+

hwine

Assignee

Comment 5

13 years ago

Comment on attachment 709202: automatically try to fix a hung socket

http://hg.mozilla.org/users/hwine_mozilla.com/repo-sync-tools/rev/0986499abadc and deployed in production

Attachment #709202 - Flags: checked-in+

hwine

Assignee

Updated

13 years ago

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

12 years ago

Product: mozilla.org → Release Engineering

Nobody; OK to take it and work on it

Updated

8 years ago

Component: Tools → General


Categories

(Release Engineering :: General, defect)
