Closed
Bug 665123
Opened 13 years ago
Closed 10 years ago
change timeouts on hg pulse plugin
Categories
(Webtools :: Pulse, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: dustin, Unassigned)
As per discussion in #ops, the pulse hook currently has a longer timeout than the hg-lock-file timeout, so other users' pushes fail with lock errors while the pulse hook is timing out.
As a short-term solution, these timeouts should be ordered so that a failing pulse send gives up, and the push completes, in less time than it takes another user to time out waiting on the hg lock.
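A minimal sketch of the ordering described above, for illustration only. It is not the actual pulse hook code: `send_with_deadline`, `pulse_hook`, and the 30-second send timeout are hypothetical. The 600-second figure is Mercurial's documented default for `ui.timeout`, which may not match the value configured on the production hg servers.

```python
import threading

# Assumption: Mercurial's default ui.timeout (seconds a client waits on a lock).
HG_LOCK_TIMEOUT = 600
# Hypothetical value; the only requirement is that it is well below the lock timeout.
PULSE_SEND_TIMEOUT = 30


def send_with_deadline(send, payload, timeout):
    """Run send(payload) in a daemon thread and wait at most `timeout` seconds.

    Returns True only if the send finished without raising. The send itself
    is not cancelled; the caller just stops waiting for it.
    """
    outcome = {"ok": False}

    def worker():
        try:
            send(payload)
            outcome["ok"] = True
        except Exception:
            pass  # a failed publish must never fail the push

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    return outcome["ok"] and not thread.is_alive()


def pulse_hook(send, payload,
               send_timeout=PULSE_SEND_TIMEOUT,
               lock_timeout=HG_LOCK_TIMEOUT):
    """Hypothetical hook body: refuse a misordered configuration, then publish."""
    if send_timeout >= lock_timeout:
        raise ValueError("pulse send timeout must be shorter than the hg lock timeout")
    send_with_deadline(send, payload, send_timeout)
    # Mercurial treats a false return value from a Python hook as success,
    # so the push completes whether or not the message was published.
    return False
```

With this ordering, a hung broker costs the pusher at most `send_timeout` seconds while the repository lock is held, which stays under the time another user waits before giving up on the lock.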
Comment 1•13 years ago (Reporter)
Christian, realizing you're really busy, is there any way we can take care of this today (Friday)? I'd hate to have this melt down again over the weekend.
If it's a tweak that someone with root on the hg systems can make, can you describe that here?
Hmmm, I looked into this a bit and I think it may be due to flow control kicking in, so the connection is there but it just isn't accepting published messages. I need to investigate more. I'll watch this over the weekend to make sure it doesn't bring hg down.
If it is flow control, it may help to update Erlang:
http://www.lshift.net/blog/2009/12/01/garbage-collection-in-erlang
Still investigating...
Comment 5•13 years ago
(In reply to comment #1)
> Christian, realizing you're really busy, is there any way we can take care
> of this today (Friday)? I'd hate to have this melt down again over the
> weekend.
...
(In reply to comment #4)
...
>
> Still investigating...
legneato: While you investigate, can we disable this on production hg ASAP in the meantime? I'm totally fine with re-enabling this hook after the investigation and fix are done, but right now I'm concerned that this could cause another tree closure with no warning.
Comment 7•13 years ago (Reporter)
I think the purpose of this bug was just to correctly order the timeout of the hook script with respect to the overall hg lock timeout, so that even if the script *does* hang again, it doesn't cause the cascading failure. It seems the investigation is into deeper issues that would be better tracked in bug 665118?
Updated•10 years ago
Assignee: christian → nobody
Comment 8•10 years ago
This is going to be fixed by completely changing the way the publisher works (decoupling it via a "maildir"-style queue), so WONTFIXing this, as we're going to ditch all the old hg pulse-shim code.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
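For readers unfamiliar with the "maildir"-style decoupling mentioned in comment 8, here is a rough sketch of the general pattern. It is not the replacement publisher itself: the directory layout, file naming, and the `enqueue`/`drain` helpers are all assumptions made for illustration. The idea is that the hook only writes a small file and returns, while a separate process publishes queued messages whenever the broker is reachable.

```python
import json
import os
import time
import uuid


def enqueue(queue_dir, message):
    """Hook side: write to tmp/, then rename into new/.

    The rename is atomic when both directories are on the same filesystem,
    so a consumer never sees a half-written file. No network call is made
    here, so the push cannot block on the broker.
    """
    tmp_dir = os.path.join(queue_dir, "tmp")
    new_dir = os.path.join(queue_dir, "new")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(new_dir, exist_ok=True)
    # Hypothetical naming scheme: timestamp plus a random suffix for uniqueness.
    name = "%d.%s.json" % (time.time_ns(), uuid.uuid4().hex)
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "w") as handle:
        json.dump(message, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.rename(tmp_path, os.path.join(new_dir, name))
    return name


def drain(queue_dir, publish):
    """Consumer side: publish queued messages, oldest file name first.

    A message is deleted only after publish() returns. On the first failure
    the loop stops and leaves the remaining files for the next run.
    Returns the number of messages published.
    """
    new_dir = os.path.join(queue_dir, "new")
    if not os.path.isdir(new_dir):
        return 0
    sent = 0
    for name in sorted(os.listdir(new_dir)):
        path = os.path.join(new_dir, name)
        with open(path) as handle:
            message = json.load(handle)
        try:
            publish(message)
        except Exception:
            break
        os.remove(path)
        sent += 1
    return sent
```

In this arrangement a broker outage only delays messages in `new/`; it no longer extends the time the hg lock is held, which is why the timeout ordering in this bug stops mattering.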