Closed
Bug 468480
Opened 16 years ago
Closed 16 years ago
hg.mozilla.org very slow or nonfunctional (pushlog db problems?)
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dbaron, Assigned: aravind)
Details
hg.mozilla.org seems to be having serious problems right now:
http://hg.mozilla.org/mozilla-central/pushloghtml doesn't load in any reasonable amount of time
When I pushed 15 minutes ago, it took at least half a dozen tries. The first bunch all failed with a bunch of errors ending with:
remote: pysqlite2.dbapi2.OperationalError: database is locked
abort: unexpected response: empty string
(Sorry, I don't have the rest of the errors anymore; it was a python exception stack, or something like that.)
Comment 1•16 years ago
I'm seeing hgweb be nonfunctional, even when not hitting the pushlog. I suspect this is not the pushlog's fault; perhaps something is spidering hgweb?
Reporter
Comment 2•16 years ago
(But note that pushing to my user repo is fine, so it's mozilla-central specific or something like that.)
Reporter
Comment 3•16 years ago
(Then again, the web interface for my user repo is not fine.)
Comment 4•16 years ago
That would lead me to believe that something is spidering hg.mozilla.org/mozilla-central/{pushlog,pushloghtml,json-pushes}. That would cause lots of queries against the pushlog.db, which would keep it locked, and also lots of load on hgweb. (The actual HG server is on a separate machine, so it's probably unaffected, aside from the pushlog db getting locked by read-only queries.)
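For anyone who wants to see how a read-only query can produce the "database is locked" error quoted in the description, here is a minimal sketch using Python's standard `sqlite3` module. It is not the pushlog hook's code: the `pushes` table, its `user` column, and the file name are made up for illustration, and it assumes SQLite's default rollback-journal mode (in WAL mode readers do not block writers this way).

```python
import os
import sqlite3
import tempfile


def demonstrate_reader_blocking_writer():
    """Return (error_message, row_count) for a commit attempted while a reader is open."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Hypothetical file and schema, used only for this demonstration.
        db_path = os.path.join(tmp_dir, "pushlog-demo.db")

        # isolation_level=None means we issue BEGIN/COMMIT ourselves.
        setup = sqlite3.connect(db_path, isolation_level=None)
        setup.execute("CREATE TABLE pushes (id INTEGER PRIMARY KEY, user TEXT)")
        setup.close()

        # A short timeout so the writer gives up quickly instead of waiting.
        reader = sqlite3.connect(db_path, timeout=0.1, isolation_level=None)
        writer = sqlite3.connect(db_path, timeout=0.1, isolation_level=None)

        # The reader opens a transaction and runs a read-only query.
        # It now holds a SHARED lock until the transaction ends.
        reader.execute("BEGIN")
        reader.execute("SELECT count(*) FROM pushes").fetchall()

        # The writer can start and stage its change while readers are active...
        writer.execute("BEGIN IMMEDIATE")
        writer.execute("INSERT INTO pushes (user) VALUES ('example')")

        # ...but committing needs an EXCLUSIVE lock, which the reader blocks.
        error_message = None
        try:
            writer.execute("COMMIT")
        except sqlite3.OperationalError as exc:
            error_message = str(exc)

        # Once the reader finishes, the same commit goes through.
        reader.execute("ROLLBACK")
        if error_message is not None:
            writer.execute("COMMIT")
        row_count = writer.execute("SELECT count(*) FROM pushes").fetchone()[0]

        reader.close()
        writer.close()
        return error_message, row_count


if __name__ == "__main__":
    print(demonstrate_reader_blocking_writer())
```

If many slow readers overlap, a writer can keep hitting this until its timeout expires, which would match the repeated push failures. A longer `timeout` on the writing connection only helps if the readers actually finish within it.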
Assignee
Comment 6•16 years ago
This should be cleared up now. Not sure what triggered it in the first place (I had to reboot the boxes).
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 7•16 years ago
This looks broken again. Aravind: I think there must be some real root cause here. I still suspect spidering, but clearly I have no way to verify that.
Assignee: aravind → server-ops
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It's been up and down all night because of load. I'm still trying to get it stable.
Should be more stable now. One of the vmware hosts (pm-vmware02) involved here lost it and dm-vcview02 went missing/unresponsive and couldn't be reset, overloading dm-vcview01.
Handing off to aravind for more investigation. The load is still high, there are some outside ips hitting it frequently but no one really spidering it that I could tell.
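One way to check the spidering theory is to count requests per client address for the pushlog URLs in the web server's access log. The sketch below is only an illustration: it assumes the log uses the Apache common/combined log format, and the function name and the sample addresses in the usage note are hypothetical, not taken from the real servers.

```python
import re
from collections import Counter

# Matches the start of a common/combined log format line:
# client ip, identity, user, [timestamp], "METHOD url PROTOCOL", status, ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)[^"]*" \d{3} ')

# The pushlog views named earlier in this bug.
PUSHLOG_SUFFIXES = ("/pushlog", "/pushloghtml", "/json-pushes")


def count_pushlog_hits(lines):
    """Count requests for pushlog views per client address."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # Skip lines that are not in the assumed format.
        client_ip, url = match.groups()
        # Ignore the query string and any trailing slash before comparing.
        path = url.split("?", 1)[0].rstrip("/")
        if path.endswith(PUSHLOG_SUFFIXES):
            hits[client_ip] += 1
    return hits
```

Feeding it the lines of an access log and looking at `count_pushlog_hits(lines).most_common(10)` would show whether a handful of addresses account for most of the pushlog traffic. A high count alone does not prove spidering; it only narrows down where to look.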
Assignee: thardcastle → aravind
Assignee
Comment 10•16 years ago
As trevor said, I couldn't find anyone spidering it either. I will leave the bug open and try to find out more. Can you guys tell me when this started happening? At what time did you notice the slowdown?
Comment 11•16 years ago
11:46 < Pike> is hg.m.o slow-up-to-dead for other folks, too?
11:47 < djc> Pike: yeah, looks like it
(That's CET.)
Comment 12•16 years ago
Not sure if this will help, but after about 10 minutes I got this:
Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /.
Reason: Error reading from remote server
Apache/2.2.3 (Red Hat) Server at hg.mozilla.org Port 80
Assignee
Comment 13•16 years ago
This seems to be a problem with the ESX servers hosting the VM. We are looking at it. Will update the bug once we figure out what's going on with it.
Assignee
Comment 14•16 years ago
We now have a case open with vmware. For now, I am routing this traffic to a different server, so hgweb should work okay. I will move it back to the VMs once we have resolution on the vmware problems.
Assignee
Comment 15•16 years ago
This should be fixed now. Please re-open as necessary. We migrated the VMs off a problem netapp, so hopefully we won't see these problems again.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard