Closed Bug 521449 Opened 15 years ago Closed 15 years ago

Can't push to m-c

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: zpao, Assigned: aravind)

References

Details

Attachments

(1 file)

Keep getting the same error: "remote: pysqlite2.dbapi2.OperationalError: database is locked". Full error output is at http://pastebin.mozilla.org/675591. I haven't tried pushing to any other branches, so I'm unsure whether there are problems there too.
Please try your commit/submit again. Maybe you just happened to hit it while a different commit was in progress. If not, please add your comment to bug 519594.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE
This is just massive right now, multiple people have problems. I also see a ton of queries on pushlog taking more than 10 seconds, at which point I stopped that daemon from polling for now. Reopening as there seems to be something troubling hg.m.o more than usual just now.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee: server-ops → aravind
Can those of you still having a problem pushing to m-c please join #hg?
I have tried setting timeout=10, isolation_level="EXCLUSIVE" in the connect statement. We have also tried bumping that timeout up to 25s. Neither of those helped. I am past the limits of my knowledge with this stuff at this point.
Since the failures seem to be happening in the commit section of the pushlog script, I added a retry loop in the hook (mostly off some docs I found online). Would like Benjamin and Ted to look over it.
Comment on attachment 405623: Patch to pushlog hook. + else: + conn.close() + sleep(0.1) + num_attempts += 1 + log(ui, repo, node, **kwargs) + seems to have an indentation error at num_attempts. I'd personally prefer a loop with a return inside the try over a recursive function with a global.
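For illustration, the loop-with-return shape suggested in the review could look roughly like the sketch below. This is not the attached patch: it uses the standard-library sqlite3 module rather than pysqlite2, and the names write_with_retry, work, max_attempts and delay are hypothetical.

```python
import sqlite3
import time


def write_with_retry(db_path, work, max_attempts=5, delay=0.1):
    """Run work(conn) and commit, retrying while the database is locked.

    Hypothetical helper: returns the number of the attempt that succeeded,
    and re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        conn = sqlite3.connect(db_path)
        try:
            work(conn)
            conn.commit()
            return attempt  # success: leave the loop from inside the try
        except sqlite3.OperationalError as exc:
            # Only retry lock errors, and give up after the last attempt.
            if "locked" not in str(exc) or attempt == max_attempts:
                raise
        finally:
            conn.close()  # always release the connection before retrying
        time.sleep(delay)
```

Because the attempt counter is the loop variable, no global state is needed, and each retry starts from a fresh connection.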
From reading pysqlite docs and mailing list posts a while ago when looking into this problem, setting the timeout parameter higher when connecting to the database should essentially perform the same function. pysqlite just calls sqlite3_busy_timeout with the timeout value you pass: http://www.sqlite.org/c3ref/busy_timeout.html "This routine sets a busy handler that sleeps for a specified amount of time when a table is locked. The handler will sleep multiple times until at least "ms" milliseconds of sleeping have accumulated. After "ms" milliseconds of sleeping, the handler returns 0 which causes sqlite3_step() to return SQLITE_BUSY or SQLITE_IOERR_BLOCKED." (the default pysqlite timeout is 5 seconds)
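A small experiment along these lines can show the busy timeout at work. The sketch below assumes the standard-library sqlite3 module (the successor to pysqlite2) and a throwaway database in a temporary directory; time_until_locked is a hypothetical name. One connection holds an exclusive lock while a second one, opened with the given timeout, tries to start a write transaction.

```python
import os
import sqlite3
import tempfile
import time


def time_until_locked(timeout):
    """Return (raised, elapsed) for a write attempt against a locked database."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None means we issue BEGIN/ROLLBACK ourselves.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE t (x)")
    holder.execute("BEGIN EXCLUSIVE")  # hold the lock for the whole test

    waiter = sqlite3.connect(path, timeout=timeout, isolation_level=None)
    start = time.monotonic()
    try:
        waiter.execute("BEGIN IMMEDIATE")  # needs a write lock, so it blocks
        raised = False
    except sqlite3.OperationalError:
        raised = True  # "database is locked" once the timeout runs out
    elapsed = time.monotonic() - start

    waiter.close()
    holder.execute("ROLLBACK")
    holder.close()
    return raised, elapsed
```

Calling time_until_locked(0) should fail almost immediately, while a larger value should wait roughly that long before the same error is raised.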
Increasing the timeout will only help with those sporadic cases where one gets the db locked and retrying the push finishes correctly. Btw, I just want to signal that today everything is fine and I can push as usual; dunno if something has been done in the meantime or if it's just due to reduced load on the servers.
I saw from the IRC conversation that someone was spidering hg.mozilla.org's web interface, and this was likely what was causing all the issues here. We should figure out what exactly happens when someone spiders the web interface that causes things to get locked up, and figure out a way to fix or mitigate that.
Hitting the web interface sends you to backend servers that serve Mercurial off a read-only mount point. So they shouldn't be adding lock files or anything like that, even if stuff were being spidered. What it could do, however, is add additional load onto the NFS servers providing these mount points, but that should just mean slow performance and not be a cause of locks. @ted: Also, we tried the timeout param in the connect function, but it didn't help. I think the timeout in the connect function only matters when you are opening a connection to the db. The other times (commits, for example), it just fails when it sees a lock. Would it be okay to try the patch I attached for the pushlog hook?
Please re-open the bug if this is still happening.
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard