Closed Bug 822317 Opened 12 years ago Closed 11 years ago

treestatus_mozilla_org should handle failover gracefully

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: scabral, Assigned: catlee)

References

Details

(Keywords: sheriffing-P1)

This morning, we switched over the database backend behind treestatus_mozilla_org (basically, simulated a failover so we could do maintenance on the database). 

treestatus_mozilla_org was giving errors like:
	 [Mon Dec 17 09:02:49 2012] [error] StatementError: Can't reconnect until invalid transaction is rolled back (original cause: InvalidRequestError: Can't reconnect until invalid transaction is rolled back) 'SELECT trees.tree AS trees_tree, trees.status AS trees_status, trees.reason AS trees_reason \\nFROM trees \\nWHERE trees.tree = %s' [{u'%(139912741355152 param)s': u'mozilla-inbound'}]

Restarting the treestatus app seems to fix the problem, but if there is a failover in general, treestatus should handle it gracefully.
Keywords: sheriffing-P1
OS: Mac OS X → All
Hardware: x86 → All
I don't see anything in here that looks sensitive from my POV; any objections to me unhiding?
I wasn't sure if it's a security problem to let the world know that treestatus doesn't handle failover gracefully. Feel free to unhide if that's not a problem.
Ah I see. Thank you :-)
Group: mozilla-corporation-confidential
Blocks: 831363
I think I fixed it:
https://github.com/catlee/treestatus/commit/6d3cb3e4a8333589a635f617ccca4dc1743ba13a#L0L601

It's live on https://treestatus-dev.allizom.org/. We should try kicking the DB and see if the app recovers properly.
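
For anyone reading along, here is a minimal sketch of the general technique for this class of error: roll back the invalid transaction, then retry the request. This is not necessarily what the linked commit does. The names `retry_with_rollback`, `FakeSession`, and `get_tree` are hypothetical, as are the returned status values, and the fake session stands in for a real database session so the example runs without a database. In a SQLAlchemy app, `retry_on` would presumably be something like `sqlalchemy.exc.OperationalError`, and `session.rollback()` is the call that clears the "Can't reconnect until invalid transaction is rolled back" state.

```python
import functools
import logging
import time

log = logging.getLogger("treestatus.sketch")  # hypothetical logger name


def retry_with_rollback(session, retry_on, attempts=3, delay=0.0):
    """Retry a function when it raises `retry_on`, rolling back `session` first.

    `session` only needs a rollback() method; everything here is illustrative.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on as exc:
                    # Discard the invalid transaction so the next attempt
                    # is free to open a fresh connection.
                    session.rollback()
                    if attempt == attempts:
                        raise
                    log.warning("DB error on attempt %d/%d, retrying: %s",
                                attempt, attempts, exc)
                    time.sleep(delay)
        return wrapper
    return decorator


class FakeSession:
    """Stand-in for a DB session that fails until it has been rolled back."""

    def __init__(self):
        self.broken = True
        self.rollbacks = 0

    def rollback(self):
        self.rollbacks += 1
        self.broken = False

    def get_status(self, tree):
        if self.broken:
            raise ConnectionError("connection lost during failover")
        # Made-up row, only here to give the example a return value.
        return {"tree": tree, "status": "open"}


session = FakeSession()


@retry_with_rollback(session, retry_on=ConnectionError)
def get_tree(tree):
    return session.get_status(tree)


result = get_tree("mozilla-inbound")
```

The first call fails, the wrapper rolls back and logs a warning, and the second attempt succeeds. Logging on each retry also gives you something to look for in the app logs when verifying a failover.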
Agreed! Let's coordinate for sometime this week.
sheeri and I tested this today, and the app recovers properly when it loses its connection to the db.

However, I wasn't able to verify this by looking at the application logs - I was expecting to see some messages about retrying the request due to the DB failure, but I don't see any app-specific messages in the logs.
Assignee: nobody → catlee
Summary: treestatus_mozilla_org should handle failover gracefully → where are logs? was treestatus_mozilla_org should handle failover gracefully
We did a failover today and things worked, so the only thing left is to find the app logs. I'm un-cc'ing myself from this bug. Woo hoo for graceful failover!
Not sure about the logs, but that's an orthogonal problem now - morphing back to the original issue and marking fixed.

Thank you for confirming :-)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Summary: where are logs? was treestatus_mozilla_org should handle failover gracefully → treestatus_mozilla_org should handle failover gracefully
Product: Webtools → Tree Management
Product: Tree Management → Release Engineering
Component: Applications: TreeStatus → General