
5 second timeout for hbase connections

RESOLVED WONTFIX

Status

Socorro
Backend
RESOLVED WONTFIX
6 years ago
6 years ago

People

(Reporter: selenamarie, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

We're having a look at Zeus and timeouts, and noticed that our timeout value for Thrift/HBase connections is set to 5000 seconds, which is 1h23m. 
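
To make the arithmetic concrete, here is a minimal sketch that converts the configured number into a readable duration. It assumes the value could be interpreted as either seconds or milliseconds; this thread does not say which unit the config actually uses, and `describe_timeout` is a hypothetical helper, not part of Socorro.

```python
from datetime import timedelta


def describe_timeout(value, unit):
    """Return a readable H:MM:SS duration for a timeout given in 's' or 'ms'."""
    if unit == "s":
        delta = timedelta(seconds=value)
    elif unit == "ms":
        delta = timedelta(milliseconds=value)
    else:
        raise ValueError("unit must be 's' or 'ms'")
    return str(delta)


# The same configured number under the two assumed units.
as_seconds = describe_timeout(5000, "s")
as_millis = describe_timeout(5000, "ms")
print("5000 s  =", as_seconds)
print("5000 ms =", as_millis)
```

Read as seconds, the value is the 1h23m quoted above; read as milliseconds, it is 5 seconds.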

Is this a reasonable timeout? 


Older bugs related to HBase timeouts/slowness:

https://bugzilla.mozilla.org/show_bug.cgi?id=558119
https://bugzilla.mozilla.org/show_bug.cgi?id=600998

Lars,

Given that our Zeus timeouts are 10 seconds for connect and 60 seconds for idle connections, I'd like to document an appropriate timeout that can be used for all Socorro database connections, and change our configs to match.

"all Socorro database connections", meaning postgres, too?

Are you suggesting that connections should have timeouts in the range listed in comment #1?  In other words, we need to eliminate long-lasting connections and move to a model where we connect/disconnect with each transaction?

(In reply to K Lars Lohn [:lars] [:klohn] from comment #2)
> "all Socorro database connections", meaning postgres, too?
> 
> Are you suggesting that connections should have timeouts in the range
> listed in comment #1?  In other words, we need to eliminate long-lasting
> connections and move to a model where we connect/disconnect with each
> transaction?

No.

We have two timeouts to contend with: 

* Initial connection timeouts: this occurs when a connection does not complete normally. With Thrift, this can happen if an exception is thrown during startup or if the initial connect just takes a while.  I don't know that we see this with Postgres connections.

* Idle timeouts of existing connections: this occurs when a connection goes idle and doesn't send keepalives.

So, in either of those cases, the Zeus timeouts kick in. If we want different values for these timeouts, we need to give them to IT to implement in Zeus, and we need to change the timeouts configured in our software to be consistent.

If we are ok with those timeouts, then we need to change our configurations to match.
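
As a rough illustration of "change our configurations to match", the sketch below compares a client's configured timeouts against the Zeus values quoted in this bug (10 seconds for connect, 60 seconds for idle). The function name, the constant names, and the choice to treat the 5000 second value as a connect timeout are all assumptions made for the example; none of this is taken from Socorro's code.

```python
# Zeus limits as quoted in this bug, in seconds.
ZEUS_CONNECT_TIMEOUT_S = 10
ZEUS_IDLE_TIMEOUT_S = 60


def find_timeout_mismatches(connect_timeout_s, idle_timeout_s):
    """Return one message for each client timeout that exceeds its Zeus limit."""
    mismatches = []
    if connect_timeout_s > ZEUS_CONNECT_TIMEOUT_S:
        mismatches.append(
            "connect timeout %ss exceeds Zeus limit of %ss"
            % (connect_timeout_s, ZEUS_CONNECT_TIMEOUT_S)
        )
    if idle_timeout_s > ZEUS_IDLE_TIMEOUT_S:
        mismatches.append(
            "idle timeout %ss exceeds Zeus limit of %ss"
            % (idle_timeout_s, ZEUS_IDLE_TIMEOUT_S)
        )
    return mismatches


# Hypothetical client settings: the 5000 second value from this bug,
# treated as a connect timeout, plus an idle timeout that matches Zeus.
for message in find_timeout_mismatches(connect_timeout_s=5000, idle_timeout_s=60):
    print(message)
```

A client timeout longer than the Zeus one never fires, because Zeus closes the connection first; that is the inconsistency a check like this would flag.
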

Not doing anything further with this due to the hbase code refactor currently underway.

Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WONTFIX
Summary: 5000 second timeout for hbase connections → 5 second timeout for hbase connections