Closed Bug 558119 Opened 15 years ago Closed 15 years ago

Socorro Collector needs timeout on hbase calls

Categories

(Socorro :: General, task, P1)

x86
Linux

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: lars, Assigned: dre)

Details

On 2010-04-08 the collector stalled out because hbase calls were taking too long. In 1.7, hbase becomes the primary storage and we cannot afford such a failure. With timeouts, we can fall back to local storage quickly. The collector will not bog down, and we won't lose crashes like we did. Can the timeouts be implemented in the thrift layer?
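A minimal sketch of the fallback behavior requested above. Every name here (save_crash, slow_primary, local_fallback, local_crashes) is hypothetical and is not Socorro's actual API. The primary store stands in for an hbase call that times out, and the crash is written to local storage instead of being lost.

```python
import socket


def save_crash(crash_id, payload, primary_store, fallback_store):
    """Try the primary store first; on timeout, fall back to local storage.

    Both stores are hypothetical callables taking (crash_id, payload).
    Returns the name of the store that accepted the crash.
    """
    try:
        primary_store(crash_id, payload)
        return "primary"
    except socket.timeout:
        # The primary (e.g. an hbase call) took too long: save locally instead.
        fallback_store(crash_id, payload)
        return "fallback"


def slow_primary(crash_id, payload):
    # Stand-in for an hbase call that exceeds its timeout.
    raise socket.timeout("hbase call timed out")


# Stand-in for local file system storage.
local_crashes = {}


def local_fallback(crash_id, payload):
    local_crashes[crash_id] = payload


result = save_crash("crash-1", b"dump", slow_primary, local_fallback)
```

The collector only needs the timeout to surface as an exception; where the crash goes afterwards is a separate decision that this sketch keeps in the caller.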
Priority: -- → P1
Target Milestone: --- → 1.7
Assignee: nobody → deinspanjer
Anurag and I will hit up people on the #hbase channel to see what we can get done with this.
The Python version of hbaseClient seems to have an infinite timeout. I would assume/hope that adding a socket timeout will resolve this issue. Adding the following line below line #69, a.k.a.
    transport = self.tsocketModule.TSocket(self.host, self.port)
should do the trick. Line to be added:
    transport.setTimeout(1000)  # in ms
I have checked out the code from http://code.google.com/p/socorro/source/browse/trunk/socorro/hbase/ and made the change, but I am not sure how to test it locally. Daniel, can you let me know how to test it (mainly config details, path, etc.) once you have some time to spare?
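One way to see the effect of a socket timeout locally without an hbase cluster, sketched with only the Python standard library. This is not the hbaseClient code: the listener below is a stand-in for an unresponsive Thrift server, and the 0.2 second value is arbitrary. To my understanding, Thrift's TSocket.setTimeout takes milliseconds and applies the value to its underlying socket, whereas the standard library's settimeout takes seconds.

```python
import socket

# Local listener that completes the TCP handshake but never replies,
# standing in for an unresponsive hbase Thrift server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

client = socket.create_connection((host, port))
client.settimeout(0.2)  # seconds here; Thrift's setTimeout uses milliseconds

timed_out = False
try:
    client.recv(1024)  # would block indefinitely without a timeout
except socket.timeout:
    timed_out = True
finally:
    client.close()
    server.close()
```

Without the settimeout call, the recv above would block until the peer sends data or closes the connection, which matches the stall described in this bug.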
I'll work with you tomorrow morning to get something set up on khan so we can try to test it.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Component: Socorro → General
Product: Webtools → Socorro