Closed
Bug 626328
Opened 15 years ago
Closed 15 years ago
Socorro - collectors can fail to send to disk
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
1.7.6
People
(Reporter: lars, Assigned: lars)
Details
Attachments
(1 file)
3.23 KB, patch | rhelmer: review+
Something was different in tonight's hbase downtime. We discovered that the collectors were not sending things to the filesystem while hbase was down.
The fallback system was designed to fall back to the filesystem when hbase times out or fails. It doesn't respond well when hbase isn't there in the first place. We need to know how this hbase outage differed from previous ones.
The collector fails during initialization when hbase isn't there. It never even gets to the point of trying to accept a crash.
I patched the code as a temporary solution, and the patch works. But a better solution to this problem should be engineered.
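A minimal sketch of the failure mode described above, not the actual Socorro code. The names `EagerClient`, `Collector`, and `try_start` are hypothetical, and `NoConnection` here is a local stand-in exception. The point it illustrates: a fallback that only guards the save path never runs if the storage object cannot be constructed.

```python
class NoConnection(Exception):
    """Hypothetical stand-in for the error raised when hbase is unreachable."""


class EagerClient:
    """Hypothetical client that insists on a live connection at construction."""

    def __init__(self, hbase_up):
        if not hbase_up:
            raise NoConnection("hbase is not there")

    def put(self, crash_id, dump):
        return "hbase"


class Collector:
    """Hypothetical collector: the fallback guards save(), not __init__()."""

    def __init__(self, hbase_up):
        # Raises before any crash can be accepted when hbase is absent.
        self.client = EagerClient(hbase_up)
        self.fallback = {}

    def save(self, crash_id, dump):
        try:
            return self.client.put(crash_id, dump)
        except NoConnection:
            # Only reachable if construction succeeded earlier.
            self.fallback[crash_id] = dump
            return "filesystem"


def try_start(hbase_up):
    """Return a collector, or None if initialization itself failed."""
    try:
        return Collector(hbase_up)
    except NoConnection:
        return None
```

With hbase absent, `try_start(False)` returns `None`: there is no collector object left to call `save()` on, so the filesystem fallback is never exercised.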
Updated•15 years ago
Assignee: nobody → lars
Target Milestone: --- → 1.7.6
Comment 1•15 years ago
How does it fail?
How was HBase's absence different from usual?
Can you attach your patch to the bug?
Comment 2•15 years ago
The CollectorCrashStorageSystemForHBase class relies on having an hbaseConnection to do its work. If that connection doesn't exist, it cannot send anything to fallback storage. The solution is to never let the HBaseClient constructor fail: NoConnection exceptions must be handled within the constructor and not allowed to propagate outward. By allowing the constructor to complete with a bad connection to HBase, we allow the reconnection mechanisms in later method calls to do their work.
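The approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: `LazyHBaseClient`, `CollectorStorage`, and the `connect` callable are hypothetical names, `NoConnection` is a local stand-in exception, and plain dicts stand in for both hbase and the fallback storage.

```python
class NoConnection(Exception):
    """Hypothetical stand-in for the error raised when hbase is unreachable."""


class LazyHBaseClient:
    """Hypothetical client whose constructor never raises NoConnection."""

    def __init__(self, connect):
        self._connect = connect  # assumed: callable returning a connection
        self.connection = None
        try:
            self.connection = connect()
        except NoConnection:
            # Swallow the error; the object is built with a bad connection.
            pass

    def put(self, crash_id, dump):
        if self.connection is None:
            # Reconnection attempt on use; may raise NoConnection.
            self.connection = self._connect()
        self.connection[crash_id] = dump


class CollectorStorage:
    """Hypothetical storage that falls back when hbase cannot be reached."""

    def __init__(self, connect):
        self.client = LazyHBaseClient(connect)  # cannot fail on NoConnection
        self.fallback = {}

    def save(self, crash_id, dump):
        try:
            self.client.put(crash_id, dump)
            return "hbase"
        except NoConnection:
            self.fallback[crash_id] = dump
            return "fallback"
```

Because construction always completes, the first `save()` made while hbase is absent lands in the fallback, and a later `save()` reconnects on its own once hbase is back.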
Attachment #504844 -
Flags: review?(rhelmer)
Updated•15 years ago
Attachment #504844 -
Flags: review?(rhelmer) → review+
Comment 3•15 years ago
This is fixed in release 1.7.5.6 and is currently in production and verified to work. The fix has been ported forward to 1.7.6.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 4•15 years ago
(In reply to comment #3)
> This is fixed in release 1.7.5.6 and is currently in production and verified to
> work. The fix has been ported forward to 1.7.6.
Lars and I tested the trunk (1.7.6) version on staging, too.
Updated•14 years ago
Component: Socorro → General
Product: Webtools → Socorro