623659 - Middleware bridge to search two different HBase instances for a crash

Reporter

Description

14 years ago

If xstevens and dre can't get all the crashes to PHX before the migration, we may find ourselves in the position of having to search both PHX and SJC for a crash.

This will require:
- changes to GetCrash
- addition of a config option for a secondary HBase instance

We'll also need Metrics to set up a second instance for us to test against (on the research cluster?) - do you need a bug, Daniel/Xavier?

We may also need to get holes poked in the firewall from PHX middleware to SJC HBase.  Jabba, can you clarify if this is needed?

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 1

14 years ago

OK, I've implemented and started testing a dual HBase scheme for the middleware.  If a request is made through the web service for an ooid (meta, dump or processed), the web service will first look in a primary HBase instance.  If it fails to find the ooid there, it will try a secondary HBase instance.
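
A minimal sketch of the lookup order described above, not the code from the branch. The names CrashNotFound, InMemoryStore and DualStore are hypothetical, and the in-memory dictionaries stand in for real HBase connections:

```python
class CrashNotFound(Exception):
    """Raised when a store has no data for the requested ooid."""


class InMemoryStore:
    """Hypothetical stand-in for a connection to one HBase instance."""

    def __init__(self, crashes):
        self._crashes = dict(crashes)

    def get(self, ooid):
        try:
            return self._crashes[ooid]
        except KeyError:
            raise CrashNotFound(ooid)


class DualStore:
    """Look in the primary store first, then fall back to the secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def get(self, ooid):
        try:
            return self.primary.get(ooid)
        except CrashNotFound:
            # Only a "not found" result triggers the fallback;
            # any other error from the primary propagates unchanged.
            return self.secondary.get(ooid)
```

If neither store has the ooid, the CrashNotFound from the secondary is what the caller sees.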

As it turns out, however, the middleware is not the only place that will have to know about this dual scheme.  The processor also fetches meta and raw dump data from hbase.

If the two HBase instances are arranged such that the older crashes are in the secondary instance and new crashes are going into the primary HBase, most of the processor's work will come from the primary HBase.  However, if a request for a priority job comes in from beyond primary HBase's threshold age, the processor will have to go to the secondary HBase for fulfillment.

Programmatically, this is not difficult.  I just want it known that the ramifications of this are not confined to the middleware.

I'll soon post a patch here that implements this change.  

While I'm arrogant and assume my code is golden by default, it would be wise to test this.  Got any suggestions how I can test?

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 2

14 years ago

Have some code that points at both the dev and staging instances.  Ask for crashes that only exist in one or the other.

Laura Thomson :laura

Reporter

Comment 3

14 years ago

Lars, status?

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 4

14 years ago

The "bridge" exists in svn as the lars-176dev branch.  

I will make another for 175x as lars-175xdev.  That way we'll have code for a bridge in either direction.

Laura Thomson :laura

Reporter

Comment 5

14 years ago

Also need to add functionality to search two different reports tables in the same instance (for sharding, basically).

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 6

14 years ago

Now coding the variation where the two instances of HBase are really the same instance. In this case, one of them will have to use a table name other than 'crash_reports'.

I'm making the assumption that I should make the code work for both read/write operations with an alternate primary table name. If I were to make only the read operations respond to the alternate table name, then we'd have a very confusing hack. See * below for the alternative.

Having the read/write capability means that there need to be index tables for the alternate table. Right now, if the primary table is 'crash_reports', the index table names will be 'crash_reports_index...', where '...' is the name of the index topic. I'm assuming that the index names for the alternate tables will follow the same pattern. Here is a list of the parameterized names that I will be templatizing; comment if anything is missing:

crash_reports
crash_reports_index_legacy_unprocessed_flag
crash_reports_index_legacy_submitted_time
crash_reports_index_hang_id_submitted_time
crash_reports_index_hang_id
crash_reports_index_submitted_time
crash_reports_index_unprocessed_flag
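
One way the templatizing could look, as a sketch only. The helper name table_names and the constant INDEX_TOPICS are hypothetical; the topic names are taken from the list above:

```python
# Index topics taken from the table list above, in the same order.
INDEX_TOPICS = (
    "legacy_unprocessed_flag",
    "legacy_submitted_time",
    "hang_id_submitted_time",
    "hang_id",
    "submitted_time",
    "unprocessed_flag",
)


def table_names(primary_table="crash_reports"):
    """Return the primary table name followed by its index table names."""
    return [primary_table] + [
        "%s_index_%s" % (primary_table, topic) for topic in INDEX_TOPICS
    ]
```

Calling table_names() with no argument reproduces the seven names listed above; passing an alternate primary table name yields the matching set for that table.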

This leaves the issue of the 'metrics' table. Should that be settable, too?

* If we wanted to go the way of read-only table name templatization, I could implement it as a subclass. That way I can halt any attempt at write operations by raising a "NotImplemented" exception. Then I could create a new crashStorage wrapper class that could be used in the DualHbaseCrashStorageSystem as the secondary store.
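
A rough sketch of that read-only alternative, assuming a simple store interface with get and put. CrashStore and ReadOnlyCrashStore are hypothetical names, not classes from the branch, and the sketch uses Python's built-in NotImplementedError:

```python
class CrashStore:
    """Hypothetical read/write store keyed by ooid, bound to one table name."""

    def __init__(self, table_name="crash_reports"):
        self.table_name = table_name
        self._rows = {}

    def get(self, ooid):
        return self._rows[ooid]

    def put(self, ooid, data):
        self._rows[ooid] = data


class ReadOnlyCrashStore(CrashStore):
    """Same reads as CrashStore, but every write attempt is refused."""

    def put(self, ooid, data):
        raise NotImplementedError(
            "%s is read-only; cannot write %s" % (self.table_name, ooid)
        )
```

Reads are inherited untouched, so only the write path differs between the two classes.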

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 7

13 years ago

this code exists in the lars-1761dev branch in googlecode.  It was not deployed.

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → WONTFIX

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 8

13 years ago

this is suddenly hot again for use in staging.  It now resides in lars-177dev5 in googlecode.  Need it integrated into 177 in one week.

Severity: normal → major

Status: RESOLVED → REOPENED

Resolution: WONTFIX → ---

Target Milestone: 1.7.6 → 1.7.7

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 9

13 years ago

it is my understanding that this feature is wanted only for use of two HBases for the middleware.  However, I suspect that it might be needed in the processor, too.  Consider this scenario:

An ooid is requested; it is not found in the primary HBase, but it is found in the secondary.  Further, in the secondary, the crash has not been processed, so the middleware queues the crash for priority processing.  If the processor doesn't also know about the secondary HBase, the processor will be unable to find the ooid.

What is the use case in staging where we need this double-barreled HBase approach?  Is it true that we will not need this in production?  How do we certify a version in staging when the production system will have a different configuration and run different code?

Laura Thomson :laura

Reporter

Comment 10

13 years ago

(In reply to comment #9)
> it is my understanding that this feature is wanted only for use of two HBases
> for the middleware.  However, I suspect that it might be needed in the
> processor, too.  Consider this scenario:
> 
> an ooid is requested, but it is not found in the primary HBase, but it is found
> in the secondary.  Further, in the secondary, the crash has not been processed,
> so the middleware queues the crash for priority processing.  If the processor
> doesn't also know about the secondary HBase, the processor will be unable to
> find the ooid.
> 
> What is the use case in staging where we need this double-barreled HBase
> approach?  Is it true that we will not need this in production?  How do we
> certify a version in staging when the production system will have a different
> configuration and run different code?

The use case in staging is as follows:
The HBase instance we have in staging contains much less data than the PG instance.  This means that we may try to load a raw crash that isn't there, which makes it hard to test UI features at times.

The limitations of this specific implementation are, I think, that we only need the bridge for GetCrash.  We may expand this later, but this will work for now.

K Lars Lohn [:lars] [:klohn]

Assignee

Comment 11

13 years ago

checked in to trunk as r3027.

Status: REOPENED → RESOLVED

Closed: 13 years ago → 13 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

13 years ago

Component: Socorro → General

Product: Webtools → Socorro


Categories

(Socorro :: General, task)

Tracking

(Not tracked)

People

(Reporter: laura, Assigned: lars)


Security

(public)
