Closed Bug 543759 Opened 14 years ago Closed 13 years ago

HBase storage for processed json data (jsonz)

Categories

(Socorro :: General, task, P1)

x86
macOS

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ozten, Assigned: aphadke)

References

Details

(Whiteboard: ETA - 7/16, confirm hbase stability)

Once we move to HDFS for raw crash dumps, we need to design a solution for the 'cooked' dumps that the processor generates. These are available to the public and are used to populate the screens of the crash reporter.

Reads of recent crashes should have low latency.

Also see Bug#542624 Comment#1
Depends on: 542624
Target Milestone: --- → 2.0
-> Pythonic middleware
Target Milestone: 2.0 → 1.7
Assignee: nobody → lars
Priority: -- → P1
Assignee: lars → aphadke
Summary: HDFS or other storage for processed jsonz files → HBase storage for processed json data (jsonz)
No longer depends on: 565692
To elaborate a bit:
Many crash dumps currently reside on NFS, and we need to move them to HBase. A Hadoop/HBase script has been written that does the following:
- downloads the crash dump data for a given day
- hits the crash-reports URL to download the jsonz
- uncompresses the jsonz
- inserts the processed data into HBase
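
For illustration only, the steps above could be sketched roughly as follows. This is not the script in SVN. It assumes that a jsonz file is gzip-compressed JSON, uses a plain dict as a stand-in for the HBase table, and takes a caller-supplied `fetch_jsonz` function in place of the crash-reports download. The row key scheme and the column name `processed_data:json` are hypothetical.

```python
import gzip
import json
import zlib


def decompress_jsonz(payload: bytes) -> dict:
    """Decode a jsonz payload, assumed here to be gzip-compressed JSON."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


def make_row_key(crash_id: str) -> str:
    # Hypothetical key scheme; the real HBase schema is not described in this bug.
    return crash_id


def store_processed(table: dict, crash_id: str, payload: bytes) -> None:
    """Insert one processed crash into `table`, a dict standing in for HBase."""
    doc = decompress_jsonz(payload)
    # Store the JSON text under a hypothetical column name.
    table[make_row_key(crash_id)] = {"processed_data:json": json.dumps(doc)}


def migrate_day(table: dict, crash_ids, fetch_jsonz) -> list:
    """Process one day's crash IDs and return the IDs that could not be stored.

    `fetch_jsonz` is a hypothetical callable that returns the raw jsonz bytes
    for a crash ID (for example, by requesting the crash-reports URL).
    """
    failed = []
    for crash_id in crash_ids:
        try:
            store_processed(table, crash_id, fetch_jsonz(crash_id))
        except (OSError, EOFError, ValueError, zlib.error):
            # Corrupt or truncated payloads are recorded and skipped.
            failed.append(crash_id)
    return failed
```

Returning the failed IDs rather than aborting lets a long-running job finish the day and retry only the crashes that could not be decoded.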

SVN location for code/script:
svn+ssh://svn.mozilla.org/moco/metrics/hadoop/crash-reports/
This will be done in prod *after* the 1.7 push
Whiteboard: ETA - 7/16 need some spare cycles on my (aphadke's) end to run the job/s to completion
Whiteboard: ETA - 7/16 need some spare cycles on my (aphadke's) end to run the job/s to completion → ETA - 7/16
Whiteboard: ETA - 7/16 → ETA - 7/16, confirm hbase stability
Laura, should I close this bug?
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Component: Socorro → General
Product: Webtools → Socorro