Write Telemetry view (Scala)

Status

RESOLVED WONTFIX
2 years ago
2 years ago

People

(Reporter: peterbe, Assigned: peterbe)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

2 years ago
Once we've uploaded crashes to S3 specifically for ingestion into Telemetry, we also need to run a piece of Scala code that builds up a struct.

The work is to copy https://github.com/mozilla/telemetry-batch-view/blob/master/src/main/scala/com/mozilla/telemetry/views/MainSummaryView.scala and call it something like CrashView.scala.

First we need to rewrite the buildSchema function [0] so that it generates a struct based on *our* JSON Schema (processed_crash.json). Then we also need to rewrite the messageToRow function [1].

[0] https://github.com/mozilla/telemetry-batch-view/blob/a401112b72e1cf92c47083ea76bd67afeef6c71a/src/main/scala/com/mozilla/telemetry/views/MainSummaryView.scala#L378
[1] https://github.com/mozilla/telemetry-batch-view/blob/a401112b72e1cf92c47083ea76bd67afeef6c71a/src/main/scala/com/mozilla/telemetry/views/MainSummaryView.scala#L141
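
As a rough illustration of what a rewritten buildSchema could look like, here is a minimal sketch that maps a JSON Schema onto a Spark SQL StructType. It is not taken from MainSummaryView.scala. The JsonSchema case classes are a hypothetical, simplified stand-in for the parsed processed_crash.json, the type mapping (integer to LongType, number to DoubleType) is an assumption, and every field is treated as nullable.

```scala
import org.apache.spark.sql.types._

// Hypothetical, simplified model of a parsed JSON Schema node.
// These are illustration-only types, not part of Spark or telemetry-batch-view.
sealed trait JsonSchema
case object JString extends JsonSchema
case object JInteger extends JsonSchema
case object JNumber extends JsonSchema
case object JBoolean extends JsonSchema
final case class JArray(items: JsonSchema) extends JsonSchema
final case class JObject(properties: Seq[(String, JsonSchema)]) extends JsonSchema

object CrashSchema {
  // Recursively map one JSON Schema node to a Spark SQL DataType.
  // Assumed mapping: integer -> LongType, number -> DoubleType.
  def toDataType(schema: JsonSchema): DataType = schema match {
    case JString       => StringType
    case JInteger      => LongType
    case JNumber       => DoubleType
    case JBoolean      => BooleanType
    case JArray(items) => ArrayType(toDataType(items), containsNull = true)
    case JObject(props) =>
      StructType(props.map { case (name, child) =>
        StructField(name, toDataType(child), nullable = true)
      })
  }

  // The top level of the schema is an object, so the result is a StructType.
  def buildSchema(root: JObject): StructType =
    StructType(root.properties.map { case (name, child) =>
      StructField(name, toDataType(child), nullable = true)
    })
}
```

For example, with made-up field names, `CrashSchema.buildSchema(JObject(Seq("uuid" -> JString, "uptime" -> JInteger)))` yields a struct with a string column and a long column. A real version would first have to parse processed_crash.json into some such representation and decide how to handle union types like ["string", "null"].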
(Assignee)

Comment 1

2 years ago
Here are some notes about the latest development on this:
https://public.etherpad-mozilla.org/p/socorro-to-telemetry-july2016
(Assignee)

Updated

2 years ago
Depends on: 1299183

Comment 2

2 years ago
I don't think we'll need this after all. Since the data is already in the desired structure and we're not transforming anything, we just need the Spark SQL struct definition; Spark can then convert the data from JSON to Parquet automatically.

See proof-of-concept at https://gist.github.com/mreid-moz/31ac995e3180c156db61e5f1c0ee745b
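
For readers who don't want to open the gist, the approach amounts to something like the following minimal sketch. It is not copied from the linked proof-of-concept. The three fields are an assumed, illustrative subset (the real struct would mirror processed_crash.json), the input and output paths are placeholders passed on the command line, and it assumes Spark 2.x or later launched via spark-submit.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object CrashJsonToParquet {
  // Illustrative subset only; the field names and types here are assumptions.
  val crashSchema: StructType = StructType(Seq(
    StructField("uuid", StringType, nullable = true),
    StructField("product", StringType, nullable = true),
    StructField("uptime", LongType, nullable = true)
  ))

  def main(args: Array[String]): Unit = {
    require(args.length == 2, "usage: CrashJsonToParquet <inputPath> <outputPath>")
    val inputPath = args(0)   // placeholder, e.g. a directory of JSON crash files
    val outputPath = args(1)  // placeholder, destination for the Parquet files

    // No master is set here; spark-submit is expected to supply it.
    val spark = SparkSession.builder().appName("CrashJsonToParquet").getOrCreate()
    try {
      spark.read
        .schema(crashSchema)   // use the explicit struct instead of inferring one
        .json(inputPath)       // parse JSON records into rows
        .write
        .mode("overwrite")
        .parquet(outputPath)   // no per-record transformation needed
    } finally {
      spark.stop()
    }
  }
}
```

Passing the schema explicitly keeps the Parquet column types stable from run to run, since Spark would otherwise infer them from whatever records happen to be in the input.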
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX

Updated

2 years ago
Depends on: 1314252