Closed Bug 1429955, opened 8 years ago, closed 8 years ago

Create a prototype ingestion pipeline in gcp

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: relud, Assigned: relud)

Daniel Thorn [:relud]

Assignee

Description

•

8 years ago

at a glance it looks like this would be a good pipeline:

app engine (1) -> cloud pub/sub -> app engine (2) -> cloud storage -> cloud function -> big query

then set up big query to be accessible in redash.

app engine (1) validates json schema and forwards to pub/sub

cloud pub/sub makes sure we can connect other data warehouses and real-time analytics

app engine (2) batches up incoming data and writes to cloud storage (in avro format maybe)

cloud storage makes sure we don't run into limits when inserting to big query

cloud function executes on a regular basis and inserts data from cloud storage to data partitioned big query tables
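A rough local sketch of the validate, queue, and batch stages described above, using only the Python standard library. Everything here is a stand-in and not the prototype's actual code: the required fields are hypothetical (the bug does not give the schema), a `deque` takes the place of cloud pub/sub, a list of strings takes the place of objects in cloud storage, and newline-delimited JSON is used where the description suggests avro might be.

```python
import json
from collections import deque

# Hypothetical required fields; the bug does not specify the real schema.
REQUIRED_FIELDS = {"client_id": str, "submission_date": str, "payload": dict}


def validate(raw):
    """Stand-in for app engine (1): parse a message and check required fields."""
    doc = json.loads(raw)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(doc, dict):
        raise ValueError("message must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(doc.get(name), expected):
            raise ValueError("missing or invalid field: " + name)
    return doc


class Batcher:
    """Stand-in for app engine (2): buffer messages, flush them as one blob."""

    def __init__(self, max_messages):
        self.max_messages = max_messages
        self.buffer = []
        self.flushed = []  # stands in for objects written to cloud storage

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.max_messages:
            self.flush()

    def flush(self):
        if self.buffer:
            # Newline-delimited JSON here; avro is only a possibility in the bug.
            lines = [json.dumps(d, sort_keys=True) for d in self.buffer]
            self.flushed.append("\n".join(lines))
            self.buffer = []


def run_pipeline(raw_messages, max_messages=2):
    """Validate each message, queue the valid ones, then batch them."""
    queue = deque()  # stands in for cloud pub/sub
    rejected = []
    for raw in raw_messages:
        try:
            queue.append(validate(raw))
        except ValueError as err:
            rejected.append(str(err))
    batcher = Batcher(max_messages)
    while queue:
        batcher.add(queue.popleft())
    batcher.flush()  # write out any partial final batch
    return batcher.flushed, rejected
```

Batching before the load step is what keeps the number of big query inserts small: each flushed blob becomes one load rather than one insert per message.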

Daniel Thorn [:relud]

Assignee

Comment 1

•

8 years ago

date* partitioned big query tables
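For illustration, a minimal sketch of how batched records might be routed to date partitions before loading. It assumes each record carries an ISO date in a hypothetical `submission_date` field, and it uses BigQuery's partition decorator form (`table$YYYYMMDD`) to name the target partition; the table name is made up.

```python
from collections import defaultdict
from datetime import date


def partition_target(table, day):
    """Return the partition decorator for a table and day, e.g. telemetry$20180115."""
    return "{}${}".format(table, day.strftime("%Y%m%d"))


def group_by_partition(table, records, date_field="submission_date"):
    """Group records by the date partition they should be loaded into."""
    groups = defaultdict(list)
    for record in records:
        # date_field is a hypothetical field name holding an ISO date string
        day = date.fromisoformat(record[date_field])
        groups[partition_target(table, day)].append(record)
    return dict(groups)
```

Each key in the result names one partition, so the periodic load job can issue one load per day of data.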

Daniel Thorn [:relud]

Assignee

Updated

•

8 years ago

Priority: P2 → P1

Daniel Thorn [:relud]

Assignee

Comment 2

•

8 years ago

reference material https://cloud.google.com/solutions/architecture/optimized-large-scale-analytics-ingestion

Daniel Thorn [:relud]

Assignee

Updated

•

8 years ago

Points: 2 → 3

Daniel Thorn [:relud]

Assignee

Comment 3

•

8 years ago

Recording work here: https://docs.google.com/document/d/1ENoZqLYBl-EyS9b8dZ-QDpsYLBZfGLKRdYO4NbpfMYk/edit#

Git repo here: https://github.com/relud/telemetry-sample

Daniel Thorn [:relud]

Assignee

Updated

•

8 years ago

Blocks: 1431769

Daniel Thorn [:relud]

Assignee

Updated

•

8 years ago

Blocks: 1431770

Daniel Thorn [:relud]

Assignee

Comment 4

•

8 years ago

The Google doc in comment 3 is up to date with current work on this. Resolving as fixed. Future work will be done in other bugs.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

3 years ago

Component: Pipeline Ingestion → General

Categories

(Data Platform and Tools :: General, enhancement, P1)

Security

(public)
