Closed Bug 1628539 Opened 2 years ago Closed 2 years ago

Create beam job for decrypting pioneer v2 documents

Categories

(Data Platform and Tools :: Pipeline Ingestion, task, P1)

Points: 3

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: amiyaguchi, Assigned: amiyaguchi)

References

(Blocks 1 open bug)

Attachments

(3 files, 1 obsolete file)

Pioneer v2 requires an extra decoding step in the gcp-ingestion/ingestion-beam job. The new transform decrypts messages that were encrypted using JOSE, decompresses the payload, and then passes the result to the rest of the ingestion stack.
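
To illustrate the order of operations described above, here is a minimal sketch in plain Java. It is not the ingestion-beam implementation. The class and method names (PioneerDecodeSketch, decode, maybeDecompress, reverse) are hypothetical. Gzip is assumed as the compression format, since the bug does not name one. The decryption step is a pluggable function, and the byte reversal used in main is only a stand-in for a real JOSE decryption, which would need a JOSE library and a private key.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class PioneerDecodeSketch {

  /** Gzip-compress bytes; used here only to build sample input. */
  static byte[] gzip(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(data);
    }
    return out.toByteArray();
  }

  /** Decompress if the data starts with the gzip magic bytes (0x1f 0x8b); otherwise return it unchanged. */
  static byte[] maybeDecompress(byte[] data) throws IOException {
    boolean isGzip = data.length >= 2
        && (data[0] & 0xff) == 0x1f
        && (data[1] & 0xff) == 0x8b;
    if (!isGzip) {
      return data;
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = gz.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    }
    return out.toByteArray();
  }

  /** Stand-in "cipher" for the demo: reverses the bytes. This is NOT JOSE. */
  static byte[] reverse(byte[] in) {
    byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; i++) {
      out[i] = in[in.length - 1 - i];
    }
    return out;
  }

  /** Decrypt first, then optionally decompress, then hand the bytes to the caller. */
  static byte[] decode(byte[] ciphertext, UnaryOperator<byte[]> decrypt, boolean decompress)
      throws IOException {
    byte[] decrypted = decrypt.apply(ciphertext);
    return decompress ? maybeDecompress(decrypted) : decrypted;
  }

  public static void main(String[] args) throws IOException {
    byte[] payload = "{\"hello\":\"pioneer\"}".getBytes(StandardCharsets.UTF_8);
    // Sender side of the demo: compress, then "encrypt" with the stand-in cipher.
    byte[] ciphertext = reverse(gzip(payload));
    // Receiver side: decrypt, then decompress.
    byte[] decoded = decode(ciphertext, PioneerDecodeSketch::reverse, true);
    System.out.println(new String(decoded, StandardCharsets.UTF_8));
  }
}
```

Passing false for the decompress argument returns the decrypted bytes untouched, which is how a flag that disables decompression might behave in a sketch like this one.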

Assignee: nobody → amiyaguchi
Depends on: 1630121
Blocks: 1634552
Priority: -- → P1
Attachment #9139357 - Attachment is obsolete: true
Blocks: 1631849

The decoder can be run with the new Pioneer options: --pioneerEnabled, --pioneerMetadataLocation, --pioneerKmsEnabled, and --pioneerDecompressPayload. The benchmark script demonstrates their use and verifies that the decoder runs in a batch context on Dataflow. For example:

./bin/mvn compile exec:java -Dexec.mainClass=com.mozilla.telemetry.Decoder -Dexec.args="\
    --runner=Dataflow \
    --profilingAgentConfiguration='{\"APICurated\": true}' \
    --project=$project \
    --autoscalingAlgorithm=NONE \
    --workerMachineType=n1-standard-1 \
    --gcpTempLocation=$bucket/tmp \
    --numWorkers=2 \
    --pioneerEnabled=true \
    --pioneerMetadataLocation=$bucket/$prefix/metadata/metadata.json \
    --pioneerKmsEnabled=false \
    --pioneerDecompressPayload=false \
    --geoCityDatabase=$bucket/$prefix/metadata/GeoLite2-City.mmdb \
    --geoCityFilter=$bucket/$prefix/metadata/cities15000.txt \
    --schemasLocation=$bucket/$prefix/metadata/schemas.tar.gz \
    --inputType=file \
    --input=$bucket/$prefix/input/ciphertext/'part-*' \
    --outputType=file \
    --output=$bucket/$prefix/output/ciphertext/ \
    --errorOutputType=file \
    --errorOutput=$bucket/$prefix/error/ciphertext/ \
"
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED