Firefox Engagement Ratio Spark job is failing

RESOLVED FIXED

Status

Product: Cloud Services
Component: Metrics: Pipeline
Priority: P1
Severity: normal
Status: RESOLVED FIXED
Reported: a year ago
Modified: a year ago

People

(Reporter: mreid, Assigned: mreid)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

a year ago
Figure out why the job is failing and fix it.
(Assignee)

Comment 1

a year ago
The cause looks somewhat obscure from the log:

...snip...
Caused by: java.io.FileNotFoundException: /mnt/yarn/usercache/hadoop/appcache/application_1462853298787_0001/blockmgr-f4358145-c989-417a-bcce-d0bfcc1a1371/09/shuffle_23_12_0.index (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:146)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getSortBasedShuffleBlockData(ExternalShuffleBlockResolver.java:275)
	... 27 more

	at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:51)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
Assignee: nobody → mreid
Points: --- → 1
Priority: -- → P1
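
When a Spark log is this noisy, it can help to pull out only the "Caused by:" lines before reading the full trace. The following is a minimal sketch in plain Python, not part of the original job; the function name `root_causes` and the regular expression are illustrative assumptions, and the sample input is copied from the trace in Comment 1.

```python
import re

# Matches the "Caused by: <exception class>: <message>" lines that Java
# prints for each nested cause in a stack trace. (Illustrative pattern,
# not taken from the original job.)
CAUSED_BY = re.compile(r"^\s*Caused by: (?P<exc>[\w.$]+)(?:: (?P<msg>.*))?$")


def root_causes(log_text):
    """Return (exception_class, message) pairs for every 'Caused by:' line."""
    causes = []
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            causes.append((match.group("exc"), match.group("msg") or ""))
    return causes


# Sample input: lines copied from the trace in Comment 1.
sample = "\n".join([
    "...snip...",
    "Caused by: java.io.FileNotFoundException: "
    "/mnt/yarn/usercache/hadoop/appcache/application_1462853298787_0001/"
    "blockmgr-f4358145-c989-417a-bcce-d0bfcc1a1371/09/shuffle_23_12_0.index "
    "(No such file or directory)",
    "\tat java.io.FileInputStream.open(Native Method)",
])

for exc, msg in root_causes(sample):
    print(exc, "|", msg)
```

Here the only cause reported is the missing shuffle index file. That describes the symptom of the failed shuffle fetch rather than why the file was missing, which is consistent with the log looking obscure.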
(Assignee)

Comment 2

a year ago
There was a problem with the job that converts the Heka Executive Summary stream to Parquet, so the underlying derived dataset was missing some data.

I've changed the engagement ratio job to read from the main_summary dataset instead:
https://github.com/mozilla-services/data-pipeline/pull/205

The dashboard data has already been fixed and backfilled.
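
To illustrate why missing rows in the source dataset matter for this dashboard, here is a minimal sketch in plain Python. It assumes the engagement ratio is daily active users divided by monthly active users over a 28-day window; this bug does not define the metric, and the `(client_id, activity_date)` record shape, the function name, and the sample data are all hypothetical rather than taken from the actual job or from main_summary.

```python
from datetime import date, timedelta


def engagement_ratio(activity, day, window_days=28):
    """Return DAU / MAU for `day`, or None if the window has no activity.

    activity: iterable of (client_id, activity_date) pairs (hypothetical schema).
    DAU counts distinct clients active on `day`; MAU counts distinct clients
    active in the `window_days` days ending on `day`, inclusive.
    """
    window_start = day - timedelta(days=window_days - 1)
    dau = set()
    mau = set()
    for client_id, seen in activity:
        if window_start <= seen <= day:
            mau.add(client_id)
            if seen == day:
                dau.add(client_id)
    if not mau:
        return None  # ratio is undefined without any active clients
    return len(dau) / len(mau)


# Made-up sample data for illustration only.
activity = [
    ("a", date(2016, 5, 9)),
    ("b", date(2016, 5, 9)),
    ("b", date(2016, 5, 1)),
    ("c", date(2016, 4, 20)),
    ("d", date(2016, 4, 1)),  # falls outside the 28-day window
]
ratio = engagement_ratio(activity, date(2016, 5, 9))
print(ratio)
```

Under these assumptions, rows missing from the source dataset shrink either set of clients and so skew the ratio, which is why the dashboard data needed a backfill once the source was switched.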
(Assignee)

Updated

a year ago
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED