Closed
Bug 1412898
Opened 7 years ago
Closed 7 years ago
ExperimentAnalysisView failed over the weekend with java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
Categories
(Data Platform and Tools :: General, enhancement, P1)
Data Platform and Tools
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bugzilla, Assigned: bugzilla)
Details
Attachments
(1 file)
Working on repro, but I suspect we have a large unbalanced partition (and therefore we may need to force repartitioning always instead of skipping in some circumstances.) Full stack trace: ShuffleMapStage 16 (collect at MetricAnalyzer.scala:57) failed in 0.218 s due to Job aborted due to stage failure: Task 293 in stage 16.0 failed 4 times, most recent failure: Lost task 293.3 in stage 16.0 (TID 8571, ip-172-31-9-158.us-west-2.compute.internal, executor 5): java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:869) at org.apache.spark.storage.DiskStore$$anonfun$getBytes$4.apply(DiskStore.scala:125) at org.apache.spark.storage.DiskStore$$anonfun$getBytes$4.apply(DiskStore.scala:124) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1337) at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:126) at org.apache.spark.storage.BlockManager.getLocalValues(BlockManager.scala:520) at org.apache.spark.storage.BlockManager.get(BlockManager.scala:693) at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:753) at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
Comment 1•7 years ago
|
||
Making the job much faster made the issue go away: https://github.com/mozilla/telemetry-batch-view/pull/333
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•2 years ago
|
Component: Datasets: Experiments → General
You need to log in
before you can comment on or make changes to this bug.
Description
•