Closed Bug 1182499 Opened 9 years ago Closed 9 years ago

Reduce Spark jobs memory pressure

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rvitillo, Assigned: rvitillo)

Details

(Whiteboard: spark [unifiedTelemetry])

PySpark jobs can run out of memory when the JVM and the Python worker processes don't play nicely together.
Priority: -- → P1
Whiteboard: spark [unifiedTelemetry]
As the JVM doesn't release unused memory back to the OS even after a GC run, I had to reduce the maximum heap size to avoid starving the rest of the system.
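
For reference, this is roughly the kind of executor sizing involved; the values and the SparkConf-based setup below are illustrative, not the exact settings deployed on the cluster. The idea is to cap the JVM heap via spark.executor.memory so each node keeps headroom for the Python workers and the OS:

from pyspark import SparkConf, SparkContext

# Illustrative values only: shrink the JVM heap per executor so the Python
# workers and the rest of the system keep enough free memory on each node.
conf = (SparkConf()
        .set("spark.executor.memory", "6g")                 # max JVM heap per executor (reduced)
        .set("spark.yarn.executor.memoryOverhead", "2048")  # off-heap headroom per executor, in MB
        .set("spark.python.worker.memory", "1g"))           # per Python worker before spilling to disk
sc = SparkContext(conf=conf)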

I have also added some configuration parameters to YARN so that it doesn't kill an application that consumes more virtual or physical memory than it's supposed to. Since we run a single application on the YARN cluster, this is safe to do.
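
The parameters referred to are most likely the NodeManager memory enforcement checks. A sketch of what the overrides could look like, expressed as an EMR-style yarn-site classification (the launch-time structure is hypothetical; only the two yarn.nodemanager.* property names are standard YARN settings):

# Hypothetical: how the overrides could be passed at cluster launch time.
yarn_site_overrides = {
    "Classification": "yarn-site",
    "Properties": {
        # Don't kill containers that exceed their virtual memory allocation.
        "yarn.nodemanager.vmem-check-enabled": "false",
        # Don't kill containers that exceed their physical memory allocation.
        "yarn.nodemanager.pmem-check-enabled": "false",
    },
}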

Furthermore, thanks to the reduced memory pressure, I was able to increase the chunk size for partial reads from S3 to 100 MB, which considerably speeds up the initial phase of fetching and parsing submissions.
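
For illustration, a minimal sketch of what 100 MB ranged reads from S3 look like with boto3; the bucket/key handling and the read loop are placeholders, not the pipeline's actual reader:

import boto3

CHUNK_SIZE = 100 * 1024 * 1024  # 100 MB per ranged GET

def read_in_chunks(bucket, key):
    # Yield the object's bytes in CHUNK_SIZE-sized ranged GETs, so a large
    # file can be fetched and parsed incrementally instead of all at once.
    s3 = boto3.client("s3")
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
    start = 0
    while start < size:
        end = min(start + CHUNK_SIZE, size) - 1
        resp = s3.get_object(Bucket=bucket, Key=key,
                             Range="bytes={}-{}".format(start, end))
        yield resp["Body"].read()
        start = end + 1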

Finally, I tested the new settings with the v4 aggregation job. Previously we couldn't run the aggregator over a month of data without hitting OOM errors; now it runs flawlessly.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard