Bug 1182499 — Reduce Spark jobs memory pressure
Status: CLOSED, RESOLVED FIXED (opened 9 years ago, closed 9 years ago)
Product: Cloud Services Graveyard :: Metrics: Pipeline (defect, P1)
Tracking: not tracked
Reporter: rvitillo; Assignee: rvitillo
Whiteboard: spark [unifiedTelemetry]
PySpark jobs can run out of memory when the Java and Python processes don't play nice together.
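On each worker, the JVM heap and the PySpark worker processes have to share the machine's RAM; if the heap is sized to the whole box, the Python side is starved. A minimal sketch of that sizing arithmetic (the helper name, fraction, and numbers are illustrative, not from this bug):

```python
def split_executor_memory(total_mb, python_fraction=0.3, os_reserve_mb=1024):
    """Split a worker's RAM between the JVM heap and Python workers.

    Hypothetical helper: reserves os_reserve_mb for the OS, then gives
    python_fraction of the remainder to the Python processes and the
    rest to the JVM heap (what spark.executor.memory would be set to).
    """
    usable = total_mb - os_reserve_mb
    python_mb = int(usable * python_fraction)
    jvm_heap_mb = usable - python_mb
    return jvm_heap_mb, python_mb

# e.g. a 31 GiB worker: ~21 GiB heap, ~9 GiB left for Python workers
print(split_executor_memory(31744))
```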
Updated • 9 years ago
Priority: -- → P1
Whiteboard: spark [unifiedTelemetry]

Comment 1 • 9 years ago (Assignee)
As the JVM doesn't release unused memory back to the OS even after the GC runs, I had to reduce the maximum heap size to avoid starving the rest of the system. I have also added some YARN configuration parameters so that it doesn't kill an application that consumes more virtual, or physical, memory than it is supposed to. As we are running a single application on the YARN cluster, this is safe to do. Furthermore, thanks to the reduced memory pressure, I was able to increase the chunk size for partial reads from S3 to 100 MB, which speeds up the initial phase of fetching and parsing submissions considerably. Finally, I tested the new settings with the v4 aggregation job. Previously we couldn't run the aggregator over a month of data without hitting OOM errors; now it works flawlessly.
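The YARN settings described are presumably the standard NodeManager memory-enforcement switches; a sketch of what that change would look like (property names are the stock Hadoop ones, and disabling them is only safe here because the cluster runs a single application):

```xml
<!-- yarn-site.xml: relax container memory enforcement (illustrative) -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <!-- don't kill containers that exceed their physical-memory allocation -->
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <!-- don't kill containers that exceed their virtual-memory allocation -->
  <value>false</value>
</property>
```

The reduced JVM heap would be set separately, e.g. via `spark.executor.memory` in spark-defaults.conf; the exact values used in this deployment aren't given in the bug.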
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated • 6 years ago
Product: Cloud Services → Cloud Services Graveyard