Closed Bug 1308169 Opened 8 years ago Closed 8 years ago

spark-csv package should be included by default in our environment

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect)

Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rvitillo, Unassigned)

References

Details

User Story

The spark-csv package [1] should be included by default in our Python Spark environment. This was previously the case but some recent changes might have reverted that. The change should be applied to both Airflow and ATMO jobs.

[1] https://github.com/databricks/spark-csv
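As a sketch of what having spark-csv available by default enables, the Spark 1.x DataFrame reader can load a CSV through the package's data source. This assumes a `SparkContext` named `sc` is already available (as in an ATMO notebook); the S3 path is hypothetical.

```python
# Sketch, assuming spark-csv is on the classpath and `sc` exists (Spark 1.x).
from pyspark.sql import SQLContext

sqlContext = SQLContext(sc)
df = (sqlContext.read
      .format("com.databricks.spark.csv")      # data source provided by spark-csv
      .option("header", "true")                # first line is a header row
      .option("inferSchema", "true")           # infer column types from the data
      .load("s3://example-bucket/data.csv"))   # hypothetical input path
df.printSchema()
```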
No description provided.
Blocks: 1283447
User Story: (updated)
Issue has not been resolved. Currently, the following commands work, and spark-csv can be used to load files:

pyspark --packages com.databricks:spark-csv_2.10:1.2.0
spark-shell --packages com.databricks:spark-csv_2.10:1.2.0

But Jupyter still doesn't load the package correctly. Others have run into this issue, but the solutions suggested there don't seem to work (https://github.com/databricks/spark-csv/issues/247).
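One workaround commonly suggested for the Jupyter case is to pass the package through the `PYSPARK_SUBMIT_ARGS` environment variable before the kernel starts, so that the Py4J gateway is launched with the package on the classpath. This is a sketch of that approach, not a confirmed fix for this environment:

```shell
# Config fragment: set before launching the Jupyter kernel / notebook server,
# so the JVM backing pyspark is started with spark-csv available.
export PYSPARK_SUBMIT_ARGS="--packages com.databricks:spark-csv_2.10:1.2.0 pyspark-shell"
```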
Mauro, any thoughts about how to fix this?
Flags: needinfo?(mdoglio)
Assignee: fbertsch → nobody
We are going to close this bug as not-fixed, as this functionality will be available in Spark 2.0 (available in < month). If you don't need this job for the time being, please consider terminating it.
Flags: needinfo?(vfilippov)
(In reply to Frank Bertsch [:frank] from comment #3)
> We are going to close this bug as not-fixed, as this functionality will be
> available in Spark 2.0 (available in < month). If you don't need this job
> for the time being, please consider terminating it.

Sounds good!
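For reference, the reason the bug could be closed: Spark 2.0 absorbed spark-csv's functionality into the built-in DataFrame reader, so no extra package is needed. A minimal sketch of the equivalent Spark 2.0 code, assuming a `SparkSession` is created as usual (the input path is hypothetical):

```python
# Sketch: Spark 2.0+ has a native CSV data source, replacing spark-csv.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-example").getOrCreate()
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3://example-bucket/data.csv"))   # hypothetical input path
```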
Flags: needinfo?(vfilippov)
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(mdoglio)
Resolution: --- → WONTFIX
Product: Cloud Services → Cloud Services Graveyard