SparkSession does not have access to Hive Metastore tables


Status

P2
normal
RESOLVED WORKSFORME
Reported: 2 years ago
Last modified: a year ago

People

(Reporter: frank, Assigned: whd)

Tracking

Details

(Whiteboard: [SvcOps])

(Reporter)

Description

2 years ago
The new way of using Spark is via a SparkSession:
```
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("my-app")
  .getOrCreate()
```

Unfortunately, a session created this way cannot access the Hive tables. `spark.sql("SELECT * FROM main_summary LIMIT 1")` errors out.

Updated

2 years ago
Points: --- → 1
Priority: -- → P1
Whiteboard: [SvcOps]

Updated

2 years ago
Points: 1 → 2
(Assignee)

Updated

2 years ago
Assignee: nobody → whd
(Assignee)

Comment 1

2 years ago
(In reply to Frank Bertsch [:frank] from comment #0)
> Unfortunately, doing it this way the Hive tables are not accessible.
> `spark.sql("SELECT * FROM main_summary LIMIT 1")` errors out.

Can you be more specific about the environment you are using and the error you are seeing? I just tried this from an interactive atmo cluster (emr 5.2.1) using spark-shell and it worked for me. I also saw enableHiveSupport at https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html but didn't need to add it to get your example to work.
Flags: needinfo?(fbertsch)
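For reference, `enableHiveSupport()` is the documented `SparkSession.Builder` method for opting in to Hive Metastore access. A minimal sketch (the app name and table name are taken from the example above; whether the call is needed depends on how the cluster's Spark build is configured):

```scala
import org.apache.spark.sql.SparkSession

// Explicitly enable Hive support so spark.sql() can resolve
// tables registered in the Hive Metastore.
val spark = SparkSession
  .builder()
  .appName("my-app")
  .enableHiveSupport() // requires a Spark build with Hive classes on the classpath
  .getOrCreate()

spark.sql("SELECT * FROM main_summary LIMIT 1").show()
```

On EMR, spark-shell typically starts a session with Hive support already enabled, which would explain why the example worked there without the explicit call.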

Updated

2 years ago
Priority: P1 → P2
(Reporter)

Comment 2

a year ago
Works for me now.
Status: NEW → RESOLVED
Last Resolved: a year ago
Flags: needinfo?(fbertsch)
Resolution: --- → WORKSFORME