Allow databricks clusters to access the tiles redshift database
Categories
(Data Platform and Tools Graveyard :: Operations, defect, P1)
Tracking
(Not tracked)
People
(Reporter: emtwo, Assigned: robotblake)
References
Details
(Whiteboard: [DataOps])
Comment 1•8 years ago
Reporter
Comment 2•8 years ago
Assignee
Comment 3•6 years ago
Comment 4•6 years ago
Comment 5•6 years ago
Comment 6•6 years ago
Comment 7•6 years ago
Assignee
Comment 8•6 years ago
Comment 9•6 years ago
Comment 10•6 years ago
Assignee
Comment 12•6 years ago
Assignee
Comment 13•6 years ago
Assignee
Comment 14•6 years ago
Comment 15•6 years ago
Assignee
Comment 16•6 years ago
Comment 17•6 years ago
Assignee
Comment 18•6 years ago
Comment 19•6 years ago
Comment 20•6 years ago
Hi, it seems that this workaround for accessing tables in the Tiles DB isn't working anymore (it was working previously).
- Example of an attempt to use it: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/51576/command/51578
- Error seen: Caused by: org.postgresql.util.PSQLException: ERROR: relation "assa_impresson_stats_daily" does not exist
I know there was a reset of AWS credentials recently. Is this related to it?
Comment 21•6 years ago
:shong - I think that's just a typo in the table name, missing the i in impression. Can you fix that and confirm?
Comment 22•6 years ago
Hi Jon,
Yes, you're right. I had a typo in that example, sorry about that!
So I'm able to access the Tiles DBs using this methodology, but I'm running into issues when the data size gets moderately large (small enough that Spark should normally be able to handle it easily). Example below:
I'm able to work around it for now by limiting my columns, but this could become a limiting issue sooner rather than later (a number of things related to Ridgeline, like snippets data, are contained in Tiles). We might need a long-term solution that gives Databricks direct access to the database, like we do with the regular Telemetry data.
Thank you!
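The column-limiting workaround mentioned above could be sketched as follows. This is an illustration only, not the notebook's actual code: the helper `column_limited_table` and the column names `tile_id` and `impressions` are hypothetical. It relies on Spark's JDBC reader accepting a parenthesized subquery with an alias as the `dbtable` value, so only the selected columns are pulled from Redshift.

```python
def column_limited_table(table, columns):
    """Build a value for the Spark JDBC `dbtable` option that selects only `columns`.

    Hypothetical helper: the subquery is executed by the database, so the
    unselected columns are never transferred to the cluster.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(columns)
    # The JDBC source expects a subquery to be parenthesized and aliased.
    return f"(SELECT {column_list} FROM {table}) AS limited"


# Column names below are placeholders, not taken from the real table schema.
dbtable = column_limited_table("assa_impression_stats_daily", ["tile_id", "impressions"])
print(dbtable)
```

In a notebook, the resulting string would be passed as `.option("dbtable", dbtable)` on `spark.read.format("jdbc")`.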
Assignee
Comment 23•6 years ago
The real solution here is probably to migrate all of this data into BigQuery, :jbuck is that something we're looking at doing at some point?
Comment 24•6 years ago
(In reply to Blake Imsland [:robotblake] from comment #23)
The real solution here is probably to migrate all of this data into BigQuery, :jbuck is that something we're looking at doing at some point?
Yes, we've already migrated part of the AS telemetry to BQ, the rest will be done shortly in Q3&Q4.
Assignee
Comment 25•6 years ago
Excellent! 👏
Comment 26•6 years ago
:shong - Let's work through this issue with Redshift in a new bug. I tried accessing that notebook, but it says I don't have View permissions on it.
Comment 27•6 years ago
Excellent. Thank you everyone.
Yes, I see that it's restricted, oversight on my part.
- Su
Comment 28•5 years ago
Update: the Tiles access methodology provided by Blake in comment 16 has been deprecated (the password will no longer work).
Going forward, use the new method based on dbutils secrets, provided below:
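The snippet originally attached to this comment is not preserved here. As a rough sketch only, reading Tiles through Databricks secrets might look like the following, assuming a secret scope named `tiles-redshift` with keys `jdbc-url`, `username`, and `password`. All of these names are hypothetical, as is the helper `tiles_jdbc_options`. The `FakeSecrets` class is a local stand-in so the sketch runs outside Databricks.

```python
def tiles_jdbc_options(secrets, scope="tiles-redshift"):
    """Assemble Spark JDBC reader options from a secret store.

    `secrets` is expected to behave like `dbutils.secrets`, i.e. expose
    get(scope=..., key=...). The scope and key names are assumptions.
    """
    return {
        "url": secrets.get(scope=scope, key="jdbc-url"),
        "user": secrets.get(scope=scope, key="username"),
        "password": secrets.get(scope=scope, key="password"),
        "driver": "org.postgresql.Driver",
    }


class FakeSecrets:
    """Local stand-in for dbutils.secrets, for illustration only."""

    def __init__(self, values):
        self._values = values

    def get(self, scope, key):
        return self._values[(scope, key)]


fake = FakeSecrets({
    ("tiles-redshift", "jdbc-url"): "jdbc:postgresql://example.invalid:5439/tiles",
    ("tiles-redshift", "username"): "reader",
    ("tiles-redshift", "password"): "not-a-real-password",
})
options = tiles_jdbc_options(fake)
print(sorted(options))

# In a Databricks notebook, where `spark` and `dbutils` are predefined:
# df = (spark.read.format("jdbc")
#       .options(**tiles_jdbc_options(dbutils.secrets))
#       .option("dbtable", "assa_impression_stats_daily")
#       .load())
```

Keeping the credentials in a secret scope means the password never appears in notebook source, so a later credential reset only requires updating the secret.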