Please disable the Databricks query watchdog globally on all clusters
Categories
(Data Platform and Tools Graveyard :: Operations, enhancement)
Tracking
(Not tracked)
People
(Reporter: tdsmith, Assigned: jason)
References
Details
(Whiteboard: [DataOps])
Routine operations on main_summary fail by default on Databricks because the query watchdog thinks they touch too many partitions. It is not possible to set the query watchdog partition limit high enough to accommodate all partitions of main_summary. A real query would rarely touch every partition of main_summary, but real queries do seem to need to enumerate them, and the watchdog prevents that.
This causes a lot of user confusion and drives people towards unnecessary workarounds. Disabling the watchdog is the most correct option -- spark.conf.set('spark.databricks.queryWatchdog.enabled', False) is a pinned snippet in the internal data science Slack channel and it's become a daily incantation for us.
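For reference, a minimal sketch of that per-session workaround wrapped in a helper, so the setting can be applied and read back in one step. The helper name `disable_query_watchdog` is hypothetical; only the config key comes from the pinned snippet, and the sketch assumes it is handed the `spark` session object that Databricks notebooks predefine.

```python
def disable_query_watchdog(spark):
    """Turn off the Databricks query watchdog for the given Spark session.

    `spark` is assumed to be a SparkSession, such as the `spark` object
    that Databricks notebooks provide.
    """
    key = "spark.databricks.queryWatchdog.enabled"
    # Same setting as the pinned snippet, passed here as the string "false".
    spark.conf.set(key, "false")
    # Read the value back so the caller can see what the session reports.
    return spark.conf.get(key)


# In a notebook cell, where `spark` already exists:
# disable_query_watchdog(spark)
```

This only affects the current session, which is why it has to be repeated in every new notebook or after a cluster restart.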
If we can disable the watchdog entirely on all clusters by default, this pain point would go away (either permanently or -- if we decide that the watchdog is actually misbehaving -- until Databricks fixes its pathological behaviors).
Thanks!
Comment 1 (Assignee) • 7 years ago
I've added a cluster init script that will set 'spark.databricks.queryWatchdog.enabled' to 'false' on cluster startup. Clusters that are already running will need to be restarted. I've only restarted 'shared_serverless_python3' since it was inactive.
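The init script itself is not shown in this bug. As a rough sketch only, the Python below renders one plausible form of such a script: a bash file that writes a Spark defaults file on the driver at cluster startup. The helper name `render_init_script`, the conf file path, and the `[driver] { ... }` block layout are all assumptions made for illustration and may not match what was actually deployed.

```python
# Assumed location for a custom driver defaults file; not taken from this bug.
ASSUMED_CONF_PATH = "/databricks/driver/conf/00-custom-spark-driver-defaults.conf"


def render_init_script(settings, conf_path=ASSUMED_CONF_PATH):
    """Return the text of a bash init script that writes `settings`
    into a Spark defaults file at `conf_path` (hypothetical layout)."""
    # One '"key" = "value"' line per setting, sorted for stable output.
    lines = ['  "{}" = "{}"'.format(k, v) for k, v in sorted(settings.items())]
    body = "\n".join(lines)
    return (
        "#!/bin/bash\n"
        "cat > {path} <<'EOF'\n"
        "[driver] {{\n"
        "{body}\n"
        "}}\n"
        "EOF\n"
    ).format(path=conf_path, body=body)


script = render_init_script({"spark.databricks.queryWatchdog.enabled": "false"})
print(script)
```

Because the setting is written at startup, it only takes effect on clusters started or restarted after the script is in place, which matches the note above about restarting running clusters.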
Comment 2 (Reporter) • 7 years ago
🎉🎉🎉 Thank you!