Closed Bug 1522682 Opened 7 years ago Closed 7 years ago

Please disable the Databricks query watchdog globally on all clusters

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: tdsmith, Assigned: jason)

References

Details

(Whiteboard: [DataOps])

Tim Smith (inactive) 👨‍🔬 [:tdsmith]

Reporter

Description

•

7 years ago

Routine operations on main_summary fail by default on Databricks because the query watchdog thinks they touch too many partitions. It is not possible to set the query watchdog partition limit high enough to accommodate all partitions of main_summary. A real query rarely touches every partition of main_summary, but real queries apparently still have to enumerate them, and the watchdog blocks that.

This causes a lot of user confusion and drives people towards unnecessary workarounds. Disabling the watchdog is the most correct option: spark.conf.set('spark.databricks.queryWatchdog.enabled', False) is a pinned snippet in the internal data science Slack channel, and it has become a daily incantation for us.
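For reference, the per-session workaround can be wrapped in a small helper. This is only a sketch: the function name `disable_query_watchdog` is made up for illustration, and it assumes `spark` is the session object a Databricks notebook already provides (anything exposing `conf.set` and `conf.get` will do). The value is passed as the string "false" rather than a Python bool, which should be equivalent for this setting.

```python
# Configuration key quoted in the bug description.
WATCHDOG_KEY = "spark.databricks.queryWatchdog.enabled"


def disable_query_watchdog(spark):
    """Turn the query watchdog off for this session only.

    `spark` is assumed to be an active session (hypothetical helper;
    not part of any Databricks or PySpark API).
    Returns the value read back from the session configuration.
    """
    spark.conf.set(WATCHDOG_KEY, "false")
    return spark.conf.get(WATCHDOG_KEY)
```

In a notebook this would be called as `disable_query_watchdog(spark)` before touching main_summary. It only affects the current session, which is why a cluster-wide default is being requested here.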

If we could disable the watchdog entirely on all clusters by default, this pain point would go away (either permanently or -- if we decide that the watchdog is actually misbehaving -- until Databricks fixes its pathological behaviors).

Thanks!

Jeff Klukas [:klukas] (UTC-4)

Updated

•

7 years ago

Assignee: nobody → jthomas

Whiteboard: [DataOps]

Jason Thomas [:jason]

Assignee

Comment 1

•

7 years ago

I've added a cluster init script that will set 'spark.databricks.queryWatchdog.enabled' to 'false' on cluster startup. Clusters that are already running will need to be restarted. I've only restarted 'shared_serverless_python3' since it was inactive.
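To confirm that a restarted cluster picked up the new default, something like the following could be run from a notebook. It is a minimal sketch, not the init script itself (which isn't shown in this bug): `watchdog_is_disabled` is a hypothetical helper, and `spark` is assumed to be the notebook's session object. An unset key is reported as "not disabled" rather than guessing the platform default.

```python
# Configuration key the init script is described as setting.
WATCHDOG_KEY = "spark.databricks.queryWatchdog.enabled"


def watchdog_is_disabled(spark):
    """Return True only if the session reports the watchdog as 'false'.

    `spark` is assumed to be an active session (hypothetical helper).
    A missing key yields the placeholder "unset", so it counts as
    not disabled.
    """
    value = spark.conf.get(WATCHDOG_KEY, "unset")
    return str(value).lower() == "false"
```

If `watchdog_is_disabled(spark)` returns False on a given cluster, that cluster most likely still needs the restart mentioned above.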

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → FIXED

Tim Smith (inactive) 👨‍🔬 [:tdsmith]

Reporter

Comment 2

•

7 years ago

🎉🎉🎉 Thank you!

BMO Automation

Updated

•

3 years ago

Product: Data Platform and Tools → Data Platform and Tools Graveyard

Categories

(Data Platform and Tools Graveyard :: Operations, enhancement)
