Closed
Bug 1472710
Opened 7 years ago
Closed 7 years ago
Remove old inactive dags from airflow dag list UI
Categories
(Data Platform and Tools :: General, enhancement, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bugzilla, Assigned: amiyaguchi)
Details
(Whiteboard: [DataPlatform])
Attachments
(1 file)
587 bytes, text/x-python
We have a couple of old DAGs that have been removed from the codebase but still show up in the DAG list with the tooltip "This DAG isn't available in the web server's DagBag object. It shows up in this list because the scheduler marked it as active in the metadata database."
We should figure out how to clean up after these old DAGs, especially as the list of DAGs grows beyond one page.
Updated•7 years ago (Assignee)
Assignee: nobody → amiyaguchi
Priority: P2 → P1
Comment 1•7 years ago (Assignee)
It looks like all the references to the DAG need to be removed from the metadata tables.
https://issues.apache.org/jira/browse/AIRFLOW-1002
It looks like Airflow 1.10 added functionality to do this through a REST interface; on earlier versions it needs to be done manually. I don't think I have access to the production postgres server where the airflow_db lives, but I can write the script that needs to be run.
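For reference, a minimal sketch of what a call to that REST interface might look like. The endpoint path `/api/experimental/dags/<dag_id>` is an assumption based on the experimental API in 1.10 and should be checked against the deployed version. The helper name and the base URL are hypothetical. The sketch only builds the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_dag_request(base_url, dag_id):
    """Build (but do not send) a DELETE request for one DAG.

    ASSUMPTION: the path below mirrors the Airflow 1.10 experimental API;
    verify it against the webserver you are actually running.
    """
    url = "{}/api/experimental/dags/{}".format(
        base_url.rstrip("/"), quote(dag_id, safe="")
    )
    return Request(url, method="DELETE")


# Hypothetical webserver address and DAG name, for illustration only.
req = build_delete_dag_request("http://localhost:8080/", "old_dag")
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`, plus whatever authentication the webserver requires.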
Comment 2•7 years ago (Assignee)
I don't have access to the Airflow cluster. :hwoo, can you run the attached script on the web node after checking that it looks sane? It can be run in any container that has access to the postgres database.
Flags: needinfo?(hwoo)
Comment 3•7 years ago
The script isn't working. I'm unsure how the connection is created; finding out would require a dive into the source code.
The other comments on the Stack Overflow page you referenced seem to indicate that these metadata relations exist, but I cannot find them in the postgres db when looking around.
Flags: needinfo?(hwoo)
Comment 4•7 years ago
OK, figured it out. For future reference:
The postgres metadata db is separate and is defined by AIRFLOW_DATABASE_URL. I had to create a connection called 'airflow_db_meta' via the Airflow UI at Admin -> Connections so that the script could reference it:
hook = PostgresHook(postgres_conn_id="airflow_db_meta")
Then run the script in a Python shell on the web container.
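For anyone repeating this, here is a minimal sketch of what such a cleanup could look like. It is not the attached script. The table list is an assumption about which Airflow 1.x metadata tables carry a `dag_id` column and must be checked against the real schema first. `build_delete_statements` and `purge_dag` are hypothetical helper names. The `run` argument is expected to behave like the hook's `run(sql, autocommit, parameters)` method.

```python
# ASSUMPTION: tables in an Airflow 1.x metadata db that have a dag_id column.
# Verify against the actual schema before running anything.
# The "dag" table comes last so dependent rows are removed first.
ASSUMED_TABLES = (
    "xcom", "task_instance", "sla_miss", "log", "job", "dag_run", "dag",
)


def build_delete_statements(dag_id, tables=ASSUMED_TABLES):
    """Return (sql, parameters) pairs deleting every row for dag_id."""
    # Table names come only from the fixed tuple above. The dag_id is a
    # bound parameter using the "%s" placeholder style of psycopg2.
    return [
        ("DELETE FROM {} WHERE dag_id = %s".format(table), (dag_id,))
        for table in tables
    ]


def purge_dag(run, dag_id, tables=ASSUMED_TABLES):
    """Execute each delete through `run`, committing one statement at a time."""
    for sql, params in build_delete_statements(dag_id, tables):
        run(sql, autocommit=True, parameters=params)


# In a Python shell on the web container this might be used as:
#   from airflow.hooks.postgres_hook import PostgresHook
#   hook = PostgresHook(postgres_conn_id="airflow_db_meta")
#   purge_dag(hook.run, "name_of_removed_dag")
```

Taking `run` as an argument keeps the SQL generation separate from the connection, so the statements can be printed and reviewed before anything touches the database.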
Updated•7 years ago
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•3 years ago
Component: Scheduling → General