Problem Definition
One typical Airflow usage scenario is to run a workflow repeatedly on a regular schedule, where the output data of the last iteration becomes the input data for the next iteration.
One way to do this is to keep the output data in a local file or a database table, and to read and update that data in every iteration. However, with these solutions you have to handle database connections manually, which is not always convenient.
If you are looking to pass values between different tasks within a single DAG run, have a look at Airflow XComs, which could potentially solve your problem.
Solution with Airflow
Airflow provides a feature called Variables to solve this problem. Under the hood, Variables are a simple key-value store, and the data is kept in Airflow's metadata database (PostgreSQL, for example).
Airflow handles the database connection automatically, so you can use Variables directly in your DAG without worrying about opening connections, releasing connection pool resources, and so on.
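To illustrate the read-and-update pattern from the problem definition, here is a minimal sketch, assuming the Airflow 2.x API (Variable.get with default_var, and Variable.set). The function name run_iteration, the key last_output and the "add one" step are hypothetical placeholders, not part of the original workflow.

```python
def run_iteration(store=None, key="last_output"):
    """Read the previous run's output, compute the next one, and save it."""
    if store is None:
        # Imported lazily so the function can also be called with a stand-in store.
        # Assumes Airflow 2.x, where Variable lives in airflow.models.
        from airflow.models import Variable
        store = Variable

    # default_var is returned when the key has never been set (first run).
    # Variables are stored as strings, so convert before doing arithmetic.
    previous = int(store.get(key, default_var=0))

    current = previous + 1  # placeholder for the real workflow logic

    # Save the result so the next iteration can pick it up.
    store.set(key, str(current))
    return current
```

Called from a PythonOperator, each DAG run would pick up where the previous one stopped. The store parameter exists only so the logic can be tried without a running Airflow instance.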
Example
An example DAG script is the following:
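A minimal sketch of such a DAG, assuming Airflow 2.x import paths. The names set_variable_opt, set_variable, foo and bar come from the description in this section; the DAG id, start date and schedule are hypothetical and only there to make the file loadable.

```python
from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.operators.python import PythonOperator


def set_variable():
    # Store the key "foo" with the value "bar" in Airflow Variables.
    Variable.set("foo", "bar")


with DAG(
    dag_id="variable_example",        # hypothetical DAG id
    start_date=datetime(2023, 1, 1),  # hypothetical start date
    schedule_interval=None,           # assumption: triggered manually
    catchup=False,
) as dag:
    # Task that runs set_variable when the DAG is triggered.
    set_variable_opt = PythonOperator(
        task_id="set_variable_opt",
        python_callable=set_variable,
    )
```

Newer Airflow releases prefer the schedule argument over schedule_interval, so adjust that line to match your installed version.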
The above code defines a PythonOperator called set_variable_opt, which calls the Python function set_variable to create a Variable named foo with the value bar.