In this article, we will learn how to develop an ETL (Extract, Transform, Load) pipeline using Apache Airflow. Here is a list of the things we will do:
- Call an API
- Set up a database
- Set up Airflow
Call an API
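As a rough sketch of where this section is headed, a getWeather.py module could look like the following. The endpoint URL, the JSON response format, the date-based file name, and the `fetch` parameter are assumptions made for illustration; only the names get_weather(), createDirectory(), and data/ come from this article.

```python
import json
import os
from datetime import date
from urllib.request import urlopen

# Hypothetical endpoint: replace with the real weather API URL and key.
API_URL = "https://api.example.com/weather?q=Brooklyn"


def createDirectory(path="data"):
    """Create the data directory if it does not exist yet."""
    os.makedirs(path, exist_ok=True)
    return path


def fetch_json(url):
    """Download a URL and decode the body as JSON (assumes a JSON API)."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def get_weather(url=API_URL, out_dir="data", fetch=fetch_json):
    """Call the API and save today's payload as data/<YYYY-MM-DD>.json."""
    payload = fetch(url)
    createDirectory(out_dir)
    out_path = os.path.join(out_dir, f"{date.today().isoformat()}.json")
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(payload, f)
    return out_path
```

Passing the download function in as `fetch` is a design choice for this sketch only: it lets you try the file-saving logic with a stub before pointing it at a real API.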
We will create a module getWeather.py, and inside it we will create a get_weather() function which will call the API. We will then create a directory data/ where we will save the daily data obtained from the API. We do this in the createDirectory() function.
Setup Database
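A minimal sketch of what createTable.py could contain, assuming SQLite as the database engine. The article does not say which database is used, and the file name, table name, and columns below are hypothetical placeholders.

```python
import sqlite3


def make_database(db_path="weather.db"):
    """Create the database file and an example table if they do not exist."""
    conn = sqlite3.connect(db_path)  # creates the file on first use
    try:
        # Hypothetical schema: adjust the columns to match the API response.
        conn.execute(
            """
            CREATE TABLE IF NOT EXISTS weather_table (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                city TEXT,
                temperature REAL,
                recorded_on TEXT
            )
            """
        )
        conn.commit()
    finally:
        conn.close()
    return db_path
```

Using `CREATE TABLE IF NOT EXISTS` keeps the function safe to call repeatedly, which is convenient once it runs as a scheduled task.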
We will create a module createTable.py, and inside it we will create a make_database() function which will create the database.
Setup Airflow
To use Airflow, you first have to set it up. See the Airflow installation documentation for instructions.
Once Airflow has been set up, we will define our DAG.
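A DAG file for this pipeline might look roughly like the sketch below. It assumes Airflow 2.4 or newer in the 2.x line (where the `schedule` argument and the `airflow.operators.python` import path are available) and that getWeather.py and createTable.py are importable from the DAGs folder. The DAG id, start date, retry settings, daily schedule, and task order are illustrative assumptions, not taken from the original code.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

# Assumes both modules sit where Airflow can import them (e.g. the dags folder).
from createTable import make_database
from getWeather import get_weather

default_args = {
    "owner": "airflow",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="weather_etl",             # hypothetical name
    default_args=default_args,
    start_date=datetime(2024, 1, 1),  # placeholder start date
    schedule="@daily",                # run once per day
    catchup=False,                    # do not backfill past runs
) as dag:
    create_table = PythonOperator(
        task_id="make_database",
        python_callable=make_database,
    )
    fetch_weather = PythonOperator(
        task_id="get_weather",
        python_callable=get_weather,
    )

    # Assumed order: make sure the database exists before fetching data.
    create_table >> fetch_weather
```

Each `PythonOperator` wraps one of the functions from the earlier modules, and the `>>` operator tells Airflow which task must finish before the next one starts.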
Now we can run our DAG from Apache Airflow.
The complete code for this article can be found in this GitHub repository.
Special thanks to Michael Harmon. This article is based on his publication. You can find it here.