To Be Continued...
Brief Intro to Airflow
Airflow is a platform to programmatically author, schedule, and monitor workflows, similar to Oozie, which has so far been the better-known tool in the workflow space. Airflow is an incubating project and still very new, but its considerable advantages have already drawn attention.
Airflow Advantages:
- Airflow is written in Python, which makes it easy to maintain and to extend with custom development.
- Airflow has a nice UI for controlling, displaying, and monitoring workflows.
- Airflow has been running in the backend of the Electron Project, a big-data log analyser application at Youzu, where it has proved to be stable and smooth.
Airflow needs a home directory. ~/airflow is the default,
but you can put it somewhere else if you prefer:
export AIRFLOW_HOME=~/airflow
Install from PyPI using pip (the mysql extra adds MySQL support):
pip install airflow
pip install airflow[mysql]
Initialize the database:
airflow initdb
Start the web server; the default port is 8080:
airflow webserver -p 8080
Install Celery if you intend to use the Celery executor:
pip install airflow[celery]
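After the pip steps above, a quick way to see whether the packages landed in the current Python environment is to ask the import system for them. This is only a minimal sketch: the module names are assumptions (in particular, that the mysql extra provides the MySQLdb module), and check_modules is a hypothetical helper, not part of Airflow.

```python
import importlib.util

# Assumed top-level module names; MySQLdb is assumed to come from the mysql extra.
REQUIRED_MODULES = ("airflow", "celery", "MySQLdb")


def check_modules(names=REQUIRED_MODULES):
    """Map each top-level module name to True if the import system can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Print a one-line status per module without actually importing it.
for name, found in check_modules().items():
    print("{:10s} {}".format(name, "ok" if found else "missing"))
```

A "missing" entry usually means the corresponding pip install step was run against a different Python environment.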
Airflow Case
Alter two lines in airflow.cfg. First, switch to the Celery executor:
executor = CeleryExecutor
Second, store the metadata in MySQL:
sql_alchemy_conn = mysql://username:password@ipaddress/dbname?charset=utf8
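The two settings can also be changed from a script. The sketch below assumes both keys live in the [core] section of airflow.cfg and works on a small sample string rather than a real file; mysql_conn_url and use_celery_and_mysql are hypothetical helper names, and the credentials are placeholders. Escaping the password matters because characters such as @ or / would otherwise break the URL.

```python
import configparser
import io
from urllib.parse import quote_plus


def mysql_conn_url(user, password, host, dbname):
    """Build a mysql:// URL, percent-escaping special characters in the credentials."""
    return "mysql://{}:{}@{}/{}?charset=utf8".format(
        quote_plus(user), quote_plus(password), host, dbname
    )


def use_celery_and_mysql(cfg_text, conn_url):
    """Return cfg_text with the executor and sql_alchemy_conn settings replaced."""
    # interpolation=None leaves '%' and '{AIRFLOW_HOME}' placeholders untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parser.set("core", "executor", "CeleryExecutor")
    parser.set("core", "sql_alchemy_conn", conn_url)
    out = io.StringIO()
    parser.write(out)  # note: configparser drops comments when writing
    return out.getvalue()


# Hypothetical sample config and placeholder credentials, for illustration only.
sample_cfg = """[core]
dags_folder = {AIRFLOW_HOME}/dags
executor = SequentialExecutor
sql_alchemy_conn = sqlite:///{AIRFLOW_HOME}/airflow.db
"""

url = mysql_conn_url("username", "p@ss/word", "127.0.0.1", "airflow")
print(use_celery_and_mysql(sample_cfg, url))
```

Because configparser discards comments on write, editing the real airflow.cfg by hand may be preferable if you want to keep its inline documentation.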
Start the Airflow web server and the Airflow Celery worker:
airflow webserver
airflow worker
Write a DAG file in the dags_folder, whose location can be changed in the airflow.cfg settings file.
E.g. $AIRFLOW_HOME/dags/ is the default folder for DAG files.
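A DAG file is an ordinary Python module that defines a DAG object and its tasks. The sketch below writes a minimal starter DAG into the DAG folder; the DAG source is kept as a string so the script itself runs without Airflow installed. Everything in it is illustrative: the file name example.py, the task names, the start date, and the write_example_dag helper are assumptions, and the import path for BashOperator is the one used by the Airflow 1.x releases that ship the commands shown here.

```python
import os

# Hypothetical starter DAG; its dag_id "example" is what trigger_dag refers to.
DAG_SOURCE = '''\
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "airflow",
    "start_date": datetime(2016, 1, 1),  # placeholder date
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# schedule_interval=None: the DAG only runs when triggered manually.
dag = DAG("example", default_args=default_args, schedule_interval=None)

print_date = BashOperator(task_id="print_date", bash_command="date", dag=dag)
sleep = BashOperator(task_id="sleep", bash_command="sleep 5", dag=dag)

# Run "sleep" only after "print_date" has finished.
sleep.set_upstream(print_date)
'''


def write_example_dag(airflow_home=None):
    """Write the starter DAG to <airflow_home>/dags/example.py and return its path."""
    home = airflow_home or os.environ.get(
        "AIRFLOW_HOME", os.path.expanduser("~/airflow")
    )
    dags_dir = os.path.join(home, "dags")
    os.makedirs(dags_dir, exist_ok=True)
    path = os.path.join(dags_dir, "example.py")
    with open(path, "w") as f:
        f.write(DAG_SOURCE)
    return path
```

Calling write_example_dag() with no argument targets $AIRFLOW_HOME/dags/ (or ~/airflow/dags/ when the variable is unset); pass a directory explicitly to try it somewhere else first.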
Submit the DAG to Airflow to generate its tasks:
airflow trigger_dag example  # "example" is the dag_id of a DAG defined in a file under $AIRFLOW_HOME/dags/