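This guide installs Airflow 1.10, which depends on Python 3.6 and a database; MySQL is the database used here.
Installing MySQL
Reference: [https://blog.csdn.net/shangmingtao/article/details/78895101] (includes a summary of many common errors)
Chinese documentation: [https://www.kancloud.cn/luponu/airflow-doc-zh/889657]
[https://www.cnblogs.com/wuotto/p/9682400.html] Changing the MySQL password on Linux
https://www.cnblogs.com/cord/p/9397584.html Troubleshooting summary
# There are two ways to install: 1. Download the package on a local Windows machine from the MySQL website ([https://downloads.mysql.com/archives/community/])
# 2. Download it from the MySQL repository [http://repo.mysql.com/](http://repo.mysql.com/): wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
# This guide uses the second approach: download the MySQL repo package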
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
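# Install the mysql-community-release-el7-5.noarch.rpm package
rpm -ivh mysql-community-release-el7-5.noarch.rpm
# Install MySQL
yum install mysql-server
# Start MySQL
systemctl start mysqld
# Enable start on boot
chkconfig mysqld on
# If you forget the MySQL root password, add skip-grant-tables under [mysqld] in /etc/my.cnf and restart mysqld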
vi /etc/my.cnf
[mysqld]
skip-grant-tables
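# Then connect to MySQL without a password: mysql -u root
# (authentication_string is the MySQL 5.7 column name; older versions such as 5.6 use password)
update mysql.user set authentication_string=password('root_password') where user='root';
flush privileges;
# Afterwards, remove skip-grant-tables from /etc/my.cnf and restart mysqld
# When setting up remote access later, MySQL may report an error asking for the password to be reset with ALTER USER; in that case run
alter user user() identified by "qwer1234";
# Allow remote connections; the password used for remote logins may differ from the local one
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'qwer1234' WITH GRANT OPTION;
flush privileges;
# If mysql is not under /usr/bin, create a symlink
ln -s /usr/local/mysql/bin/mysql /usr/bin/
# Restart the MySQL service
systemctl restart mysqld.service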
systemctl restart mysqld.service
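With MySQL set up, move on to installing Airflow.
# Set the following environment variable; it is only needed during installation, so a temporary (session-only) variable is enough
export SLUGIFY_USES_TEXT_UNIDECODE=yes
# Set AIRFLOW_HOME; it is used later when initializing the database and editing the config file, so do not skip it
export AIRFLOW_HOME=/home/hadoop/airflow
# Upgrade pip. First locate the Python 3 install directory with find / -name python3, then symlink pip3 into a directory on the PATH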
ln -s /usr/local/python-3.6.4/bin/pip3 /usr/bin/pip3
pip3 install --upgrade pip
pip3 install --upgrade setuptools
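# Work around a SQLAlchemy compatibility bug: airflow-1.9.0 is incompatible with sqlalchemy 1.2.7, which breaks user authentication and password setup; later releases may fix this
pip3 install 'sqlalchemy>=1.1.15, <1.2.0'
# Install Airflow; install the Python and MySQL development packages first
yum install python-devel mysql-devel
pip3 install apache-airflow
pip3 install 'apache-airflow[celery,crypto,mysql,password,redis]'
# After installation, locate the airflow executable with find / -name airflow; on my machine it is /usr/local/python-3.6.4/bin/airflow
./airflow # Run from that directory; this generates the config file in AIRFLOW_HOME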
# Create the Airflow metadata database in MySQL
# Create the database and user
create database airflow default charset utf8mb4 collate utf8mb4_general_ci;
create user airflow@'%' identified by 'airflow';
grant all on airflow.* to airflow@'%';
flush privileges;
# Edit the config file (airflow.cfg)
vi airflow.cfg
# Update the database connection
sql_alchemy_conn = mysql://airflow:airflow@localhost:3306/airflow
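The connection string packs the dialect, user, password, host, port and database name into one URL, so a typo in any part tends to surface only later, at initdb time. As a rough sketch, the Python standard library can split the value into its parts for a quick visual check. The helper name describe_conn is hypothetical (it is not part of Airflow), and the URL is the example value from this guide.

```python
from urllib.parse import urlsplit


def describe_conn(url):
    """Split a SQLAlchemy-style URL into its named parts (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "dialect": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # The URL path is "/<database>", so drop the leading slash.
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Example value from this guide; substitute your own connection string.
    info = describe_conn("mysql://airflow:airflow@localhost:3306/airflow")
    for name, value in info.items():
        print(name, "=", value)
```

The user and database shown here should match the ones created with the SQL statements above.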
# Initialize the database
/usr/local/python-3.6.4/bin/airflow initdb
# You can now connect to MySQL remotely and see the tables created in the airflow database
# Insert an admin user
python3
import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser
user = PasswordUser(models.User())
user.username = 'afuser'
user.email = 'afuser@example.com'
user.password = 'afuser'
session = settings.Session()
session.add(user)
session.commit()
session.close()
exit()
# Enable web authentication (airflow.cfg)
[webserver]
authenticate = true
auth_backend = airflow.contrib.auth.backends.password_auth
# Change the time zone (default_timezone belongs to the [core] section)
default_timezone = Asia/Shanghai
# Update the mail settings
[smtp]
smtp_host = smtp.163.com
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
smtp_user = mailExample@163.com
smtp_password = password
# Set the executor (executor belongs to the [core] section)
executor = LocalExecutor
# Set the webserver address
[webserver]
base_url = http://host:port
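Because these settings are spread across several sections of airflow.cfg, it is easy to put one under the wrong header. The following is only a sketch of how the edited values could be checked with Python's configparser. SAMPLE_CFG is a trimmed illustration rather than a full airflow.cfg, check_settings is a hypothetical helper, and the section placement assumes the Airflow 1.10 layout, where executor and default_timezone live under [core].

```python
import configparser

# Trimmed illustration of the edited settings, not a complete airflow.cfg.
SAMPLE_CFG = """
[core]
executor = LocalExecutor
default_timezone = Asia/Shanghai
sql_alchemy_conn = mysql://airflow:airflow@localhost:3306/airflow

[webserver]
authenticate = true
auth_backend = airflow.contrib.auth.backends.password_auth
"""

# Values this guide expects after editing.
EXPECTED = {
    ("core", "executor"): "LocalExecutor",
    ("core", "default_timezone"): "Asia/Shanghai",
    ("webserver", "authenticate"): "true",
    ("webserver", "auth_backend"): "airflow.contrib.auth.backends.password_auth",
}


def check_settings(cfg_text, expected):
    """Return (section, key, actual) for every value that is missing or different."""
    # interpolation=None keeps literal % characters (e.g. in log formats) intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    mismatches = []
    for (section, key), wanted in expected.items():
        actual = parser.get(section, key, fallback=None)
        if actual != wanted:
            mismatches.append((section, key, actual))
    return mismatches


if __name__ == "__main__":
    print(check_settings(SAMPLE_CFG, EXPECTED))
```

To check a real file, read its text (for example from $AIRFLOW_HOME/airflow.cfg) and pass it in place of SAMPLE_CFG; an empty list means every expected value was found.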
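# Change the time zone displayed in the web UI
vi /usr/local/python-3.6.4/lib/python3.6/site-packages/airflow/www/templates/admin/master.html
# The required change is shown in the screenshot below
# Optionally make the environment variables permanent: vi /etc/profile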
export AIRFLOW=/usr/local/python-3.6.4/bin
export AIRFLOW_HOME=/home/hadoop/airflow
export PATH=$PATH:$AIRFLOW:$AIRFLOW_HOME
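After editing /etc/profile and opening a new shell (or running source /etc/profile), it can help to confirm that the variables actually took effect before starting any services. A minimal sketch using only the standard library; check_environment is a hypothetical helper, not an Airflow command.

```python
import os
import shutil


def check_environment(env=None):
    """Report whether AIRFLOW_HOME is set and the airflow executable is on PATH."""
    env = os.environ if env is None else env
    home = env.get("AIRFLOW_HOME")
    return {
        "AIRFLOW_HOME": home,
        # True only if the variable is set and points at an existing directory.
        "home_exists": bool(home) and os.path.isdir(home),
        # shutil.which searches the given PATH string for an executable.
        "airflow_on_path": shutil.which("airflow", path=env.get("PATH")) is not None,
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(name, "=", value)
```

If airflow_on_path comes back False, the directory holding the airflow executable (/usr/local/python-3.6.4/bin in this guide) is probably missing from PATH.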
# Start the scheduler
nohup airflow scheduler >scheduler.log 2>&1 &
# Start the webserver
nohup airflow webserver -p 8089 &
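Since both processes run in the background under nohup, a failed start is easy to miss. One rough way to confirm the webserver came up is to test whether its port accepts TCP connections. The sketch below assumes the port 8089 used above; port_is_open is a hypothetical helper, and it only shows that something is listening, not that the process is Airflow.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    # 8089 is the port passed to "airflow webserver -p" in this guide.
    print(port_is_open("localhost", 8089))
```

The webserver can take a few seconds to bind the port, so a False right after launch may just mean it is still starting; nohup.out in the directory where the command was run holds its output.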
image.png
Troubleshooting
Granting superuser rights (when the Admin menu in the web UI shows only Variables): change the value from 0 to 1
image.png
Setting up a connection in Airflow:
image.png
If an error involving pandas core appears, upgrading pandas is enough: pip3 install --upgrade pandas