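I started out with Red Hat's trial version of ansible-tower, which can be installed directly from Red Hat's repositories.
Later, to stay properly licensed and to simplify deployment, I deployed ansible-awx on the community edition of OpenShift.
After that, the developers had other requirements, and the data center they work in has no k8s cluster, so I found this community repository. These are my notes.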
https://copr.fedorainfracloud.org/coprs/mrmeee/awx/
Installation steps
yum -y install epel-release
yum -y install centos-release-scl centos-release-scl-rh
yum install -y wget
wget -O /etc/yum.repos.d/ansible-awx.repo https://copr.fedorainfracloud.org/coprs/mrmeee/ansible-awx/repo/epel-7/mrmeee-ansible-awx-epel-7.repo
echo "[bintraybintray-rabbitmq-rpm]
name=bintray-rabbitmq-rpm
baseurl=https://dl.bintray.com/rabbitmq/rpm/rabbitmq-server/v3.7.x/el/7/
gpgcheck=0
repo_gpgcheck=0
enabled=1
[bintraybintray-rabbitmq-erlang-rpm]
name=bintray-rabbitmq-erlang-rpm
baseurl=https://dl.bintray.com/rabbitmq-erlang/rpm/erlang/21/el/7/
gpgcheck=0
repo_gpgcheck=0
enabled=1" > /etc/yum.repos.d/rabbitmq-erlang.repo
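The same repo file can also be written with a here-document, which avoids the multi-line quoted echo. This is a minimal sketch: the helper name write_rabbitmq_repo and its optional target-path argument are assumptions for illustration, while the section names and URLs are copied unchanged from the echo above.

```shell
# write_rabbitmq_repo is a hypothetical helper, not part of yum or AWX.
# Usage: write_rabbitmq_repo [target-file]
write_rabbitmq_repo() {
    target="${1:-/etc/yum.repos.d/rabbitmq-erlang.repo}"
    # The quoted 'EOF' delimiter keeps the body literal (no shell expansion).
    cat > "$target" <<'EOF'
[bintraybintray-rabbitmq-rpm]
name=bintray-rabbitmq-rpm
baseurl=https://dl.bintray.com/rabbitmq/rpm/rabbitmq-server/v3.7.x/el/7/
gpgcheck=0
repo_gpgcheck=0
enabled=1

[bintraybintray-rabbitmq-erlang-rpm]
name=bintray-rabbitmq-erlang-rpm
baseurl=https://dl.bintray.com/rabbitmq-erlang/rpm/erlang/21/el/7/
gpgcheck=0
repo_gpgcheck=0
enabled=1
EOF
}

# Usage on the target host (not run here):
# write_rabbitmq_repo
```

Passing a path such as /tmp/test.repo makes it possible to inspect the result before touching /etc/yum.repos.d.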
yum -y install rabbitmq-server
yum install -y rh-postgresql10 memcached
yum -y install rh-python36
yum -y install --disablerepo='*' --enablerepo='mrmeee-ansible-awx,base' -x '*-debuginfo' 'rh-python36*'
scl enable rh-postgresql10 "postgresql-setup initdb"
systemctl enable rabbitmq-server
systemctl start rabbitmq-server
- Start services: PostgreSQL database
systemctl start rh-postgresql10-postgresql.service
systemctl enable rh-postgresql10-postgresql.service
- Start services: Memcached
systemctl enable memcached
systemctl start memcached
- Create Postgres user and DB:
scl enable rh-postgresql10 "su postgres -c \"createuser -S awx\""
scl enable rh-postgresql10 "su postgres -c \"createdb -O awx awx\""
Configure AWX
- Import Database data:
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage migrate"
- Initial configuration of AWX
echo "from django.contrib.auth.models import User; User.objects.create_superuser('admin', 'root@localhost', 'password')" | sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage shell"
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage create_preload_data"
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage provision_instance --hostname=$(hostname)"
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage register_queue --queuename=tower --hostnames=$(hostname)"
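Because the commands above are wrapped in double quotes, $(hostname) is expanded by the calling shell before sudo runs, so the instance is registered under whatever name hostname reports at that moment. One way to check the result is to filter the output of awx-manage list_instances. This is a minimal sketch: instance_listed is a hypothetical helper that only looks for the hostname as a whitespace-delimited token and assumes nothing else about the listing format.

```shell
# instance_listed is a hypothetical helper: it reads an instance listing on
# stdin and succeeds only if HOSTNAME appears as a whitespace-delimited token.
# Usage: some-command | instance_listed HOSTNAME
instance_listed() {
    awk -v h="$1" '
        { for (i = 1; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
    '
}

# Usage on the AWX host (not run here):
# sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage list_instances" \
#     | instance_listed "$(hostname)" && echo "instance registered"
```

Comparing whole tokens rather than substrings keeps a short name such as awx from matching a longer one such as awx-office.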
Install and Configure Web Server Proxy
- Install NGINX as proxy:
yum -y install nginx
wget -O /etc/nginx/nginx.conf https://raw.githubusercontent.com/MrMEEE/awx-build/master/nginx.conf
systemctl enable nginx
systemctl start nginx
Start and Enable AWX
- Start Services
systemctl start awx-cbreceiver
systemctl start awx-dispatcher
systemctl start awx-channels-worker
systemctl start awx-daphne
systemctl start awx-web
- Enable Services
systemctl enable awx-cbreceiver
systemctl enable awx-dispatcher
systemctl enable awx-channels-worker
systemctl enable awx-daphne
systemctl enable awx-web
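The start and enable lists above repeat the same five unit names, so both can be driven from one loop. This is a minimal sketch: the helper names awx_services and awx_services_do and the SYSTEMCTL override variable are assumptions for illustration, and only the unit names come from the lists above.

```shell
# Unit names taken from the lists above, in the same order.
awx_services() {
    printf '%s\n' awx-cbreceiver awx-dispatcher awx-channels-worker awx-daphne awx-web
}

# awx_services_do ACTION: run "systemctl ACTION UNIT" for each unit in turn
# and stop at the first failure.
# Set SYSTEMCTL=echo to print the commands instead of running them.
awx_services_do() {
    action="$1"
    for unit in $(awx_services); do
        "${SYSTEMCTL:-systemctl}" "$action" "$unit" || return 1
    done
}

# Usage on the AWX host (not run here):
# awx_services_do start
# awx_services_do enable
```

Running SYSTEMCTL=echo awx_services_do start first gives a dry run of the exact commands. The same helper also covers the restart sequence.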
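Note: the ansible-awx package provides awx-manage and the awx-* services used above, so install it before the "Configure AWX" step (for example, right after the rh-python36 packages):
yum install -y ansible-awx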
Troubleshooting
All jobs were stuck in pending, and the log was flooded with tracebacks like this:
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
Jul 25 05:20:09 awx-office scl: utility.execute()
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
Jul 25 05:20:09 awx-office scl: self.fetch_command(subcommand).run_from_argv(self.argv)
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
Jul 25 05:20:09 awx-office scl: self.execute(*args, **cmd_options)
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
Jul 25 05:20:09 awx-office scl: output = self.handle(*args, **options)
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/awx/main/management/commands/run_dispatcher.py", line 123, in handle
Jul 25 05:20:09 awx-office scl: reaper.reap()
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/awx/main/dispatch/reaper.py", line 36, in reap
Jul 25 05:20:09 awx-office scl: me = instance or Instance.objects.me()
Jul 25 05:20:09 awx-office scl: File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/awx/main/managers.py", line 116, in me
Jul 25 05:20:09 awx-office scl: raise RuntimeError("No instance found with the current cluster host id")
Jul 25 05:20:09 awx-office scl: RuntimeError: No instance found with the current cluster host id
Jul 25 05:20:09 awx-office systemd: awx-dispatcher.service: main process exited, code=exited, status=1/FAILURE
Jul 25 05:20:09 awx-office systemd: Unit awx-dispatcher.service entered failed state.
Jul 25 05:20:09 awx-office systemd: awx-dispatcher.service failed.
Jul 25 05:20:11 awx-office systemd: awx-dispatcher.service holdoff time over, scheduling restart.
Jul 25 05:20:11 awx-office systemd: Stopped AWX Dispatcher.
Jul 25 05:20:11 awx-office systemd: Started AWX Dispatcher.
After digging through the documentation for a while, I found that it is a known bug:
This is a bug in Ansible Tower 3.1.x and 3.2 backup and restore that will be addressed in Ansible Tower 3.2.2.
The current resolution is to rerun the setup.sh after you do the restore. That will reconfigure the rabbitmq.py in the current installation.
Root cause is while doing a backup on a Tower instance, it is not excluding rabbitmq.py and hence while doing a restore on a different Ansible Tower instance it restores the original rabbitmq.py, which breaks the rabbitmq clustering.
The following commands fixed it:
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage create_preload_data"
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage provision_instance --hostname=$(hostname)"
sudo -u awx scl enable rh-python36 rh-postgresql10 "awx-manage register_queue --queuename=tower --hostnames=$(hostname)"
systemctl restart awx-cbreceiver
systemctl restart awx-dispatcher
systemctl restart awx-channels-worker
systemctl restart awx-daphne
systemctl restart awx-web
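After the restart, it is worth confirming that the dispatcher has stopped raising the error. This is a minimal sketch: count_instance_errors is a hypothetical helper that counts the final RuntimeError lines in whatever log text it receives on stdin. The log path in the usage line is an assumption based on the syslog-style lines quoted above.

```shell
# count_instance_errors is a hypothetical helper: it reads log text on stdin
# and prints how many lines carry the final exception message.
count_instance_errors() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function's exit status clean while still printing the count.
    grep -cF 'RuntimeError: No instance found with the current cluster host id' || true
}

# Usage on the AWX host (not run here; the log path is an assumption):
# tail -n 200 /var/log/messages | count_instance_errors
```

A count that stays at 0 for freshly written log lines suggests the dispatcher is no longer crash-looping on this error.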