In an OpenStack cluster, the compute service on one compute node was found to be down.
On the compute node, run:
service nova-compute restart
The service fails to start. Check the log:
/var/log/nova/nova-compute.log
It contains the following error:
InstanceNotFound: Instance 00f71bfc-b8c5-474c-94c8-02686c1af24b could not be found
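This indicates a zombie instance, possibly caused by improperly stopping the nova service or shutting down the virtual machine while it was in use. The fix is to delete the instance.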
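When the log is long, it can help to pull out the affected UUIDs rather than copying them by eye. The following is a minimal sketch, assuming a grep that supports -o and -E; the function name list_missing_instances is hypothetical and not part of nova.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct instance UUID that appears on an
# "InstanceNotFound" line of the log file given as $1.
list_missing_instances() {
    grep 'InstanceNotFound' "$1" \
        | grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' \
        | sort -u
}
```

It could be called as list_missing_instances /var/log/nova/nova-compute.log; lines that do not mention InstanceNotFound are ignored.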
First, on the affected compute node, go to the following directory:
/var/lib/nova/instances
Delete the instance's files:
sudo rm -rf 00f71bfc-b8c5-474c-94c8-02686c1af24b
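Because rm -rf is unforgiving, a small guard around it may be worthwhile: an empty or mistyped UUID would otherwise point the command at the wrong path. This is only a sketch; the function names is_uuid and remove_instance_dir are hypothetical, and the default root assumes the instances directory shown above.

```shell
#!/bin/sh
# Return 0 only if $1 looks like a lowercase UUID.
is_uuid() {
    printf '%s\n' "$1" \
        | grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Remove one instance directory, refusing anything that is not a UUID.
# $1 = instance UUID, $2 = instances root (assumed default shown below)
remove_instance_dir() {
    uuid=$1
    root=${2:-/var/lib/nova/instances}
    is_uuid "$uuid" || { echo "not a UUID: $uuid" >&2; return 1; }
    [ -d "$root/$uuid" ] || { echo "no such directory: $root/$uuid" >&2; return 1; }
    rm -rf -- "$root/$uuid"
}
```

Run as root (or through sudo sh) it would be invoked as remove_instance_dir 00f71bfc-b8c5-474c-94c8-02686c1af24b.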
Then delete the instance from the database on the controller node. In the statements below,
$1 stands for 00f71bfc-b8c5-474c-94c8-02686c1af24b:
sudo mysql -u root -p
use nova;
delete from security_group_instance_association where instance_uuid='$1';
delete from instance_info_caches where instance_uuid='$1';
delete from block_device_mapping where instance_uuid='$1';
delete from instance_actions where instance_uuid='$1'; -- note: dependent records in another table must be deleted by hand first, as described below
delete from instance_faults where instance_uuid='$1';
delete from instance_extra where instance_uuid='$1';
delete from instance_system_metadata where instance_uuid='$1';
delete from instances where uuid='$1'; -- note: in the Pike release I use, the column in the instances table is uuid, not instance_id
The following statement:
delete from instance_actions where instance_uuid='$1';
fails with:
Cannot delete or update a parent row: a foreign key constraint fails('nova'.'instance_actions_events', CONSTRAINT 'instance_actions_events_ibfk_1' FOREIGN KEY ('action_id') REFERENCES 'instance_actions'('id'))
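The action_id column of instance_actions_events is a foreign key that references the primary key id of instance_actions. Deleting a row from instance_actions whose id is still referenced by a row in instance_actions_events therefore fails.
So delete the data in instance_actions_events first.
First, use the uuid to find the records in instance_actions: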
select id from instance_actions where instance_uuid='$1';
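Note the returned id, then delete the matching records from instance_actions_events (repeat for each id if several are returned), assuming id=$2:
delete from instance_actions_events where action_id='$2';
Finally run: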
delete from instance_actions where instance_uuid='$1';
The delete succeeds.
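The look-up-then-delete steps for the action events can also be folded into one statement with a subquery, which avoids copying ids by hand when an instance has many actions. Below is a minimal sketch that only prints the SQL for review; the function name emit_action_cleanup_sql is hypothetical, and the table and column names are the ones used in this post.

```shell
#!/bin/sh
# Print SQL that removes the action events first, then the actions themselves.
# $1 = instance UUID
emit_action_cleanup_sql() {
    cat <<EOF
DELETE FROM instance_actions_events
 WHERE action_id IN (SELECT id FROM instance_actions WHERE instance_uuid='$1');
DELETE FROM instance_actions WHERE instance_uuid='$1';
EOF
}
```

The output of emit_action_cleanup_sql 00f71bfc-b8c5-474c-94c8-02686c1af24b can be pasted into the mysql prompt after use nova;. The child table is handled first, so the foreign key constraint is no longer violated.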
When running the following statement:
delete from instances where uuid='$1';
a similar problem occurs:
Cannot delete or update a parent row: a foreign key constraint fails ('nova'.'migrations', CONSTRAINT 'fk_migrations_instance_uuid' FOREIGN KEY ('instance_uuid') REFERENCES 'instances' ('uuid'))
Apparently this zombie instance was produced by a failed VM migration, so delete its records from the migrations table:
delete from migrations where instance_uuid='$1';
Run again:
delete from instances where uuid='$1';
A similar message appears:
Cannot delete or update a parent row: a foreign key constraint fails ('nova'.'virtual_interfaces', CONSTRAINT 'virtual_interfaces_instance_uuid' FOREIGN KEY ('instance_uuid') REFERENCES 'instances' ('uuid'))
Run:
delete from virtual_interfaces where instance_uuid='$1';
Finally run:
delete from instances where uuid='$1';
The record is deleted successfully!
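Rather than discovering each dependent table through a failed delete, the foreign keys that reference instances can be listed up front from MySQL's information_schema. The sketch below only prints the query; the function name emit_fk_children_sql is hypothetical, and the schema name nova is taken from the use nova; step above.

```shell
#!/bin/sh
# Print a query listing every table and column with a foreign key that
# references the given table.
# $1 = referenced table, $2 = schema (assumed to default to nova)
emit_fk_children_sql() {
    table=$1
    schema=${2:-nova}
    cat <<EOF
SELECT TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME
  FROM information_schema.KEY_COLUMN_USAGE
 WHERE REFERENCED_TABLE_SCHEMA = '$schema'
   AND REFERENCED_TABLE_NAME = '$table';
EOF
}
```

Running the output of emit_fk_children_sql instances in the mysql prompt should list tables such as migrations and virtual_interfaces, whose rows have to be removed before the row in instances. Only declared foreign keys are reported, so tables that store instance_uuid without a constraint will not appear.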
On the affected compute node, run:
service nova-compute restart
service nova-compute status
The service starts successfully and the error is resolved.