Kolla-ansible Source Code Analysis
Introduction
The Kolla-ansible project provides a complete set of Ansible playbooks that deploy Docker images and then automate the deployment of the OpenStack components, supporting both all-in-one and multi-host environments.
Source code: https://github.com/openstack/kolla-ansible.git
Source Tree Overview
Top-level directories
- ansible: the Ansible playbooks, covering both Docker container deployment and the OpenStack components. Most of the source code lives here.
- contrib: deployment environments for Heat, Magnum, and Vagrant.
- deploy-guide: deployment guides for the all-in-one and multi-host setups.
- doc: documentation.
- etc: configuration files, installed under /etc; an all-in-one deployment needs only minimal changes to them.
- kolla-ansible: version information; the cmd subdirectory holds two scripts for generating and merging passwords, which pbr packages as executable commands.
- releasenotes: release notes describing new features.
- specs: specifications tracking key changes in the Kolla community codebase.
- tests: functional test tooling, including tests for the custom ansible plugin (merge_config) and module (kolla_docker).
- tools: scripts that interact with kolla, mostly invoked manually to handle pre- and post-install operations; some are called by tasks under the ansible directory.
Second-level directories
- ansible/action_plugins: two custom ansible plugins that merge yml and config files.
- ansible/group_vars: global variable definitions for the playbooks.
- ansible/inventory: sample hosts inventories for all-in-one and multinode.
- ansible/library: custom ansible modules; bslurp.py and kolla_docker.py are the most heavily used.
- ansible/roles: all the OpenStack components, covering nearly every open-source project; the current version ships 60 roles.
- ansible (top level): the playbooks themselves, mainly for pre/post-install environment preparation and cleanup, database recovery, and other system-level operations.
Key Code Walkthrough
setup.cfg, the packaging configuration entry point (see the inline comments):
[metadata]
name = kolla-ansible # project name
summary = Ansible Deployment of Kolla containers
description-file = README.rst
author = OpenStack
author-email = openstack-dev@lists.openstack.org
home-page = http://docs.openstack.org/developer/kolla-ansible/
license = Apache License, Version 2.0
classifier =
Environment :: OpenStack
Intended Audience :: Information Technology
Intended Audience :: System Administrators
License :: OSI Approved :: Apache Software License
Operating System :: POSIX :: Linux
Programming Language :: Python
Programming Language :: Python :: 2
Programming Language :: Python :: 2.7
Programming Language :: Python :: 3
Programming Language :: Python :: 3.5
[files]
packages = kolla_ansible # package name
data_files = # file mappings used by pbr packaging
share/kolla-ansible/ansible = ansible/*
share/kolla-ansible/tools = tools/validate-docker-execute.sh
share/kolla-ansible/tools = tools/cleanup-containers
share/kolla-ansible/tools = tools/cleanup-host
share/kolla-ansible/tools = tools/cleanup-images
share/kolla-ansible/tools = tools/stop-containers
share/kolla-ansible/doc = doc/*
share/kolla-ansible/etc_examples = etc/*
share/kolla-ansible = tools/init-runonce
share/kolla-ansible = tools/init-vpn
share/kolla-ansible = tools/openrc-example
share/kolla-ansible = setup.cfg
scripts = # executable scripts
tools/kolla-ansible
[entry_points]
console_scripts = # console entry points; each runs a Python file's main()
kolla-genpwd = kolla_ansible.cmd.genpwd:main
kolla-mergepwd = kolla_ansible.cmd.mergepwd:main
[global]
setup-hooks =
pbr.hooks.setup_hook
[pbr] # packaging method
[build_sphinx]
all_files = 1
build-dir = doc/build
source-dir = doc
[build_releasenotes]
all_files = 1
build-dir = releasenotes/build
source-dir = releasenotes/source
setup.py
The install script. Packaging goes through pbr, which reads setup.cfg and also installs the dependencies listed in requirements.txt in the same directory. See https://julien.danjou.info/blog/2017/packaging-python-with-pbr for more.
import setuptools
# In python < 2.7.4, a lazy loading of package `pbr` will break
# setuptools if some other modules registered functions in `atexit`.
# solution from: http://bugs.python.org/issue15881#msg170215
try:
import multiprocessing # noqa
except ImportError:
pass
setuptools.setup(
setup_requires=['pbr>=2.0.0'],
pbr=True)
tools/kolla-ansible
This script wraps ansible-playbook and tailors it to kolla: depending on the action, it passes different playbooks and configuration options.
Base variable definitions:
find_base_dir
INVENTORY="${BASEDIR}/ansible/inventory/all-in-one"
PLAYBOOK="${BASEDIR}/ansible/site.yml"
VERBOSITY=
EXTRA_OPTS=${EXTRA_OPTS}
CONFIG_DIR="/etc/kolla"
PASSWORDS_FILE="${CONFIG_DIR}/passwords.yml"
DANGER_CONFIRM=
INCLUDE_IMAGES=
find_base_dir is a function invoked at the start of the script (not expanded here) that locates the directory containing the kolla-ansible script.
Argument parsing:
while [ "$#" -gt 0 ]; do
case "$1" in
(--inventory|-i)
INVENTORY="$2"
shift 2
;;
(--playbook|-p)
PLAYBOOK="$2"
shift 2
;;
(--tags|-t)
EXTRA_OPTS="$EXTRA_OPTS --tags $2"
shift 2
;;
(--verbose|-v)
VERBOSITY="$VERBOSITY --verbose"
shift 1
;;
(--configdir)
CONFIG_DIR="$2"
shift 2
;;
(--yes-i-really-really-mean-it)
DANGER_CONFIRM="$1"
shift 1
;;
(--include-images)
INCLUDE_IMAGES="$1"
shift 1
;;
(--key|-k)
VAULT_PASS_FILE="$2"
EXTRA_OPTS="$EXTRA_OPTS --vault-password-file=$VAULT_PASS_FILE"
shift 2
;;
(--extra|-e)
EXTRA_OPTS="$EXTRA_OPTS -e $2"
shift 2
;;
(--passwords)
PASSWORDS_FILE="$2"
shift 2
;;
(--help|-h)
usage
shift
exit 0
;;
(--)
shift
break
;;
(*)
echo "error"
exit 3
;;
esac
done
case "$1" in
(prechecks)
ACTION="Pre-deployment checking"
EXTRA_OPTS="$EXTRA_OPTS -e action=precheck"
;;
(check)
ACTION="Post-deployment checking"
EXTRA_OPTS="$EXTRA_OPTS -e action=check"
;;
(mariadb_recovery)
ACTION="Attempting to restart mariadb cluster"
EXTRA_OPTS="$EXTRA_OPTS -e action=deploy -e common_run=true"
PLAYBOOK="${BASEDIR}/ansible/mariadb_recovery.yml"
;;
(destroy)
ACTION="Destroy Kolla containers, volumes and host configuration"
PLAYBOOK="${BASEDIR}/ansible/destroy.yml"
if [[ "${INCLUDE_IMAGES}" == "--include-images" ]]; then
EXTRA_OPTS="$EXTRA_OPTS -e destroy_include_images=yes"
fi
if [[ "${DANGER_CONFIRM}" != "--yes-i-really-really-mean-it" ]]; then
cat << EOF
WARNING:
This will PERMANENTLY DESTROY all deployed kolla containers, volumes and host configuration.
There is no way to recover from this action. To confirm, please add the following option:
--yes-i-really-really-mean-it
EOF
exit 1
fi
;;
(bootstrap-servers)
ACTION="Bootstraping servers"
PLAYBOOK="${BASEDIR}/ansible/kolla-host.yml"
EXTRA_OPTS="$EXTRA_OPTS -e action=bootstrap-servers"
;;
(deploy)
ACTION="Deploying Playbooks"
EXTRA_OPTS="$EXTRA_OPTS -e action=deploy"
;;
(deploy-bifrost)
ACTION="Deploying Bifrost"
PLAYBOOK="${BASEDIR}/ansible/bifrost.yml"
EXTRA_OPTS="$EXTRA_OPTS -e action=deploy"
;;
(deploy-servers)
ACTION="Deploying servers with bifrost"
PLAYBOOK="${BASEDIR}/ansible/bifrost.yml"
EXTRA_OPTS="$EXTRA_OPTS -e action=deploy-servers"
;;
(post-deploy)
ACTION="Post-Deploying Playbooks"
PLAYBOOK="${BASEDIR}/ansible/post-deploy.yml"
;;
(pull)
ACTION="Pulling Docker images"
EXTRA_OPTS="$EXTRA_OPTS -e action=pull"
;;
(upgrade)
ACTION="Upgrading OpenStack Environment"
EXTRA_OPTS="$EXTRA_OPTS -e action=upgrade -e serial=${ANSIBLE_SERIAL}"
;;
(reconfigure)
ACTION="Reconfigure OpenStack service"
EXTRA_OPTS="$EXTRA_OPTS -e action=reconfigure -e serial=${ANSIBLE_SERIAL}"
;;
(stop)
ACTION="Stop Kolla containers"
PLAYBOOK="${BASEDIR}/ansible/stop.yml"
;;
(certificates)
ACTION="Generate TLS Certificates"
PLAYBOOK="${BASEDIR}/ansible/certificates.yml"
;;
(genconfig)
ACTION="Generate configuration files for enabled OpenStack services"
EXTRA_OPTS="$EXTRA_OPTS -e action=config"
;;
(*) usage
exit 0
;;
esac
This case statement maps the first positional argument, the action (deploy, post-deploy, stop, and so on), to the appropriate playbook and extra options.
The last three lines assemble the command and execute it:
CONFIG_OPTS="-e @${CONFIG_DIR}/globals.yml -e @${PASSWORDS_FILE} -e CONFIG_DIR=${CONFIG_DIR}"
CMD="ansible-playbook -i $INVENTORY $CONFIG_OPTS $EXTRA_OPTS $PLAYBOOK $VERBOSITY"
process_cmd
Once the arguments have been assembled into the ansible-playbook command CMD, the process_cmd function executes it.
- Example 1: original command
kolla-ansible deploy -i /home/all-in-one
Wrapped command:
ansible-playbook -i /home/all-in-one -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla -e action=deploy /usr/share/kolla-ansible/ansible/site.yml
- Example 2: original command
kolla-ansible post-deploy
Wrapped command:
ansible-playbook -i /usr/share/kolla-ansible/ansible/inventory/all-in-one -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla /usr/share/kolla-ansible/ansible/post-deploy.yml
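The wrapping above can be mirrored in a short Python sketch (for illustration only; the real logic lives in the tools/kolla-ansible shell script, and build_cmd is a made-up name):

```python
# A sketch of how the wrapper composes the final ansible-playbook command.
def build_cmd(inventory, playbook, config_dir="/etc/kolla",
              passwords_file=None, extra_opts=""):
    passwords_file = passwords_file or config_dir + "/passwords.yml"
    # CONFIG_OPTS in the script: globals.yml, the passwords file, CONFIG_DIR
    config_opts = ("-e @{cfg}/globals.yml -e @{pwd} -e CONFIG_DIR={cfg}"
                   .format(cfg=config_dir, pwd=passwords_file))
    parts = ["ansible-playbook", "-i", inventory, config_opts,
             extra_opts, playbook]
    return " ".join(p for p in parts if p)

print(build_cmd("/home/all-in-one",
                "/usr/share/kolla-ansible/ansible/site.yml",
                extra_opts="-e action=deploy"))
```

With the example-1 inputs this reproduces the wrapped command shown above.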
Ansible Playbook Walkthrough
OpenStack has many components, most of them parallel in structure, so rather than covering each one, neutron serves as the running example; other important points are woven in along the way.
ansible/library/
This directory holds custom modules, which run on the target nodes: bslurp.py, kolla_container_facts.py, kolla_docker.py, kolla_toolbox.py, merge_configs.py, and merge_yaml.py. The first four are covered below; the last two are empty files whose real code lives in the action_plugins directory (the empty files map the same-named actions so they can be invoked like modules).
bslurp.py handles file distribution through two functions: copy_from_host (I found no callers) and copy_to_host (used by the ceph role to push keyrings down to the osd nodes). Which path runs is decided by the dest module parameter. See the inline comments.
def copy_from_host(module):
# (a fair amount of code omitted here)
module.exit_json(content=base64.b64encode(data), sha1=sha1, mode=mode,
source=src)
def copy_to_host(module):
compress = module.params.get('compress')
dest = module.params.get('dest')
mode = int(module.params.get('mode'), 0)
sha1 = module.params.get('sha1')
# src holds the encoded payload
src = module.params.get('src')
# decode the base64-encoded data
data = base64.b64decode(src)
# decompress if requested
raw_data = zlib.decompress(data) if compress else data
# verify integrity against the sha1 checksum
if sha1:
if os.path.exists(dest):
if os.access(dest, os.R_OK):
with open(dest, 'rb') as f:
if hashlib.sha1(f.read()).hexdigest() == sha1:
module.exit_json(changed=False)
else:
module.exit_json(failed=True, changed=False,
msg='file is not accessible: {}'.format(dest))
if sha1 != hashlib.sha1(raw_data).hexdigest():
module.exit_json(failed=True, changed=False,
msg='sha1 sum does not match data')
# Write the data to dest. This is not robust against a full disk: a safer approach writes to a temp file first and renames it to dest, otherwise a failed write can leave the file truncated or empty.
with os.fdopen(os.open(dest, os.O_WRONLY | os.O_CREAT, mode), 'wb') as f:
f.write(raw_data)
# exit through exit_json, as the module interface requires
module.exit_json(changed=True)
def main():
# define the parameter dict, as the ansible module API requires
argument_spec = dict(
compress=dict(default=True, type='bool'),
dest=dict(type='str'),
mode=dict(default='0644', type='str'),
sha1=dict(default=None, type='str'),
src=dict(required=True, type='str')
)
# create the ansible module object
module = AnsibleModule(argument_spec)
# fetch the dest parameter
dest = module.params.get('dest')
try:
if dest:
# dest present: push the file down to the target host
copy_to_host(module)
else:
# dest absent: pull the file instead
copy_from_host(module)
except Exception:
# exit on failure, per the custom-module convention
module.exit_json(failed=True, changed=True,
msg=repr(traceback.format_exc()))
# import module snippets
from ansible.module_utils.basic import * # noqa
if __name__ == '__main__':
main()
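As the comment in copy_to_host notes, writing dest in place can leave a truncated file if the disk fills mid-write. A safer pattern (a sketch of the suggested fix, not kolla-ansible code) writes to a temporary file in the same directory and renames it over the destination; os.rename is atomic on POSIX within a single filesystem:

```python
import os
import tempfile

def atomic_write(dest, data, mode=0o644):
    # write into a temp file in the same directory as dest so the final
    # rename stays on one filesystem and is therefore atomic
    dirname = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
        os.chmod(tmp_path, mode)
        os.rename(tmp_path, dest)  # replaces dest in one step
    except Exception:
        os.unlink(tmp_path)
        raise
```

If the write fails, dest is left untouched and the temp file is cleaned up.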
kolla_docker.py implements the container operations. Every OpenStack component is deployed in a container, so every role's deployment uses this module; it is central to kolla-ansible.
...
# return the docker client class
def get_docker_client():
try:
return docker.Client
except AttributeError:
return docker.APIClient
class DockerWorker(object):
def __init__(self, module):
# constructor; the argument is an AnsibleModule object
self.module = module
# pass the module params through
self.params = self.module.params
self.changed = False
# TLS not fully implemented
# tls_config = self.generate_tls()
# build a docker client object
options = {
'version': self.params.get('api_version')
}
self.dc = get_docker_client()(**options)
# ....
# ....
# start a container; one of the module's actions
def start_container(self):
# pull the image if it is missing
if not self.check_image():
self.pull_image()
# inspect the container
container = self.check_container()
# if the container exists but differs from the desired state, remove it and re-check
if container and self.check_container_differs():
self.stop_container()
self.remove_container()
container = self.check_container()
# if no container exists, create it and re-check
if not container:
self.create_container()
container = self.check_container()
# start the container if it is not already up
if not container['Status'].startswith('Up '):
self.changed = True
self.dc.start(container=self.params.get('name'))
# We do not want to detach so we wait around for container to exit
# when not detaching, wait for the container to exit and report failure via fail_json on a non-zero return code
if not self.params.get('detach'):
rc = self.dc.wait(self.params.get('name'))
if rc != 0:
self.module.fail_json(
failed=True,
changed=True,
msg="Container exited with non-zero return code"
)
# if remove_on_exit is set, stop and remove the container
if self.params.get('remove_on_exit'):
self.stop_container()
self.remove_container()
def generate_module():
# NOTE(jeffrey4l): add empty string '' to choices let us use
# pid_mode: "{{ service.pid_mode | default ('') }}" in yaml
# define the parameter dict, per the ansible module API
argument_spec = dict(
common_options=dict(required=False, type='dict', default=dict()),
# action is required, a str, and must be one of the listed choices
action=dict(required=True, type='str',
choices=['compare_container', 'compare_image',
'create_volume', 'get_container_env',
'get_container_state', 'pull_image',
'recreate_or_restart_container',
'remove_container', 'remove_volume',
'restart_container', 'start_container',
'stop_container']),
api_version=dict(required=False, type='str', default='auto'),
auth_email=dict(required=False, type='str'),
auth_password=dict(required=False, type='str'),
auth_registry=dict(required=False, type='str'),
auth_username=dict(required=False, type='str'),
detach=dict(required=False, type='bool', default=True),
labels=dict(required=False, type='dict', default=dict()),
name=dict(required=False, type='str'),
environment=dict(required=False, type='dict'),
image=dict(required=False, type='str'),
ipc_mode=dict(required=False, type='str', choices=['host', '']),
cap_add=dict(required=False, type='list', default=list()),
security_opt=dict(required=False, type='list', default=list()),
pid_mode=dict(required=False, type='str', choices=['host', '']),
privileged=dict(required=False, type='bool', default=False),
graceful_timeout=dict(required=False, type='int', default=10),
remove_on_exit=dict(required=False, type='bool', default=True),
restart_policy=dict(required=False, type='str', choices=[
'no',
'never',
'on-failure',
'always',
'unless-stopped']),
restart_retries=dict(required=False, type='int', default=10),
tls_verify=dict(required=False, type='bool', default=False),
tls_cert=dict(required=False, type='str'),
tls_key=dict(required=False, type='str'),
tls_cacert=dict(required=False, type='str'),
volumes=dict(required=False, type='list'),
volumes_from=dict(required=False, type='list')
)
# required_if expresses parameter dependencies per the ansible module API: e.g. the start_container action requires both image and name.
required_if = [
['action', 'pull_image', ['image']],
['action', 'start_container', ['image', 'name']],
['action', 'compare_container', ['name']],
['action', 'compare_image', ['name']],
['action', 'create_volume', ['name']],
['action', 'get_container_env', ['name']],
['action', 'get_container_state', ['name']],
['action', 'recreate_or_restart_container', ['name']],
['action', 'remove_container', ['name']],
['action', 'remove_volume', ['name']],
['action', 'restart_container', ['name']],
['action', 'stop_container', ['name']]
]
# instantiate the module
module = AnsibleModule(
argument_spec=argument_spec,
required_if=required_if,
bypass_checks=False
)
# the remainder merges the environment, common options, and per-task overrides
new_args = module.params.pop('common_options', dict())
# NOTE(jeffrey4l): merge the environment
env = module.params.pop('environment', dict())
if env:
new_args['environment'].update(env)
for key, value in module.params.items():
if key in new_args and value is None:
continue
new_args[key] = value
# if pid_mode = ""/None/False, remove it
if not new_args.get('pid_mode', False):
new_args.pop('pid_mode', None)
# if ipc_mode = ""/None/False, remove it
if not new_args.get('ipc_mode', False):
new_args.pop('ipc_mode', None)
module.params = new_args
# return the AnsibleModule instance
return module
def main():
module = generate_module()
try:
dw = DockerWorker(module)
# TODO(inc0): We keep it bool to have ansible deal with consistent
# types. If we ever add method that will have to return some
# meaningful data, we need to refactor all methods to return dicts.
# result is just a boolean: whether the selected action succeeded. Not very informative.
result = bool(getattr(dw, module.params.get('action'))())
module.exit_json(changed=dw.changed, result=result)
except Exception:
module.exit_json(failed=True, changed=True,
msg=repr(traceback.format_exc()))
# import module snippets
from ansible.module_utils.basic import * # noqa
if __name__ == '__main__':
main()
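The parameter-merge logic at the end of generate_module can be distilled into a standalone sketch (merge_args is a made-up name; the real code operates on module.params in place and assumes common_options already carries an environment dict):

```python
def merge_args(common_options, params):
    # per-task values override common_options, except that a None task
    # value keeps the common_options default
    new_args = dict(common_options)
    env = params.pop('environment', None) or {}
    if env:
        new_args.setdefault('environment', {}).update(env)
    for key, value in params.items():
        if key in new_args and value is None:
            continue
        new_args[key] = value
    # pid_mode / ipc_mode values of ""/None/False are dropped entirely
    for key in ('pid_mode', 'ipc_mode'):
        if not new_args.get(key, False):
            new_args.pop(key, None)
    return new_args
```

This is why a role can set docker_common_options once and let each task override only what it needs.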
kolla_toolbox.py runs ansible commands inside the toolbox container. Not expanded in full; only the key code is shown.
# build the command line: an ansible invocation
def gen_commandline(params):
command = ['ansible', 'localhost']
....
....
return command
# main function
def main():
....
....
client = get_docker_client()(
version=module.params.get('api_version'))
# build the command line (a list)
command_line = gen_commandline(module.params)
# list the running containers named kolla_toolbox
kolla_toolbox = client.containers(filters=dict(name='kolla_toolbox',
status='running'))
if not kolla_toolbox:
module.fail_json(msg='kolla_toolbox container is not running.')
# normally there is only one, so take the first element. Reusing the kolla_toolbox variable name here is a readability pitfall.
kolla_toolbox = kolla_toolbox[0]
# execute the command inside the container
job = client.exec_create(kolla_toolbox, command_line)
output = client.exec_start(job)
....
....
module.exit_json(**ret)
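The shape of the command gen_commandline builds can be sketched as follows (simplified and hypothetical; the upstream function handles more parameters, such as module_extra_vars):

```python
def gen_commandline(params):
    # an `ansible localhost` invocation that runs the requested module
    # with its arguments inside the toolbox container
    command = ['ansible', 'localhost']
    if params.get('module_name'):
        command.extend(['-m', params['module_name']])
    if params.get('module_args'):
        args = ' '.join('{}={}'.format(k, v)
                        for k, v in sorted(params['module_args'].items()))
        command.extend(['-a', args])
    return command
```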
kolla_container_facts.py calls the docker python client to fetch facts about the named containers; only a name needs to be passed. The result has the shape dict(changed=xxx, _containers=[]). The code is not expanded here.
ansible/action_plugins/
This directory holds the custom action plugins, which run on the control node. Defining same-named empty files under library lets them be used as modules. There are two files, merge_configs.py and merge_yaml.py, which merge conf and yml configuration files respectively; merge_yaml.py is analyzed here.
merge_yaml.py takes multiple yml files through the task's sources parameter, merges them, and writes the result to dest on the target node. During the merge it also renders template variables, and finally invokes the copy module to ship the rendered data across. Annotated code follows.
from ansible.plugins import action
# subclass action.ActionBase
class ActionModule(action.ActionBase):
TRANSFERS_FILES = True
def read_config(self, source):
result = None
# Only use config if present
if os.access(source, os.R_OK):
with open(source, 'r') as f:
template_data = f.read()
# Render the template data up front: by the time the copy module
# runs, some variables may no longer be available, so templating
# has to happen here first.
template_data = self._templar.template(template_data)
# parse the YAML into a dict
result = safe_load(template_data)
return result or {}
# the method every custom action plugin must implement
def run(self, tmp=None, task_vars=None):
# task_vars carries the variables passed into the task from outside:
# host vars, group vars, config vars, etc.
if task_vars is None:
task_vars = dict()
# a custom action plugin must call the parent class's run method
result = super(ActionModule, self).run(tmp, task_vars)
# NOTE(jeffrey4l): Ansible 2.1 add a remote_user param to the
# _make_tmp_path function. inspect the number of the args here. In
# this way, ansible 2.0 and ansible 2.1 are both supported
# create the tmp directory, compatible with ansible 2.0 and later
make_tmp_path_args = inspect.getargspec(self._make_tmp_path)[0]
if not tmp and len(make_tmp_path_args) == 1:
tmp = self._make_tmp_path()
if not tmp and len(make_tmp_path_args) == 2:
remote_user = (task_vars.get('ansible_user')
or self._play_context.remote_user)
tmp = self._make_tmp_path(remote_user)
# save template args:
# _task.args holds this task's parameters; stash the value under
# the 'vars' key into extra_vars
extra_vars = self._task.args.get('vars', list())
# back up the templar's available variables
old_vars = self._templar._available_variables
# merge all of task_vars and extra_vars into temp_vars
temp_vars = task_vars.copy()
temp_vars.update(extra_vars)
# hand the merged variables to the templar (template object)
self._templar.set_available_variables(temp_vars)
output = {}
# fetch the task's sources value: either a single file
# or a list of files
sources = self._task.args.get('sources', None)
# normalize a single item into a one-element list
if not isinstance(sources, list):
sources = [sources]
# iterate over sources, reading and merging the contents of each file;
# dict.update deduplicates keys, which acts as the merge
for source in sources:
output.update(self.read_config(source))
# restore original vars
self._templar.set_available_variables(old_vars)
# transfer the merged output to the remote target host; the remote path is kept in xfered
remote_path = self._connection._shell.join_path(tmp, 'src')
xfered = self._transfer_data(remote_path,
dump(output,
default_flow_style=False))
# copy this task's args to serve as the new module's args, new_module_args
new_module_args = self._task.args.copy()
# set src in new_module_args, as the copy module expects
new_module_args.update(
dict(
src=xfered
)
)
# drop the sources parameter, which the copy module does not accept
del new_module_args['sources']
# run the copy module with the new args new_module_args and task_vars
result.update(self._execute_module(module_name='copy',
module_args=new_module_args,
task_vars=task_vars,
tmp=tmp))
# return result, as the action plugin interface requires
return result
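The merge in run() is a plain dict.update across the loaded sources, which makes it a shallow merge: later files win on top-level keys, and nested mappings are replaced wholesale rather than deep-merged. A minimal sketch of that behavior (operating on dicts as yaml.safe_load would return them; merge_loaded_yaml is a made-up name):

```python
def merge_loaded_yaml(docs):
    # docs: list of dicts, one per source file, in order
    output = {}
    for doc in docs:
        output.update(doc or {})
    return output

merged = merge_loaded_yaml([
    {'a': 1, 'nested': {'x': 1, 'y': 2}},
    {'b': 2, 'nested': {'x': 9}},
])
# note: merged['nested'] is {'x': 9} -- the earlier nested mapping is gone
```

This is worth remembering when layering custom config files over the role templates: overriding one nested key replaces the whole mapping.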
ansible/inventory/all-in-one
# the control group contains the local node localhost, connected via local
[control]
localhost ansible_connection=local
[network]
localhost ansible_connection=local
# the neutron group contains every node in the network group
[neutron:children]
network
# Neutron
# the neutron-server group contains every node in the control group
[neutron-server:children]
control
# the neutron-dhcp-agent group contains every node in the neutron group
[neutron-dhcp-agent:children]
neutron
[neutron-l3-agent:children]
neutron
[neutron-lbaas-agent:children]
neutron
[neutron-metadata-agent:children]
neutron
[neutron-vpnaas-agent:children]
neutron
[neutron-bgp-dragent:children]
neutron
ansible/site.yml
# call ansible's setup module to gather facts for the nodes; gather_facts is set to false to keep ansible from gathering facts a second time
- name: Gather facts for all hosts
hosts: all
serial: '{{ serial|default("0") }}'
gather_facts: false
tasks:
- setup:
tags: always
# NOTE(pbourke): This case covers deploying subsets of hosts using --limit. The
# limit arg will cause the first play to gather facts only about that node,
# meaning facts such as IP addresses for rabbitmq nodes etc. will be undefined
# in the case of adding a single compute node.
# We don't want to add the delegate parameters to the above play as it will
# result in ((num_nodes-1)^2) number of SSHs when running for all nodes
# which can be very inefficient.
- name: Gather facts for all hosts (if using --limit)
hosts: all
serial: '{{ serial|default("0") }}'
gather_facts: false
tasks:
- setup:
delegate_facts: True
delegate_to: "{{ item }}"
with_items: "{{ groups['all'] }}"
when:
- (play_hosts | length) != (groups['all'] | length)
# Detect the openstack_release variable. By default it is not set in
# globals.yml, and ansible/group_vars/all.yml defaults it to auto. When it
# is auto, these two tasks query the installed kolla-ansible version via
# python's pbr package and assign that version to openstack_release,
# using ansible's built-in local_action and register.
- name: Detect openstack_release variable
hosts: all
gather_facts: false
tasks:
- name: Get current kolla-ansible version number
local_action: command python -c "import pbr.version; print(pbr.version.VersionInfo('kolla-ansible'))"
register: kolla_ansible_version
changed_when: false
when: openstack_release == "auto"
- name: Set openstack_release variable
set_fact:
openstack_release: "{{ kolla_ansible_version.stdout }}"
when: openstack_release == "auto"
tags: always
# run prechecks against all nodes, provided the action passed to ansible-playbook is precheck
- name: Apply role prechecks
gather_facts: false
hosts:
- all
roles:
- role: prechecks
when: action == "precheck"
# deploy the ntp time-sync chrony role to the chrony-server and chrony
# host groups, provided enable_chrony is yes; it can be set in
# etc/kolla/globals.yml and defaults to no
- name: Apply role chrony
gather_facts: false
hosts:
- chrony-server
- chrony
serial: '{{ serial|default("0") }}'
roles:
- { role: chrony,
tags: chrony,
when: enable_chrony | bool }
# deploy the neutron role; beyond the neutron host groups this also targets the compute and manila-share (OpenStack's shared filesystem component) groups
- name: Apply role neutron
gather_facts: false
hosts:
- neutron-server
- neutron-dhcp-agent
- neutron-l3-agent
- neutron-lbaas-agent
- neutron-metadata-agent
- neutron-vpnaas-agent
- compute
- manila-share
serial: '{{ serial|default("0") }}'
roles:
- { role: neutron,
tags: neutron,
when: enable_neutron | bool }
ansible/roles/neutron/tasks, precheck scenario
The action for this scenario is precheck; tasks/main.yml includes precheck.yml.
---
# kolla_container_facts is the custom library module analyzed above; it fetches
# container facts for the container named neutron_server and registers them into container_facts
- name: Get container facts
kolla_container_facts:
name:
- neutron_server
register: container_facts
# when container_facts has no neutron_server key and this host is in the neutron-server group,
# check that neutron_server_port is free (state: stopped)
- name: Checking free port for Neutron Server
wait_for:
host: "{{ hostvars[inventory_hostname]['ansible_' + api_interface]['ipv4']['address'] }}"
port: "{{ neutron_server_port }}"
connect_timeout: 1
timeout: 1
state: stopped
when:
- container_facts['neutron_server'] is not defined
- inventory_hostname in groups['neutron-server']
# when enable_neutron_agent_ha is true but fewer than two dhcp or l3 nodes are planned, fail with a message
- name: Checking number of network agents
local_action: fail msg="Number of network agents are less than two when enabling agent ha"
changed_when: false
when:
- enable_neutron_agent_ha | bool
- groups['neutron-dhcp-agent'] | length < 2
or groups['neutron-l3-agent'] | length < 2
# When MountFlags is set to shared, a signal bit configured on 20th bit of a number
# We need to check the 20th bit. 2^20 = 1048576. So we are validating against it.
# check that the docker service's MountFlags is set to shared
- name: Checking if 'MountFlags' for docker service is set to 'shared'
command: systemctl show docker
register: result
changed_when: false
failed_when: result.stdout.find('MountFlags=1048576') == -1
when:
- (inventory_hostname in groups['neutron-dhcp-agent']
or inventory_hostname in groups['neutron-l3-agent']
or inventory_hostname in groups['neutron-metadata-agent'])
- ansible_os_family == 'RedHat' or ansible_distribution == 'Ubuntu'
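The bitmask trick the comment describes can be checked directly: "shared" mount propagation is encoded on bit 20 of MountFlags, and 2**20 is 1048576. A Python sketch of the same test (mount_flags_shared is a hypothetical helper, not kolla-ansible code):

```python
MS_SHARED = 1 << 20  # 1048576, the value the precheck greps for

def mount_flags_shared(systemctl_output):
    # scan `systemctl show docker` output for the MountFlags property
    for line in systemctl_output.splitlines():
        if line.startswith('MountFlags='):
            value = int(line.split('=', 1)[1])
            return bool(value & MS_SHARED)
    return False
```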
ansible/roles/neutron/tasks, deploy scenario
The action for this scenario is deploy; tasks/main.yml includes deploy.yml.
# enforce ironic usage only with openvswitch
# bare-metal check: the ironic service must be enabled and neutron must use the OpenvSwitch plugin
- include: ironic-check.yml
# run registration on the neutron-server nodes
- include: register.yml
when: inventory_hostname in groups['neutron-server']
# apply the configuration: copying config files and starting the component containers mostly happens here
- include: config.yml
# with the nova fake driver enabled, compute nodes run config-neutron-fake.yml (not analyzed in detail).
# The fake driver can run multiple docker containers hosting nova-compute on a single compute node.
# Nova fake driver can not work with all-in-one deployment. This is because the fake
# neutron-openvswitch-agent for the fake nova-compute container conflicts with
# neutron-openvswitch-agent on the compute nodes. Therefore, in the inventory
# the network node must be different than the compute node.
- include: config-neutron-fake.yml
when:
- enable_nova_fake | bool
- inventory_hostname in groups['compute']
# on the neutron-server nodes, create the database and the containers:
# bootstrap.yml creates the database objects and then includes bootstrap_service.yml, which creates the containers on the server nodes
- include: bootstrap.yml
when: inventory_hostname in groups['neutron-server']
# flush the handler tasks defined under the handlers directory
- name: Flush Handlers
meta: flush_handlers
register.yml registers neutron's auth information with keystone.
---
# Create neutron's service and endpoints in keystone.
# kolla_toolbox (see the library analysis) runs ansible commands inside the toolbox container.
# The kolla_keystone_service module comes from kolla-ansible's parent project kolla and is already a callable ansible module.
# The service is named neutron, with internal, admin, and public endpoints.
# The variables live mainly in ansible/roles/neutron/defaults/main.yml and ansible/group_vars/all.yml.
- name: Creating the Neutron service and endpoint
kolla_toolbox:
module_name: "kolla_keystone_service"
module_args:
service_name: "neutron"
service_type: "network"
description: "Openstack Networking"
endpoint_region: "{{ openstack_region_name }}"
url: "{{ item.url }}"
interface: "{{ item.interface }}"
region_name: "{{ openstack_region_name }}"
auth: "{{ '{{ openstack_neutron_auth }}' }}"
module_extra_vars:
openstack_neutron_auth: "{{ openstack_neutron_auth }}"
run_once: True
with_items:
- {'interface': 'admin', 'url': '{{ neutron_admin_endpoint }}'}
- {'interface': 'internal', 'url': '{{ neutron_internal_endpoint }}'}
- {'interface': 'public', 'url': '{{ neutron_public_endpoint }}'}
# likewise, create the project, user, and role. On inspection, the openstack_neutron_auth variable is actually openstack's admin auth.
- name: Creating the Neutron project, user, and role
kolla_toolbox:
module_name: "kolla_keystone_user"
module_args:
project: "service"
user: "{{ neutron_keystone_user }}"
password: "{{ neutron_keystone_password }}"
role: "admin"
region_name: "{{ openstack_region_name }}"
auth: "{{ '{{ openstack_neutron_auth }}' }}"
module_extra_vars:
openstack_neutron_auth: "{{ openstack_neutron_auth }}"
run_once: True
config.yml merges and distributes the configuration files, then creates or restarts the containers.
# use the sysctl module to configure ip forwarding and related settings
- name: Setting sysctl values
vars:
neutron_l3_agent: "{{ neutron_services['neutron-l3-agent'] }}"
neutron_vpnaas_agent: "{{ neutron_services['neutron-vpnaas-agent'] }}"
sysctl: name={{ item.name }} value={{ item.value }} sysctl_set=yes
with_items:
- { name: "net.ipv4.ip_forward", value: 1}
- { name: "net.ipv4.conf.all.rp_filter", value: 0}
- { name: "net.ipv4.conf.default.rp_filter", value: 0}
when:
- set_sysctl | bool
- (neutron_l3_agent.enabled | bool and neutron_l3_agent.host_in_groups | bool)
or (neutron_vpnaas_agent.enabled | bool and neutron_vpnaas_agent.host_in_groups | bool)
# create the config directory for each neutron service; the main gating condition is
# host_in_groups, defined in detail in ansible/roles/neutron/defaults/main.yml
- name: Ensuring config directories exist
file:
path: "{{ node_config_directory }}/{{ item.key }}"
state: "directory"
recurse: yes
when:
- item.value.enabled | bool
- item.value.host_in_groups | bool
with_dict: "{{ neutron_services }}"
....
....
# distribute config files to the target directory, merging three sources into one (merge_configs was analyzed under action plugins).
# Once the files are delivered, notify the affected component containers to restart, via the handlers directory.
# The restart invokes the recreate_or_restart_container action, which creates the container on the first run.
- name: Copying over neutron_lbaas.conf
vars:
service_name: "{{ item.key }}"
services_need_neutron_lbaas_conf:
- "neutron-server"
- "neutron-lbaas-agent"
merge_configs:
sources:
- "{{ role_path }}/templates/neutron_lbaas.conf.j2"
- "{{ node_custom_config }}/neutron/neutron_lbaas.conf"
- "{{ node_custom_config }}/neutron/{{ inventory_hostname }}/neutron_lbaas.conf"
dest: "{{ node_config_directory }}/{{ item.key }}/neutron_lbaas.conf"
register: neutron_lbaas_confs
when:
- item.value.enabled | bool
- item.value.host_in_groups | bool
- item.key in services_need_neutron_lbaas_conf
with_dict: "{{ neutron_services }}"
notify:
- "Restart {{ item.key }} container"
....
....
# kolla_docker is the custom module; the compare_container action checks every neutron service container on this node
- name: Check neutron containers
kolla_docker:
action: "compare_container"
common_options: "{{ docker_common_options }}"
name: "{{ item.value.container_name }}"
image: "{{ item.value.image }}"
privileged: "{{ item.value.privileged | default(False) }}"
volumes: "{{ item.value.volumes }}"
register: check_neutron_containers
when:
- action != "config"
- item.value.enabled | bool
- item.value.host_in_groups | bool
with_dict: "{{ neutron_services }}"
notify:
- "Restart {{ item.key }} container"
bootstrap.yml creates the neutron database objects.
---
# the custom kolla_toolbox module runs the mysql_db ansible module inside the toolbox container to create the db;
# delegate_to pins execution to the first neutron-server, and run_once runs it exactly once
- name: Creating Neutron database
kolla_toolbox:
module_name: mysql_db
module_args:
login_host: "{{ database_address }}"
login_port: "{{ database_port }}"
login_user: "{{ database_user }}"
login_password: "{{ database_password }}"
name: "{{ neutron_database_name }}"
register: database
run_once: True
delegate_to: "{{ groups['neutron-server'][0] }}"
# create the neutron database user and grant permissions
- name: Creating Neutron database user and setting permissions
kolla_toolbox:
module_name: mysql_user
module_args:
login_host: "{{ database_address }}"
login_port: "{{ database_port }}"
login_user: "{{ database_user }}"
login_password: "{{ database_password }}"
name: "{{ neutron_database_name }}"
password: "{{ neutron_database_password }}"
host: "%"
priv: "{{ neutron_database_name }}.*:ALL"
append_privs: "yes"
run_once: True
delegate_to: "{{ groups['neutron-server'][0] }}"
# when the database changed, include bootstrap_service.yml
- include: bootstrap_service.yml
when: database.changed
bootstrap_service.yml creates the bootstrap_neutron, bootstrap_neutron_lbaas_agent, and bootstrap_neutron_vpnaas_agent containers.
ansible/roles/neutron/handlers/main.yml recreates or restarts the neutron containers; one handler is analyzed below.
# Worked example: neutron_lbaas_confs is the variable registered during the earlier run; the task variable
# neutron_lbaas_conf extracts this service's entry from it for the when condition.
# The volumes parameter of kolla_docker comes from roles/neutron/defaults/main.yml, e.g.:
#volumes:
# - "{{ node_config_directory }}/neutron-lbaas-agent/:{{
# container_config_directory }}/:ro"
# - "/etc/localtime:/etc/localtime:ro"
# - "/run:/run:shared"
# - "kolla_logs:/var/log/kolla/"
# config.yml has already delivered the neutron_lbaas config files into the
# {{ node_config_directory }}/neutron-lbaas-agent/ directory. The in-container path variable
# container_config_directory is defined in group_vars as /var/lib/kolla/config_files.
- name: Restart neutron-server container
vars:
service_name: "neutron-server"
service: "{{ neutron_services[service_name] }}"
config_json: "{{ neutron_config_jsons.results|selectattr('item.key', 'equalto', service_name)|first }}"
neutron_conf: "{{ neutron_confs.results|selectattr('item.key', 'equalto', service_name)|first }}"
neutron_lbaas_conf: "{{ neutron_lbaas_confs.results|selectattr('item.key', 'equalto', service_name)|first }}"
neutron_ml2_conf: "{{ neutron_ml2_confs.results|selectattr('item.key', 'equalto', service_name)|first }}"
policy_json: "{{ policy_jsons.results|selectattr('item.key', 'equalto', service_name)|first }}"
neutron_server_container: "{{ check_neutron_containers.results|selectattr('item.key', 'equalto', service_name)|first }}"
kolla_docker:
action: "recreate_or_restart_container"
common_options: "{{ docker_common_options }}"
name: "{{ service.container_name }}"
image: "{{ service.image }}"
volumes: "{{ service.volumes }}"
privileged: "{{ service.privileged | default(False) }}"
when:
- action != "config"
- service.enabled | bool
- service.host_in_groups | bool
- config_json | changed
or neutron_conf | changed
or neutron_lbaas_conf | changed
or neutron_vpnaas_conf | changed
or neutron_ml2_conf | changed
or policy_json | changed
or neutron_server_container | changed
ansible/roles/neutron/tasks, image pull scenario
pull.yml pulls the images:
# pull the images via kolla_docker's pull_image action
- name: Pulling neutron images
kolla_docker:
action: "pull_image"
common_options: "{{ docker_common_options }}"
image: "{{ item.value.image }}"
when:
- item.value.enabled | bool
- item.value.host_in_groups | bool
with_dict: "{{ neutron_services }}"
Process Notes
Image path
A fragment from kolla-ansible/ansible/roles/chrony/defaults/main.yml:
# docker_registry is the address of the locally registered registry, configured in globals.yml. The registry VM redirects port 5000 to the host's port 4000.
# docker_namespace is also defined in globals.yml; when pulling source images from the official site, set it to lokolla.
# kolla_base_distro defaults to centos; ubuntu is also supported.
# kolla_install_type defaults to binary; we set it to source in globals.yml.
Example:
docker_registry: 192.168.102.15:4000
docker_namespace: lokolla
kolla_base_distro: "centos"
kolla_install_type: source
openstack_release: auto (auto-detected, as described earlier)
With these values the final chrony_image_full is 192.168.102.15:4000/lokolla/centos-source-chrony:4.0.2:
chrony_image: "{{ docker_registry ~ '/' if docker_registry else '' }}{{ docker_namespace }}/{{ kolla_base_distro }}-{{ kolla_install_type }}-chrony"
chrony_tag: "{{ openstack_release }}"
chrony_image_full: "{{ chrony_image }}:{{ chrony_tag }}"
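Translating the Jinja expression above into Python confirms the composed image reference (chrony_image_full here is an illustrative function, not kolla-ansible code):

```python
def chrony_image_full(docker_registry, docker_namespace, kolla_base_distro,
                      kolla_install_type, openstack_release):
    # mirror of the Jinja: registry prefix only when docker_registry is set
    prefix = docker_registry + '/' if docker_registry else ''
    image = '{}{}/{}-{}-chrony'.format(prefix, docker_namespace,
                                       kolla_base_distro, kolla_install_type)
    return '{}:{}'.format(image, openstack_release)

print(chrony_image_full('192.168.102.15:4000', 'lokolla',
                        'centos', 'source', '4.0.2'))
# -> 192.168.102.15:4000/lokolla/centos-source-chrony:4.0.2
```

Every other role's image variables follow the same registry/namespace/distro-type-name:tag pattern.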