[openstack-dev] [kolla] Notes on Ansible fact gathering
Paul Bourke
paul.bourke at oracle.com
Wed Nov 23 11:12:40 UTC 2016
There are some concerns internally around Ansible fact gathering, and how
it relates to adding / removing individual nodes in Kolla. As I've spent
a fair bit of time in this area I decided to send some info around both
for anyone confused about this in the future and also to tease out any
ways we might be able to do this better.
Here is some info on how Ansible fact gathering works, and how it
relates to decisions around adding / removing single nodes.
Many roles need to know info about other nodes in the cluster. This is
true regardless of whether they are being deployed individually or as
part of a larger deploy. As Ansible is a push-based tool, the only way
they can get this information is to ssh to those nodes and gather that
info in the form of 'facts'.
Ideally, Ansible would only touch (i.e. run fact gathering on) nodes
referenced inside its play[2]. However, it is not smart enough to do
this, and so each node a play cares about must be listed explicitly up
front in the playbook. In the past this has been a maintenance burden:
any time a new role was added that, for example, needed to know the
IP addresses of all rabbitmq servers, care had to be taken to also
list that group of nodes under the new play. If this wasn't done
correctly, it led to the commonly reported error "'dict object' has no
attribute u'ansible_eth0'".
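To make the failure mode concrete, here is a minimal sketch of the
pattern described above. It is not taken from the Kolla playbooks: the
group name "myservice" and the debug task are made up for illustration,
and it assumes every rabbitmq host has an eth0 interface.

```yaml
# Hypothetical playbook fragment; group names are illustrative only.

# This play does no work of its own. It exists so that Ansible visits
# the rabbitmq hosts and populates hostvars with their facts.
- name: Gather facts from the hosts the next play refers to
  hosts: rabbitmq
  gather_facts: true

- name: Deploy a service that needs the rabbitmq addresses
  hosts: myservice
  gather_facts: true
  tasks:
    # If the play above is left out, the rabbitmq hosts were never
    # visited in this run, hostvars[item] has no 'ansible_eth0' key,
    # and the lookup below fails with the "'dict object' has no
    # attribute" error.
    - name: Show the address of every rabbitmq server
      debug:
        msg: "{{ hostvars[item]['ansible_eth0']['ipv4']['address'] }}"
      with_items: "{{ groups['rabbitmq'] }}"
```

The maintenance burden comes from having to keep the first play's host
list in sync with whatever groups the second play's templates happen to
reference.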
After discussion and research it was determined by Kolla[0][1] that the
most reliable and pragmatic way to solve this was to gather facts about
all nodes up front, once. In the case of a full deploy, this will
essentially happen anyway. For the single node case however, it's not
ideal as it may visit more nodes than needed. It's still my belief that
this is the least error prone solution 'out of the box'. In the cases
where users have 500 control nodes and want to add five more
sequentially (the value of doing more than one sequentially is still not
clear to me), we could look into turning on fact caching. Given that
fact gathering is reasonably fast in my experience, however, especially
when combined with Ansible ssh pipelining, I'm not sure how much this
would even gain.
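For reference, the "gather everything up front, once" approach can be
sketched roughly as a single play placed at the top of the playbook.
This is a simplified illustration and may not match what the reviews in
[0] and [1] actually merged; the "always" tag is my assumption here, so
that the play still runs when a deploy is restricted with --tags.

```yaml
# Simplified sketch, placed before any service plays in the playbook.
- name: Gather facts for all hosts
  hosts: all
  gather_facts: true
  # Assumption: 'always' keeps this play in the run even when the
  # user selects a single service with --tags.
  tags: always
```

Later plays can then read hostvars for any host in the inventory
without listing extra groups just to trigger fact gathering.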
-Paul
[0] https://review.openstack.org/#/c/376524/
[1] https://review.openstack.org/#/c/398313/
[2] https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/nova/templates/nova.conf.j2#L64