[openstack-dev] [kolla] Notes on Ansible fact gathering

Paul Bourke paul.bourke at oracle.com
Wed Nov 23 11:12:40 UTC 2016

There are some concerns internally around Ansible fact gathering, and how 
it relates to adding / removing individual nodes in Kolla. As I've spent 
a fair bit of time in this area I decided to send some info around both 
for anyone confused about this in the future and also to tease out any 
ways we might be able to do this better.

Here is some info on how Ansible fact gathering works, and how it 
relates to decisions around adding / removing single nodes.

Many roles need to know info about other nodes in the cluster. This is 
true regardless of whether they are being deployed individually or as 
part of a larger deploy. As Ansible is a push based tool, the only way 
they can get this information is to ssh to those nodes and gather that 
info in the form of 'facts'.

Ideally, Ansible would only touch (i.e. run fact gathering on) nodes 
referenced inside its play[2]. However, it is not smart enough to do 
this, so each node a play cares about must be listed explicitly up 
front in the playbook. In the past this has been a maintenance burden: 
any time a new role was added that, for example, needed to know the IP 
addresses of all rabbitmq servers, care had to be taken to also list 
that group of nodes under the new play. If this wasn't done correctly, 
it led to the commonly reported error "'dict object' has no attribute 
u'ansible_eth0'".
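
To make that concrete, here is a rough sketch of the old pattern. It is
not taken from the Kolla tree; the "myservice" group and the debug task
are made up for illustration. The rabbitmq group is listed under the play
only so that Ansible gathers its facts. Drop it from the hosts list and
the hostvars lookup fails with the error quoted above.

```yaml
# Hypothetical play, for illustration only.
- name: Configure a service that needs the rabbitmq addresses
  hosts:
    - myservice
    - rabbitmq      # listed only so that its facts get gathered
  gather_facts: true
  tasks:
    - name: Show the address of every rabbitmq server
      debug:
        msg: "{{ hostvars[item]['ansible_eth0']['ipv4']['address'] }}"
      with_items: "{{ groups['rabbitmq'] }}"
      # Only act on the hosts the play is really about.
      when: inventory_hostname in groups['myservice']
```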

After discussion and research it was determined by Kolla[0][1] that the 
most reliable and pragmatic way to solve this was to gather facts about 
all nodes up front, once. In the case of a full deploy, this will 
essentially happen anyway. For the single node case however, it's not 
ideal as it may visit more nodes than needed. It's still my belief that 
this is the least error prone solution 'out of the box'. In the cases 
where users have 500 control nodes and want to add five more 
sequentially (the value of doing more than one sequentially is still not 
clear to me), we could look into turning on fact caching. However, given 
that fact gathering is reasonably fast in my experience, especially when 
combined with Ansible ssh pipelining, I'm not sure how much this would 
even gain.
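
In simplified form, gathering everything up front might look like the
sketch below. This is an assumption about the general shape rather than
the exact change in the reviews linked here, and "myservice" is again a
made-up group name.

```yaml
# Sketch only: one play at the top of the playbook visits every host in
# the inventory once, so hostvars is populated for all later plays.
- name: Gather facts for all hosts
  hosts: all
  gather_facts: true

# Later plays can target a single group and still read other hosts' facts.
- name: Deploy a hypothetical service
  hosts: myservice
  gather_facts: false
  tasks:
    - name: Show the address of every rabbitmq server
      debug:
        msg: "{{ hostvars[item]['ansible_eth0']['ipv4']['address'] }}"
      with_items: "{{ groups['rabbitmq'] }}"
```

Fact caching, should it turn out to be needed, is switched on in
ansible.cfg (the fact_caching and fact_caching_connection settings)
rather than in the playbook itself.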


[0] https://review.openstack.org/#/c/376524/
[1] https://review.openstack.org/#/c/398313/
