[Openstack-operators] Nodes and configurations management in Puppet

Joe Topjian joe at topjian.net
Fri Sep 26 02:30:30 UTC 2014


Hi Jay,

Hiera takes the cake for my love/hate relationship with Puppet. I try
really hard to keep the number of hierarchy levels small, and even then
I find it awkward sometimes. I love the concept of Hiera, but I find it
can be unintuitive. Similar to the other replies, I have a "common"
level where 90% of the data is stored. The other levels either override
"common" or append to it. When I need to know where a parameter is
ultimately configured, I find myself thinking "is that parameter common
across everything or specific to a certain location or node, and if so,
why did I make it specific?", then doing a "grep -R" to find where it's
set, and finally thinking "oh right - that's why it's there".
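
To give a rough idea of the shape, the hierarchy looks something like
this (a sketch, not my exact setup; "location" would be a custom fact):

# hiera.yaml (Hiera 1.x style; level names are illustrative)
:backends:
  - yaml
:hierarchy:
  - "nodes/%{::fqdn}"
  - "locations/%{::location}"
  - common
:yaml:
  :datadir: '/etc/puppet/hieradata'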

Another area of Puppet that I'm finding difficult to work with is
configuring HA environments. There are two main pain points here and
they're pretty applicable to using Puppet with OpenStack:

The first is standing up a cluster of servers, for example MySQL with
Galera. The first server in the cluster gets a blank gcomm setting. This
means I either need a blank gcomm setting in Hiera just for the first
node, or I need to temporarily remove the list of nodes from Hiera and
add it back after the first node is set up. puppetdbquery
<https://github.com/dalen/puppet-puppetdbquery> (a very awesome module)
can also be used. There are pros and cons to all of these approaches.
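
To illustrate the first option, here's a sketch of the bootstrap logic
(the Hiera key and variable names are made up for illustration):

# stdlib's empty() and join() drive the decision: no node list in Hiera
# means this is the bootstrap node, so it gets the blank gcomm.
$cluster_nodes = hiera('galera_cluster_nodes', [])
$node_list     = join($cluster_nodes, ',')
$bootstrap     = empty($cluster_nodes)

$wsrep_cluster_address = $bootstrap ? {
  true    => 'gcomm://',
  default => "gcomm://${node_list}",
}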

After the first node, the rest of the servers are trickier because
Puppet will usually start the MySQL service before all settings are in
place, then restart MySQL once they are. The timing of getting all
settings in place for the new node to connect to the other nodes and
sync with Galera is rarely right. This is fixable by doing a lot of
forced ordering in Puppet, but (at least in my current setup) that would
require a custom MySQL module or modifying the PuppetLabs MySQL module.
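
The forced ordering I mean looks something like this (the file and
template names are illustrative, not from the PuppetLabs module):

# Make sure every wsrep setting exists before the service ever starts,
# instead of letting MySQL start and then restart mid-run.
file { '/etc/mysql/conf.d/wsrep.cnf':
  ensure  => file,
  content => template('profile/mysql/wsrep.cnf.erb'),
}

File['/etc/mysql/conf.d/wsrep.cnf'] -> Service['mysql']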

And finally, just coordinating Puppet runs across multiple cluster
members is a bear. It can take up to three Puppet runs for two nodes to
connect. Think SYN, SYN/ACK, ACK here.

The other HA pain point is creating many-to-one configurations, for
example putting those MySQL nodes behind HAProxy. I want HAProxy to have
an entry for MySQL that lists all nodes. Listing the nodes is easy: each
MySQL server can simply export itself as a valid backend (see the sketch
below). But what about the rest of the HAProxy MySQL config that's
"singular" (for example, the "listen" and "option" settings)? Only one
node can host that data or Puppet will complain about duplicate
resources. I don't want one node responsible for that data because I
feel it creates a "pet" environment vs a "cattle" one.
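
The easy half looks something like this with the puppetlabs-haproxy
module (a sketch, not my exact parameters):

# Each MySQL node exports itself as a backend member...
@@haproxy::balancermember { $::fqdn:
  listening_service => 'mysql',
  ports             => '3306',
  server_names      => $::hostname,
  ipaddresses       => $::ipaddress,
  options           => 'check',
}

# ...and the HAProxy node collects every export:
Haproxy::Balancermember <<| listening_service == 'mysql' |>>
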
There are two hacks I've used to get around the "singular" side:

The first is to declare resources on each MySQL server with unique
resource titles but the same content. A simple example:

# Declared on (or exported by) node-1:
file_line { '/etc/foobar.conf for node-1':
  path => '/etc/foobar.conf',
  line => 'Hello, World!',
}

# Declared on (or exported by) node-2; same line, unique title:
file_line { '/etc/foobar.conf for node-2':
  path => '/etc/foobar.conf',
  line => 'Hello, World!',
}

Now if these resources are exported and collected on a central server,
Puppet will not complain about duplicate resources (the titles differ),
and the line will exist exactly once in foobar.conf.
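
In exported form, that looks something like this (sketch):

# On each MySQL node, the hostname makes the title unique:
@@file_line { "/etc/foobar.conf for ${::hostname}":
  path => '/etc/foobar.conf',
  line => 'Hello, World!',
}

# On the central server (a real setup would filter by tag):
File_line <<| |>>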

The second hack is more involved, and a full example would take a lot of
detail. In summary, the Puppet stdlib module has a function called
ensure_resource
<https://github.com/puppetlabs/puppetlabs-stdlib#ensure_resource> that
will only declare a resource if it doesn't already exist. If several
nodes export the same ensure_resource with exactly the same settings,
the single resource will be built on the server that imports them. The
problem here is that if the resource is ever modified, I must comment
out the resource, run "puppet apply --noop" so it's removed from
PuppetDB, make the change, then run Puppet again. If not, the first node
that runs with the new setting will export the modified resource and
Puppet will complain because two resources exist with different data.
This might sound very confusing and awkward, and it is, but it allows me
to store resources like firewall rules, MySQL users, and HAProxy entries
on multiple nodes when those settings should only be applied once in the
entire environment.
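
One way to picture the mechanics (the wrapper type, title, and firewall
rule here are hypothetical, just to show the shape):

# Every node exports one of these with a unique, host-based title,
# exactly like the file_line trick above.
define cluster::galera_firewall ($port = '4567') {
  # ensure_resource collapses the collected duplicates into a single
  # declaration of the underlying firewall rule.
  ensure_resource('firewall', '200 allow galera replication', {
    'port'   => $port,
    'proto'  => 'tcp',
    'action' => 'accept',
  })
}

@@cluster::galera_firewall { "galera firewall for ${::hostname}": }

# And on the node that applies it:
# Cluster::Galera_firewall <<| |>>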

I think a cleaner way of doing this is to introduce service discovery into
my environment, but I haven't had time to look into this in more detail.

I should mention that some of these HA pains can be resolved by just
moving all of the data to the HAProxy nodes themselves. So when I want
to add a new service, such as RabbitMQ, to HAProxy, I add the RabbitMQ
settings to the HAProxy role/profile. But I want HAProxy to be "dumb"
about what it's hosting: I want to be able to use it in a Juju-like
fashion, where I can introduce any arbitrary service and HAProxy
configures itself without prior knowledge of the new service.
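
For clarity, the "move it to HAProxy" approach means adding a stanza
like this to the HAProxy profile for every new service
(puppetlabs-haproxy syntax; the port is illustrative):

haproxy::listen { 'rabbitmq':
  ipaddress => $::ipaddress,
  ports     => '5672',
}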

I say *some* pains, because this isn't applicable to, say, a MySQL cluster
where users and databases are shared across the entire cluster.

There are probably one or two other areas I'm forgetting, but typing the
above has cleared my head. :)

In general, though, I really enjoy working with Puppet. Our current Puppet
configurations allow us to stand up test OpenStack environments with little
manual input as well as upgrade to newer releases of OpenStack with very
little effort. For example, aside from our compute nodes, all other OpenStack
components run in LXC containers. Our Icehouse upgrade procedure consisted
of killing those containers, making new ones based on Ubuntu 14.04, and
letting Puppet do the rest.

Let me know if you'd like more details or clarification.

Thanks,
Joe


On Thu, Sep 25, 2014 at 12:12 PM, Jay Pipes <jaypipes at gmail.com> wrote:

> On 09/25/2014 11:45 AM, Joe Topjian wrote:
>
>> Hi Mathieu,
>>
>> My setup is very similar to yours. Node definitions are in site.pp and
>> Hiera is used for all configuration. The Hiera hierarchies are also very
>> similar.
>>
>> Overall, I have a love/hate relationship with the setup. I could go on
>> in detail, but it'd be all Puppet-specific rather than OpenStack. I'd be
>> happy to discuss off-list.
>>
>> Or if there's enough interest, I can post it here. I just don't want to
>> muddy up this list with non-OpenStack things.
>>
>
> I certainly would enjoy a good discussion here about your love-hate
> relationship with your deployment tooling. I have the same relationship
> with the Chef work we did at AT&T and am curious whether there are
> Puppetisms that similarly cause grief as some Chefisms.
>
> /me readies the popcorn.
>
> Best,
> -jay
>