[openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya bdobreli at redhat.com
Thu Nov 29 09:28:27 UTC 2018


On 11/28/18 8:55 PM, Doug Hellmann wrote:
> I thought the preferred solution for more complex settings was config maps. Did that approach not work out?
> 
> Regardless, now that the driver work is done if someone wants to take another stab at etcd integration it’ll be more straightforward today.
> 
> Doug
> 

While sharing configs is a feasible option to consider for large-scale 
configuration management, Etcd only provides strong consistency, which 
Jepsen classifies as "Unavailable" [0]. For edge scenarios, where we 
need to configure 40,000 remote computes over WAN connections, we'd 
rather want weaker consistency models, like "Sticky Available" [0]. That 
would allow services to fetch their configuration either from a central 
"uplink" or locally, when the former is not accessible from remote edge 
sites. Etcd cannot provide 40,000 local endpoints to fit that case, I'm 
afraid, even if those were read-only replicas. That is also something 
I'm highlighting in the paper [1] drafted for ICFC-2019.
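
To make the "uplink or local" read path a bit more concrete, here is a 
minimal sketch in Python. It is not a real sticky available store and 
not anything TripleO ships; the class name, the fetch_uplink callable 
and the in-memory local copy are all assumptions made up for 
illustration. It only shows a client that prefers the central uplink 
and falls back to its last known local copy when the WAN is down.

```python
from typing import Callable, Dict, Optional

Config = Dict[str, str]


class StickyConfigClient:
    """Hypothetical client: prefer the central uplink, fall back to
    the last locally stored copy when the uplink is unreachable."""

    def __init__(self, fetch_uplink: Callable[[], Config]) -> None:
        # fetch_uplink is an assumed callable that returns the config
        # or raises OSError (e.g. ConnectionError) on a WAN failure.
        self._fetch_uplink = fetch_uplink
        self._local: Optional[Config] = None

    def get(self) -> Config:
        try:
            fresh = self._fetch_uplink()
        except OSError:
            if self._local is None:
                # Nothing was ever fetched, so there is nothing to serve.
                raise
            # Serve the possibly stale local copy instead of failing.
            return dict(self._local)
        # Uplink reachable: refresh the local copy and return it.
        self._local = dict(fresh)
        return dict(fresh)
```

The trade-off is the one described above: a remote site keeps working 
from its local copy, at the price of possibly reading stale values 
until the uplink is reachable again.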

But if we had such a sticky available key-value store, it would indeed 
solve the problem James describes: executing the configuration 
management system separately on each of thousands of nodes.
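
As a rough illustration of the "render once, reuse everywhere" idea, 
the sketch below builds a shared template a single time and only 
substitutes the per-node values. The template contents, option names 
and the render_for_node helper are hypothetical and only stand in for 
whatever a central step would actually produce.

```python
from string import Template

# Shared part, produced once centrally. The option names here are
# illustrative placeholders, not an actual TripleO template.
SHARED_TEMPLATE = Template(
    "[DEFAULT]\n"
    "my_ip = $ip\n"
    "host = $hostname\n"
)


def render_for_node(ip: str, hostname: str) -> str:
    """Fill in only the values that differ between nodes."""
    return SHARED_TEMPLATE.substitute(ip=ip, hostname=hostname)
```

For example, render_for_node("192.0.2.10", "compute-0") fills the two 
placeholders and leaves the rest of the shared text untouched.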

[0] https://jepsen.io/consistency
[1] https://github.com/bogdando/papers-ieee/blob/master/ICFC-2019/LaTeX/position_paper_1570506394.pdf

On 11/28/18 11:22 PM, Dan Prince wrote:
> On Wed, 2018-11-28 at 13:28 -0500, James Slagle wrote:
>> On Wed, Nov 28, 2018 at 12:31 PM Bogdan Dobrelya <bdobreli at redhat.com> wrote:
>>> Long story short, we cannot shoot both rabbits with a single shot,
>>> not with puppet :) Maybe we could with ansible replacing puppet
>>> fully... So splitting config and runtime images is the only choice
>>> yet to address the raised security concerns. And let's forget about
>>> edge cases for now. Tossing around a pair of extra bytes over 40,000
>>> WAN-distributed computes ain't gonna be our biggest problem for
>>> sure.
>>
>> I think it's this last point that is the crux of this discussion. We
>> can agree to disagree about the merits of this proposal and whether
>> it's a pre-optimization or micro-optimization, which I admit are
>> somewhat subjective terms. Ultimately, the "why do we need to do
>> this" seems to be the reason the conversation is going in circles a
>> bit.
>>
>> I'm all for reducing container image size, but the reality is that
>> this proposal doesn't necessarily help us with the Edge use cases we
>> are talking about trying to solve.
>>
>> Why would we even run the exact same puppet binary + manifest
>> individually 40,000 times so that we can produce the exact same set
>> of configuration files that differ only by things such as IP address,
>> hostnames, and passwords? Maybe we should instead be thinking about
>> how we can do that *1* time centrally, and produce a configuration
>> that can be reused across 40,000 nodes with little effort. The
>> opportunity for a significant impact in terms of how we can scale
>> TripleO is much larger if we consider approaching these problems with
>> a wider net of what we could do. There's opportunity for a lot of
>> better reuse in TripleO, configuration is just one area. The plan and
>> Heat stack (within the ResourceGroup) are some other areas.
> 
> We run Puppet for configuration because that is what we did on
> baremetal and we didn't break backwards compatibility for our
> configuration options for upgrades. Our Puppet model relies on being
> executed on each local host in order to splice in the correct IP
> address and hostname. It executes in a distributed fashion, and works
> fairly well considering the history of the project. It is robust,
> guarantees no duplicate configs are being set, and is backwards
> compatible with all the options TripleO supported on baremetal. Puppet
> is arguably better for configuration than Ansible (which is what I hear
> people most often suggest we replace it with). It suits our needs fine,
> but it is perhaps a bit overkill considering we are only generating
> config files.
> 
> I think the answer here is moving to something like Etcd. Perhaps

Not Etcd, I think; see my comment above. But you're absolutely right, Dan.

> skipping over Ansible entirely as a config management tool (it is
> arguably less capable than Puppet in this category anyway). Or we could
> use Ansible for "legacy" services only, switch to Etcd for a majority
> of the OpenStack services, and drop Puppet entirely (my favorite
> option). Consolidating our technology stack would be wise.
> 
> We've already put some work and analysis into the Etcd effort. Just
> need to push on it some more. Looking at the previous Kubernetes
> prototypes for TripleO would be the place to start.
> 
> Config management migration is going to be tedious. It's technical debt
> that needs to be handled at some point anyway. I think it is a general
> TripleO improvement that could benefit all clouds, not just Edge.
> 
> Dan
> 
>>
>> At the same time, if some folks want to work on smaller optimizations
>> (such as container image size), with an approach that can be agreed
>> upon, then they should do so. We just ought to be careful about how
>> we justify those changes so that we can carefully weigh the effort vs
>> the payoff. In this specific case, I don't personally see this
>> proposal helping us with Edge use cases in a meaningful way given the
>> scope of the changes. That's not to say there aren't other use cases
>> that could justify it though (such as the security points brought up
>> earlier).
>>
> 


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


