[openstack-dev] [tripleo] ironic automated cleaning by default?

Dmitry Tantsur dtantsur at redhat.com
Thu Apr 26 15:24:42 UTC 2018


On 04/26/2018 05:12 PM, James Slagle wrote:
> On Thu, Apr 26, 2018 at 10:24 AM, Dmitry Tantsur <dtantsur at redhat.com> wrote:
>> Answering both James and Ben inline.
>>
>>
>> On 04/25/2018 05:47 PM, Ben Nemec wrote:
>>>
>>>
>>>
>>> On 04/25/2018 10:28 AM, James Slagle wrote:
>>>>
>>>> On Wed, Apr 25, 2018 at 10:55 AM, Dmitry Tantsur <dtantsur at redhat.com>
>>>> wrote:
>>>>>
>>>>> On 04/25/2018 04:26 PM, James Slagle wrote:
>>>>> Well, it's not clear what is "safe" here: protect people who explicitly
>>>>> delete their stacks or protect people who don't realize that a previous
>>>>> deployment may screw up their new one in a subtle way.
>>>>
>>>>
>>>> The latter you can recover from, the former you can't if automated
>>>> cleaning is true.
>>
>>
>> Nor can we recover from 'rm -rf / --no-preserve-root', but it's not a reason
>> to disable the 'rm' command :)
> 
> This is a really disingenuous comparison. If you really want to
> compare these things with what you're proposing, then it would be to
> make --no-preserve-root the default with rm. Which it is not.

If we really go down this path, what TripleO does right now is remove the 'rm'
command by default and say "well, you can install it back, if you realize you
cannot work without it" :)

> 
>>
>>>>
>>>> It's not just about people who explicitly delete their stacks (whether
>>>> intentional or not). There could be user error (non-explicit) or
>>>> side-effects triggered by Heat that could cause nodes to get deleted.
>>
>>
>> If we have problems with Heat, we should fix Heat or stop using it. What
>> you're saying is essentially "we prevent ironic from doing the right thing
>> because we're using a tool that can invoke 'rm -rf /' at a wrong moment."
> 
> Agreed on the Heat point, and once/if we're there, I'd probably not
> object to making automated clean the default.
> 
> I disagree on how you characterized what I'm saying. I'm not proposing
> to prevent Ironic from doing the right thing. If people want to use
> automated cleaning, they can. Nothing will prevent that. It just
> shouldn't be the default.

It's not about "want to use". It's about "we don't guarantee the correct
behavior in the presence of previous deployments on non-root disks" and "if you
use Ceph, you must use cleaning".

> 
>>
>>>>
>>>> You couldn't recover from those scenarios if automated cleaning were
>>>> true. Whereas you could always fix a deployment error by opting in to
>>>> do an automated clean. Does Ironic keep track of whether a node has been
>>>> previously cleaned? Could we add a validation to check whether any
>>>> nodes might be used in the deployment that were not previously
>>>> cleaned?
>>
>>
>> It may be possible to figure out if a node was ever cleaned. But
>> then we'll force operators to invoke cleaning manually, right? It will work,
>> but that's another step on the default workflow. Are you okay with it?
> 
> I would be ok with it. But I don't even characterize it as a
> completely necessary step on the default workflow. It fixes some
> issues as you've pointed out, but also comes with a cost. What we're
> discussing is whether it's the default or not. If it is not true by
> default, then we wouldn't make it a required step in the default
> workflow to make sure it's done. It'd be documented as a choice.
> 

Sure, but how do people know if they want it? Okay, if they use Ceph, they have 
to. Then.. mm.. "if you have multiple disks and you're not sure what's on them, 
please clean"? It may work, I wonder how many people will care to follow it though.
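
As a rough sketch of the validation James asks about above: assuming we had some per-node record of whether cleaning ever completed, the check itself could be very small. The `ever_cleaned` flag and the plain-dict node records below are hypothetical and used only for illustration; this is not an existing ironic node field or API.

```python
from typing import Dict, List


def find_uncleaned_nodes(nodes: List[Dict]) -> List[str]:
    """Return the names of nodes that have no record of being cleaned.

    Each node is a plain dict. 'ever_cleaned' is a hypothetical flag used
    only for illustration; it is not a real ironic node attribute.
    """
    # A missing flag is treated the same as False: we cannot prove the node
    # was cleaned, so the validation should flag it.
    return [n["name"] for n in nodes if not n.get("ever_cleaned", False)]


nodes = [
    {"name": "node-0", "ever_cleaned": True},
    {"name": "node-1", "ever_cleaned": False},
    {"name": "node-2"},  # no record at all
]

uncleaned = find_uncleaned_nodes(nodes)
if uncleaned:
    print("WARNING: nodes never cleaned: " + ", ".join(uncleaned))
```

A check like this would only warn; the operator would still have to decide whether to clean the flagged nodes before deploying, which is the extra step discussed above.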



