[TripleO][Validation] Validation CLI simplification
Hi TripleO Folks,

I'm raising this topic on the ML because we have some divergence about how the Validations should be used with and without TripleO, and I wanted a larger audience, in particular the PTL's and cores' thoughts on it.

The current situation is: we have an openstack tripleo validator set of sub commands to handle Validation (run, list ...). The Validation CLI takes several parameters as an entry point, in particular the stack/plan, OpenStack authentication and a static inventory file.

By asking for the stack/plan name, the CLI tries to verify that the plan or the stack is valid and that the Overcloud exists somewhere in the cloud, before passing that to the tripleo-ansible-inventory script and trying to generate a static inventory file according to what --plan or stack has been passed.

The code is mainly here: [1].

This behavior implies several constraints:
* The Validation CLI needs OpenStack authentication in order to do those checks.
* It introduces some complexity in the Validation code: querying Heat to get the plan name to be sure the name provided is correct, getting the status of the stack... In the case of a Standalone deployment, it adds even more complexity.
* This code is only valid for "standard" deployments and usage, meaning it doesn't work for Standalone or for some Upgrade and FFU stages, and it needs to be bypassed for pre-undercloud deployment.
* We hit several blockers around this part of the code.

My proposal is the following:

Since we are thinking about the future of Validation and we want something more robust, simpler, more usable and more efficient, I propose to get rid of the plan/stack and authentication functionalities in the Validation code, and only ask for a valid inventory provided by the user. I also propose to create a new entry point in the TripleO CLI to generate a static inventory, such as:

    openstack tripleo inventory generate --output-file my-inv.yaml

and then:

    openstack tripleo validator run --validation my-validation --inventory my-inv.yaml

By doing that, I think we gain a lot in simplification, it's more robust, and Validation will only do what it aims for: wrap Ansible execution to provide better logging information and history.

The main concern about this approach is that the user will have to provide a valid inventory to the Validation CLI. I understand the appeal of something fully autonomous, where you just kick off *one* command and the Validation is *magically* executed against your cloud, but I think the less complex the Validation code is, the more robust, stable and usable it will be.

Dedicating a specific entry point to the inventory, which is a key part of post-deployment actions, seems clearer and more robust as well. This part of the code could be shared and reused for other purposes instead of calling the inventory script stored in tripleo-validations. tripleoclient could then use the tripleo-common inventory library directly, instead of going client -> tripleo-validations/scripts -> tripleo-common inventory library.

I know it changes the usage a little (adding one command to the execution process to get a valid inventory), but it's going in a less buggy and less racy direction. And the inventory should only need to be generated once, or at least only after a major cloud change.

So, I'd be glad to get your thoughts and overall views on this topic.

Thanks,
Mathieu

[1] https://github.com/openstack/tripleo-validations/blob/master/tripleo_validat...
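For illustration only, the two-step workflow proposed in this message could be scripted roughly as below. This is a sketch under assumptions: `openstack tripleo inventory generate` is the entry point proposed in this thread, not an existing command, and the helper names (`inventory_generate_cmd`, `validator_run_cmd`, `run_workflow`) are hypothetical. The function defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from typing import List


def inventory_generate_cmd(output_file: str) -> List[str]:
    """Build the *proposed* inventory command (assumption: not an existing entry point)."""
    return ["openstack", "tripleo", "inventory", "generate",
            "--output-file", output_file]


def validator_run_cmd(validation: str, inventory: str) -> List[str]:
    """Build the validator command that only takes a validation name and an inventory."""
    return ["openstack", "tripleo", "validator", "run",
            "--validation", validation, "--inventory", inventory]


def run_workflow(validation: str, inventory: str,
                 regenerate: bool = True, dry_run: bool = True) -> List[List[str]]:
    """Optionally regenerate the inventory, then run the validation against it."""
    steps = []
    if regenerate:
        # Only needed once, or after a major cloud change.
        steps.append(inventory_generate_cmd(inventory))
    steps.append(validator_run_cmd(validation, inventory))
    for cmd in steps:
        if dry_run:
            print(shlex.join(cmd))  # show what would be executed
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return steps


if __name__ == "__main__":
    run_workflow("my-validation", "my-inv.yaml")
```

Passing `regenerate=False` reuses an inventory that was generated earlier, which matches the idea that the inventory is produced once and then shared by later runs.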
On Wed, Mar 3, 2021 at 12:51 PM Mathieu Bultel <mbultel@redhat.com> wrote:
By asking the stack/plan name, the CLI is trying to verify and understand if the plan or the stack is valid, if the Overcloud exists somewhere in the cloud before passing that to the tripleo-ansible-inventory script and trying to generate a static inventory file in regard to what --plan or stack has been passed.
Sorry if silly question but, can't we just make 'validate the stack status' as one of the validations? In fact you already have something like that there https://github.com/openstack/tripleo-validations/blob/1a9f1758d160cc2e543a1c... . Then only this validation will require the stack name passed in instead of on every validation run. BTW as an aside we should probably remove 'plan' from that code altogether given the recent 'remove swift and overcloud plan' work from ramishra/cloudnull and co @ https://review.opendev.org/q/topic:%22env_merging%22+(status:open%20OR%20sta...)
The proposal sounds sane to me, but just to be clear: by "authentication functionalities", are you referring specifically to the '--ssh-user' argument (https://github.com/openstack/tripleo-validations/blob/1a9f1758d160cc2e543a1c...)? I.e. we will already have that in the generated static inventory, so there is no need to have it on the CLI?

If the only cost is that we have an extra step for generating the inventory, then IMO it is worth doing. I would however be interested to hear from those who object to the proposal about why it is a bad idea ;) since you said there has been a divergence of opinions over the design.

regards, marios
Thank you Marios for the response. On Wed, Mar 3, 2021 at 1:45 PM Marios Andreou <marios@redhat.com> wrote:
Sorry if silly question but, can't we just make 'validate the stack status' as one of the validations? In fact you already have something like that there https://github.com/openstack/tripleo-validations/blob/1a9f1758d160cc2e543a1c... . Then only this validation will require the stack name passed in instead of on every validation run.
BTW as an aside we should probably remove 'plan' from that code altogether given the recent 'remove swift and overcloud plan' work from ramishra/cloudnull and co @ https://review.opendev.org/q/topic:%22env_merging%22+(status:open%20OR%20sta...)
Not exactly. The main goal of checking that the --stack/plan value is correct, and that the stack provided exists and is correct, is to get a valid Ansible inventory to execute the Validations. This means that all the extra checks in the code in [1] are made for generating the inventory, which IMHO should not belong to the Validation CLI but to something else; Validation should assume that the --inventory file is correct (for the reasons mentioned earlier).
The proposal sounds sane to me, but just to be clear: by "authentication functionalities", are you referring specifically to the '--ssh-user' argument (https://github.com/openstack/tripleo-validations/blob/1a9f1758d160cc2e543a1c...)? I.e. we will already have that in the generated static inventory, so there is no need to have it on the CLI?
The authentication in the current CLI is only needed to get the stack output in order to generate the inventory. If the user provides their own inventory, there is no authentication and no Heat stack check, and Validation can run anywhere, at any stage of a TripleO deployment (early, without any TripleO bits, or in the middle of a LEAP Upgrade, for example).
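As a rough sketch of what "only ask for a valid inventory" could reduce to on the CLI side, the check below is purely local: it touches neither Keystone nor Heat. The names `check_inventory` and `InventoryError` are hypothetical and are not part of tripleo-validations or tripleoclient; whether the file content is a usable Ansible inventory is deliberately left to Ansible itself.

```python
from pathlib import Path


class InventoryError(Exception):
    """Raised when the user-supplied inventory file cannot be used."""


def check_inventory(path: str) -> Path:
    """Return the inventory path if it is an existing, non-empty file.

    No OpenStack authentication and no stack lookup are involved, so the
    same check works before the undercloud exists or in the middle of an
    upgrade.
    """
    inv = Path(path).expanduser()
    if not inv.is_file():
        raise InventoryError(f"inventory file not found: {inv}")
    if inv.stat().st_size == 0:
        raise InventoryError(f"inventory file is empty: {inv}")
    return inv
```

A caller would typically run this once before handing the path to the Ansible run and report the `InventoryError` message to the user as-is.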
Hello there, On 3/3/21 2:13 PM, Mathieu Bultel wrote:
Sorry if silly question but, can't we just make 'validate the stack status' as one of the validations? In fact you already have something like that there https://github.com/openstack/tripleo-validations/blob/1a9f1758d160cc2e543a1cf7cd4507dd3355945a/roles/stack_health/tasks/main.yml#L2 . Then only this validation will require the stack name passed in instead of on every validation run.
BTW as an aside we should probably remove 'plan' from that code altogether given the recent 'remove swift and overcloud plan' work from ramishra/cloudnull and co @ https://review.opendev.org/q/topic:%22env_merging%22+(status:open%20OR%20status:merged)
Not exactly, the main goal of checking if the --stack/plan value is correct and if the stack provided exists and is right is for getting a valid Ansible inventory to execute the Validations. Meaning that all the extra checks in the code in [1] is made for generating the inventory, which imho should not belong to the Validation CLI but to something else, and Validation should consider that the --inventory file is correct (because of the reasons mentioned earlier).
well... It actually isn't part of the validation CLI, it's part of the plugin for tripleoclient wrapping the actual validation library... Sooo... That's a usage we'd intend to get, isn't it? All the "bad things" are on the tripleo side, NOT the VF cli/code/lib/content. Cheers, C.
The proposal sounds sane to me, but just to be clear: by "authentication functionalities", are you referring specifically to the '--ssh-user' argument (https://github.com/openstack/tripleo-validations/blob/1a9f1758d160cc2e543a1cf7cd4507dd3355945a/tripleo_validations/tripleo_validator.py#L243)? I.e. we will already have that in the generated static inventory, so there is no need to have it on the CLI?
[1] https://github.com/openstack/tripleo-validations/blob/master/tripleo_validations/tripleo_validator.py#L338-L382
-- Cédric Jeanneret (He/Him/His) Sr. Software Engineer - OpenStack Platform Deployment Framework TC Red Hat EMEA https://www.redhat.com/