[magnum] Dropping default Cluster API driver
Hi everyone,

Based on a recent post I saw on the Twitter-sphere[1], it seems that the Magnum team has decided to move forward with setting a default driver for the Cluster API. I don't agree with this. Unfortunately, we don't have many recordings of our PTG conversations, and since the notes from the PTG are pretty light[2], I don't see any details of that decision. I don't see why we should be merging a native Cluster API driver rather than letting operators and deployment tools decide which one they want to support.

Personally, I think the Helm driver should be built in a separate repository and then installed by the user as a Python package once they understand what the backends and their options are, as opposed to shipping one out of the box with far fewer features. We already have successful users running our implementation in production who have posted blogs and shared their experience; we'd be doing them a disservice by now announcing a 'native' built-in driver.

Can the decision be changed so that there is no built-in driver, and you must decide to install one yourself? I would also be really happy to hear feedback from users of the driver here.

Thanks,
Mohammed

[1]: https://www.stackhpc.com/magnum-clusterapi.html
[2]: https://etherpad.opendev.org/p/r.5b94a5a9dbf4e3bf8202b5e57cb93770
Hey,

It's totally not mine to decide here, but I just want to repeat my personal, subjective opinion on the topic.

From what I recall from the last PTG, which I was part of, opinions leaned towards having some default driver. Otherwise, the service is not functional on its own and must rely heavily on individual organisations to keep supporting their drivers, without any way of influencing them. In turn, those organisations would be able to dictate how the project is developed and which changes they are or are not willing to make.

I'd say this suggestion is similar to dropping the native driver from Octavia or Cinder and just assuming that individual organisations will keep supporting their drivers. As it stands, those services are doing just fine with some default drivers and some third-party ones. Also, adding third-party CI to the service to ensure that your driver integrates nicely is not a huge issue.

With that said, OpenStack-Ansible is right now merging support and CI testing for the Vexxhost driver, and I personally will very likely be using it in future Magnum deployments. But I truly believe that a service without any driver is doomed to failure sooner rather than later. If we look at literally anything - an automation tool, a programming language, a service - they all have some core libraries and can be used, in a way, "out of the box". The suggestion here is to make the service and all of its users put their faith in an organisation providing support for the driver, which may be quite a tough sell for some highly regulated environments.

If Magnum goes down the road of not having any "out-of-the-box" implementation, I actually wonder whether it should stay under the OpenStack namespace rather than X, for instance, since it would not be self-contained and would depend heavily on a third party, which might be licensed in a completely different way, or change its licensing over time, which would also put users in a tough spot.

But I dunno; as I said, this is my very subjective view, and I really can be wrong about this.
Also, I'm not trying to push any party towards any conclusion, so treat this as one operator voice out of many.
Hi Dmitriy,

I'm really glad you're taking the time to respond - any feedback is greatly appreciated.

With regards to the service being "functional" out of the box: the way I see it, this is a decision that needs to be taken by the operator - what backend will I be using? I see this the same way as Cinder having no default out of the box: you need a storage backend. We (I think?) mostly test with LVM in the CI (plus some Ceph CI and whatnot). Deployment tools can be a bit "nuanced" in that they may support one backend over another out of the box, but for something like Cinder, you need to decide what you're using.

With regards to those "drivers" not being usable, or users not having "influence": I can't speak for any of the other drivers, but on our side, I'm hoping we have enough "good karma" with the community for people to know we welcome working together with folks. I can't speak on behalf of others, but we've had contributions from countless organizations that have added features such as Flatcar support, HTTP proxy support, and many other changes that we've merged with no issues.

I don't entirely agree on the Cinder aspect, since Cinder third-party CI mostly exists because OpenDev doesn't want to own a rack with 40 different types of storage systems. Instead, the 'built-in' CI covers the open source backends (say, Ceph or LVM), and third-party CI is needed by those requiring physical access to the machines. I think a CI that tests drivers would be productive if there's a "no built-in" approach.

I do understand the concern that this feels like a single-organization-driven project - however, we've had contributors from all over (https://github.com/vexxhost/magnum-cluster-api/graphs/contributors), and the way I see it is: how is this different from any other OpenStack dependency, if it were an "add-on"?
As a silly example, `confluent-kafka` is a dependency of oslo.messaging if you're using Kafka as a backend. There are probably many other examples of libraries that exist as OpenStack dependencies that might not even be single-organization driven, but single-"individual" driven...

I can't speak for the rest, but we've been asked to license the code as Apache 2.0 - which it has been, just like any other OpenStack dependency. Magnum is really good at being a standard API that supports many backends, and the user should choose the backend they want - in the same way that Cinder allows multiple backends and lets the user pick what they want.

Once again, I really appreciate your thoughts on this and getting the conversation going, Dmitriy.

Thanks,
Mohammed
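The Cinder analogy above can be made concrete: out of the box, cinder-volume has no usable backend, and the operator must opt into one explicitly in cinder.conf. A minimal LVM example (the backend section name `lvm` and the `cinder-volumes` volume group are conventional choices, not requirements):

```ini
[DEFAULT]
# No backend is enabled by default; the operator must pick one.
enabled_backends = lvm

[lvm]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_group = cinder-volumes
target_protocol = iscsi
target_helper = lioadm
```

The "no built-in Magnum driver" proposal would put operators in the same position: the service runs, but does nothing until a backend is deliberately chosen and configured.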
To draw the parallel with Cinder: I guess the big difference here is "not having an in-tree driver", as is being suggested for Magnum, versus not having a *default* driver, as in Cinder, which still has plenty of in-tree drivers. In the first case you make the operator trust a provider external to OpenInfra (and put an extra organisation through an assessment and approval process), while in the second it's still "under governance", so to say.

But again, I was not saying that Vexxhost is going to change its licensing or back down on its efforts. It just kind of puts operators in a tough spot - whom to trust. And that is especially tough in light of recent decisions to back away from open source by orgs like HashiCorp, or Canonical with LXD.

But yes, again, I do see flaws in my arguments, and I totally see and understand your point. It's not unreasonable, for sure.
Hello,

I am using the Magnum Cluster API driver in production. The driver works very smoothly and is flexible to customise. The code is very clean and easy for people who want to integrate with Magnum. I hope we will keep this driver, as @Mohammed Naser <mnaser@vexxhost.com> suggests.

Nguyen Huu Khoi
On Fri, Feb 16, 2024 at 1:11 PM Dmitriy Rabotyagov <noonedeadpunk@gmail.com> wrote:
> If making parallel with cinder, I guess huge difference here is "not having in-tree driver" as being suggested for Magnum VS not having a default driver as Cinder have, but plenty of in-tree drivers.
> As in first you make operator to trust an external to openinfra provider (and have to pass an extra org through assessment and approval process), while in second it's still "under governance" so to say.
> But again, I was not saying that Vexxhost is going to change licensing or back down on their efforts. But it kinda puts operators in a tough spot - whom to trust.
Trust is also a two-way street, and operators are largely going to focus on what makes the most sense given the constraints of time and resources. There is also nothing saying things cannot change, adapt, move, and evolve as time goes on. The general idea on the software side of the universe is not to break operators or increase their costs.
> And that is especially tough in light of recent decisions to back down on opensource of orgs like Hashicorp or canonical with LXD.
But we should not let recent betrayals of the open source ethos drive our thoughts and discussions. If we question everything, then we achieve nothing. Human infrastructure operators are more focused on moving the needle with small teams than on turning a massive profit by re-re-licensing a piece of code they wrote to solve the problem before them. Then again, if we are not receptive to operator efforts and do not practice extending trust, we create a self-fulfilling prophecy.
> But yes, again, I do see flaws in my arguments and I do totally see and understand your point. And it's not unreasonable for sure.
On Fri, Feb 16, 2024, 21:45 Mohammed Naser <mnaser@vexxhost.com> wrote:
Hi Dmitriy,
I am really glad you're taking the time to respond – any feedback is greatly appreciated.
With regards to the service being "functional" out of the box. The way that I see this, is that this is a decision that needs to be taken by the operator: what is the backend that I will be using? I see this in the same way that "Cinder" out of the box has no default: you need a storage backend. While we (I think?) mostly test with LVM in the CI (and some Ceph CI and what not). Deployment tools can be a bit "nuanced" in that they may support one backend over another out of the box, but for something like Cinder, you *need* to decide what you're using.
With regards to those "drivers" not being usable or not being able to have "influence". I can't speak on any of the other ones, but on our side, I'm hoping we have enough "good karma" with the community knowing that we're welcome at working together with folks. I can't speak on behalf of others but we've had contributions from countless organizations that have added features such as Flatcar, support for HTTP proxies, and many other changes that we've merged with no issues from those organizations.
I don't entirely agree on the Cinder aspect, since Cinder third party CI mostly exists because OpenDev doesn't want to own a rack with 40 different types of storage systems.. but instead the 'built-in' CI works for the open source software backends (say, Ceph or LVM) and then third party is needed for those needing physical access to the machines. I think a CI that tests drivers would be productive if there's a "no built-in approach".
In this case, I do understand the concern that it seems to feel like a *single organization driven project* -- however, we've had contributors from all over – https://github.com/vexxhost/magnum-cluster-api/graphs/contributors – and the way I would see this is .. how is this different from any other OpenStack dependency if it was an "add-on". Silly example, `confluent-kafka` is a dependency of oslo.messaging if you're using Kafka as a backend. There's probably many other examples of other libraries that exist as OpenStack dependencies that might not even be single organization driven, but single "individual" driven...
I can't speak on the rest, but we've been asked to license the code as Apache 2.0 – which it has been .. just like any other OpenStack dependency. Magnum is really good at being a standard API that implements many backends, and the user should choose what backend they want. In the same way that Cinder is same way allowing multiple backends, and letting the user pick what they want.
Once again though, I really appreciate your thoughts on this and getting the conversation going, Dmitriy.
Thanks, Mohammed
Hello Mohammed -
I am surprised at your reaction to this. Broadly speaking, this driver is a subject that has been going through the open design process since the Yoga PTG, and I do agree with Dmitriy that the consensus in the last PTG was that there is benefit to an in-tree driver. Development does appear to be following a course.
The October 2023 PTG etherpad also begins with notes that the CAPI Helm driver cannot be merged until Magnum is refactored to accommodate it alongside the Vexxhost driver without conflict. There has been a delay of at least two cycles to support that.
In short, I don't think there are negative user impacts on Vexxhost's driver from this driver's introduction. The Vexxhost team has benefited from a rapid pace of development by working outside of OpenStack's governance and the four opens. Is it fair to obstruct the delivery of a driver that has attempted to follow those guidelines?
Best wishes, Stig
Hi Stig,
I'm surprised because there was no consensus on that in the vPTG; we went back and forth for quite a long time in that conversation.
Also, I'm not sure what the four opens have to do with this. The open-core comments have been thrown around a few times, and our work is all open; perhaps the fact that it doesn't live under the governance is the big issue here?
I've stated many times that we'd gladly upstream it if we saw any contributions at all, but we haven't, so we kept chugging along at our own pace.
Also, I'm not saying not to merge it. I'm saying that Magnum should, out of the box, allow the user to pick what driver they want. The deployment tools will decide what to support.
Thanks, Mohammed
Hello Mohammed -
Thank you for clarifying your point about merging the driver. I agree the ability to select drivers is important, not least because any new driver must coexist with the Heat driver for continuity on existing deployments.
On your point about the four opens, I think this is key. The governance clearly matters when we consider Elastic and HashiCorp (as mentioned previously on this thread). You could well say that Vexxhost is different, but bear in mind that HashiCorp and Elastic were also champions of open source licensing - until they weren't.
From my point of view, it is unfortunate that Vexxhost did not contribute to the upstream development of the Cluster API Helm driver, which predated Vexxhost's internal development. I think I understand your rationale for that choice - developing in the four opens can be fraught for something on which a business depends. However, it is a strategy that brings its own risks, one more general than Magnum, and it sometimes results in outcomes of this nature. In a previous role I have had direct experience of that.
Best wishes, Stig
Hi Stig,
Actually, this driver was built from the start to depend on another project's external Helm charts. I reached out in a review on August 22nd, 2022, on an initial small stub, asking if there was anything more to it: https://review.opendev.org/c/openstack/magnum/+/851076/5
The answer was no, that's as far as they had got. Hearing or seeing no updates there, we started our effort on October 20th, 2022, a whole two months after that patch. The next revision of that patchset came in on November 14th, 2022. https://github.com/vexxhost/magnum-cluster-api/commit/8444be3a6c1b62a99a8775...
I don't think we were not participating. Then I got removed from the CC on the patch in April 2023, and after a request to re-license our driver to Apache 2.0, code which seems practically the same started to show up: https://review.opendev.org/c/openstack/magnum/+/851076/19/magnum/drivers/clu...
I think it is massively unfair to first say we didn't participate; we actually tried to lead the whole thing. I also think it is not great for open source to assume that everyone will be a HashiCorp or an Elastic. If we're going to operate under that general assumption, a lot of this stuff is going to fall apart. If "bad intentions" are assumed out of the box, you seem to be saying that the only way to do open source is under "open governance".
Once again, I have expressed MANY times that we will gladly contribute the driver if we see folks contributing to it. We have yet to see that, so I don't see why we should slow everything down to satisfy a group that doesn't contribute anything to the driver.
At this point, the contributed driver is "behind" the "actual" development happening outside of open governance: https://github.com/stackhpc/magnum-capi-helm
We can see some folks from Catalyst Cloud, who have decided to work with you to push the driver, contributing to that repository.
There are also dependencies on image builds which point to your servers. There are dependencies on Helm charts living on your GitHub Pages, with no code uploaded since the repository was created over six months ago: https://opendev.org/openstack/magnum-capi-helm-charts
So to me, it feels like the effort is happening somewhere else, and it is only being pushed in now. If you truly believed in what you were saying, this could easily have been an OpenDev repo, an extra project under the Magnum governance, and all the Helm chart development would have lived there. Instead, the work is ALL done elsewhere, and it is now being code-dumped into Magnum.
I hope you see my concerns with all this. This is why merging a driver that has mostly been developed out of tree, with artifacts living out of tree (arguably, the Helm charts are the biggest, most critical dependency, and they are not being developed in governance), is concerning. Does this mean it's okay to set up a job that code-dumps our code straight into Magnum while we continue to work on it elsewhere, and that's fine?
Mohammed
You could well say that Vexxhost is different, but bear in mind that Hashicorp and Elastic were also champions of open source licensing - until they weren't. From my point of view, it is unfortunate that Vexxhost did not contribute to the upstream development of the Cluster API Helm driver, which predated Vexxhost's internal development. I think I understand your rationale for that choice - developing in the four opens can be fraught for something on which a business depends. However, it is a strategy that brings its own risks; this is more general than Magnum and sometimes results in outcomes of this nature. In a previous role I have also had direct experience of that. Best wishes, Stig On 17 Feb 2024, at 13:21, Mohammed Naser <mnaser@vexxhost.com> wrote: Hi Stig: I'm surprised, because there was no consensus on that in the vPTG; we went back and forth for quite a long time in that conversation. Also, I'm not sure what the four opens have to do with this. The open core comments have been thrown around a few times and our work is all open; perhaps it's the fact that it doesn't live under the governance which is the big issue here? I've stated many times that we'd gladly upstream it if we saw any contributions at all, but we haven't, so we kept chugging along at our own pace. Also, I'm not saying to not merge it. I'm saying that Magnum should out of the box allow the user to pick what driver they want. The deployment tools will decide what to support. Thanks Mohammed ________________________________ From: Stig Telfer <stig.openstack@telfer.org> Sent: Saturday, February 17, 2024 4:43:47 AM To: Mohammed Naser <mnaser@vexxhost.com> Cc: Dmitriy Rabotyagov <noonedeadpunk@gmail.com>; OpenStack Discuss <openstack-discuss@lists.openstack.org> Subject: Re: [magnum] Dropping default Cluster API driver 
Hello Mohammed - I am surprised at your reaction to this. Broadly speaking, this driver has been undergoing the open design process since the Yoga PTG, and I do agree with Dmitriy that the consensus in the last PTG was that there is benefit to an in-tree driver. Development does appear to be following a course. The October 2023 PTG etherpad also begins with notes that the CAPI Helm driver cannot be merged until Magnum is refactored to accommodate it alongside the Vexxhost driver without conflict. There has been a delay of at least two cycles to support that. In short, I don't think there are negative user impacts on Vexxhost's driver from this driver's introduction. The Vexxhost team has benefited from a rapid pace of development by working outside of OpenStack's governance and the four opens. Is it fair to obstruct the delivery of a driver that has attempted to follow those guidelines? Best wishes, Stig On 16 Feb 2024, at 20:45, Mohammed Naser <mnaser@vexxhost.com> wrote: Hi Dmitriy, I am really glad you're taking the time to respond – any feedback is greatly appreciated. With regards to the service being "functional" out of the box: the way I see it, this is a decision that needs to be taken by the operator: what is the backend that I will be using? I see this in the same way that Cinder out of the box has no default: you need a storage backend. (I think we mostly test with LVM in the CI, plus some Ceph CI and whatnot.) Deployment tools can be a bit "nuanced" in that they may support one backend over another out of the box, but for something like Cinder, you need to decide what you're using. With regards to those "drivers" not being usable or operators not being able to have "influence": I can't speak for any of the other ones, but on our side, I'm hoping we have enough "good karma" with the community, knowing that we're open to working together with folks. 
I can't speak on behalf of others, but we've had contributions from countless organizations that have added features such as Flatcar support, support for HTTP proxies, and many other changes that we've merged with no issues from those organizations. I don't entirely agree on the Cinder aspect, since Cinder third-party CI mostly exists because OpenDev doesn't want to own a rack with 40 different types of storage systems; instead, the 'built-in' CI works for the open source software backends (say, Ceph or LVM), and third-party CI is needed for those requiring physical access to the machines. I think a CI that tests drivers would be productive if there's a 'no built-in' approach. In this case, I do understand the concern that it can feel like a single-organization-driven project -- however, we've had contributors from all over – https://github.com/vexxhost/magnum-cluster-api/graphs/contributors – and the way I see it is: how is this different from any other OpenStack dependency if it were an "add-on"? Silly example: `confluent-kafka` is a dependency of oslo.messaging if you're using Kafka as a backend. There are probably many other examples of libraries that exist as OpenStack dependencies that might not even be single-organization driven, but single-"individual" driven... I can't speak to the rest, but we've been asked to license the code as Apache 2.0 – which it has been, just like any other OpenStack dependency. Magnum is really good at being a standard API that implements many backends, and the user should choose what backend they want – in the same way that Cinder allows multiple backends and lets the user pick what they want. Once again though, I really appreciate your thoughts on this and getting the conversation going, Dmitriy. 
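To make the Cinder analogy above concrete, here is a minimal sketch of what "the operator picks the backend" looks like there. The backend section names and pool name are illustrative, not taken from this thread; only `enabled_backends` and the per-backend driver options are standard Cinder configuration:

```ini
# cinder.conf -- illustrative sketch: Cinder ships no default backend,
# so the operator must explicitly opt in to one or more.
[DEFAULT]
enabled_backends = lvm-1,ceph-1

[lvm-1]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_group = cinder-volumes
volume_backend_name = lvm-1

[ceph-1]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
volume_backend_name = ceph-1
```

The argument in the email is that Magnum could work the same way: the service defines the API, and the deployment chooses which driver backs it.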
Thanks, Mohammed
Hi Mohammed, all, I would like to contribute a few points from the POV of the Magnum PTL and the Magnum Core Team. I think it is important to present my point of view to see how we came to the decisions we did. First of all, let me say we are not merging a 'default' driver for Cluster API. In fact, all the work so far has been in the direction of supporting multiple drivers. A brief sequence of events (also captured in StackHPC's blog post):
* StackHPC proposed a spec[1] at the beginning of 2022.
* Code for a CAPI driver was also proposed around the same time[2].
* Work was paused for a while until 2023. People were busy, Magnum Core Team members moved on, and Magnum was leaderless in Zed. I came onboard as PTL in Antelope, midway through this effort. I was unaware of the history and dynamics of this effort, which is important later on.
* StackHPC picked up the work on their driver in 2023. It was marked as ready for review in Jun 2023.
* VEXXHOST worked on their own driver[3] outside of Magnum.
* StackHPC and VEXXHOST announced a joint talk about Cluster API[4] in Jun 2023 at the Vancouver summit. I had (mistakenly) believed that the driver StackHPC was working on was fine by both VEXXHOST and StackHPC, since they were giving a talk together.
* Magnum Core was ready to merge the StackHPC driver for Bobcat, when jrosser[5] asked out of the blue if this would conflict with the VEXXHOST implementation. Realising that was the case, we decided to hold off merging in Bobcat.
* This was not an easy decision, given that StackHPC had been putting effort into getting the driver upstream, and many users had been asking for a Cluster API driver. However, I felt that the most important thing is DO NOT BREAK PRODUCTION implementations, which led us to hold off.
* Subsequently, in the vPTG, we discussed the state of the two drivers.
* The Magnum Core Team decided that, instead of choosing one driver or another, Magnum should first support multiple drivers. 
* Improving the Magnum driver discovery mechanism became a Caracal goal. There is a spec[6] and an implementation[7].
Which brings us up to the current date. --- My view as the PTL is as follows:
* I work within OpenStack processes. The community proposes changes; the Core Team reviews and merges (or drops).
* The StackHPC driver is not the 'default'. It is a driver, albeit the only one, contributed into Magnum code.
* We are open to merging more CAPI drivers into Magnum code. (Other users have raised different CAPI ideas; we have encouraged them to send patches.)
* Magnum is open to supporting out-of-tree drivers too.
* Magnum has put effort into ensuring multiple drivers can co-exist. It is important to us that VEXXHOST (and others) can continue using their driver.
* Deployment tools like OpenStack-Ansible can continue to package any driver as an add-on, and use that instead as their default.
(there are more thoughts... but these should suffice for now...) --- Mohammed, I will respond to your comments separately, but I may await the community's input first, since the initial email was directed at the community. Regards, Jake
[1] https://review.opendev.org/c/openstack/magnum-specs/+/824488
[2] https://review.opendev.org/c/openstack/magnum/+/815521
[3] https://github.com/vexxhost/magnum-cluster-api
[4] https://www.youtube.com/watch?v=GjSsP4-p-SQ
[5] https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-08-30-09.01.lo...
[6] https://review.opendev.org/c/openstack/magnum-specs/+/900410
[7] https://review.opendev.org/c/openstack/magnum/+/907297
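To illustrate the multi-driver direction described above, here is a toy sketch (invented names, not Magnum's actual code) of a registry in which no driver is privileged as "the default" and the cluster template's explicit choice selects one:

```python
# Toy sketch of multi-driver selection: the service defines a registry,
# drivers (in-tree or out-of-tree) register themselves, and the template
# names the driver it wants -- no hard-wired default.
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterTemplate:
    coe: str     # e.g. "kubernetes"
    driver: str  # explicit driver choice, e.g. "capi-helm"


class DriverRegistry:
    def __init__(self):
        self._drivers = {}

    def register(self, name, factory):
        # An out-of-tree package could call this at load time
        # (in practice, via Python entry points).
        self._drivers[name] = factory

    def lookup(self, template):
        try:
            return self._drivers[template.driver]()
        except KeyError:
            raise LookupError(f"no driver registered for {template.driver!r}")


registry = DriverRegistry()
registry.register("capi-helm", lambda: "StackHPC CAPI Helm driver")
registry.register("cluster-api", lambda: "VEXXHOST cluster-api driver")

print(registry.lookup(ClusterTemplate(coe="kubernetes", driver="cluster-api")))
# → VEXXHOST cluster-api driver
```

In real deployments the registration step is handled by package metadata rather than explicit calls, which is what lets deployment tools install whichever driver package they choose.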
Thanks for your reply Jake. However, merging a driver into Magnum does send the message that it is the default. Also, I can’t imagine a world where there are two CAPI drivers inside of Magnum. I just think everything should stay out of tree until an implementation stands out to be “the one”; then we can decide. 
On 20/2/2024 12:43 am, Mohammed Naser wrote:
Thanks for your reply Jake.
However, merging a driver into Magnum does send the message that it is the default.
Possibly what I am trying to do is correct that message. Magnum has had multiple Heat drivers, going from Fedora Atomic to CoreOS to Fedora CoreOS. IIRC, at some point there was also talk of a Heat Ubuntu driver, which, if contributed and found working, could have been accepted (sadly, that did not materialise). So, in my opinion, there never was a 'default' driver.
Also, I can’t imagine a world where there are two CAPI drivers inside of Magnum.
Why not? People move on, and CAPI implementations differ. On this point, someone asked whether a CAPI driver without a control cluster would be accepted if they implemented one. The answer is yes.
I just think everything should stay out of tree until an implementation stands out to be “the one”; then we can decide.
I think this may be the crux of the issue: there is no objective standard for determining 'the one'. I cannot, in good conscience, deny a commit from the community because another organisation has developed something. (This is my personal opinion; if the OpenStack TC wants to rule on this, I am willing to accept that.) If the driver were 'official' from the Cluster API sub-project, then the conversation might be different. We might be able to treat it like third-party drivers in Cinder, and point interested contributors to the effort made by the vendor. Regards, Jake
On 2/19/24 14:43, Mohammed Naser wrote:
Thanks for your reply Jake.
However, merging a driver into Magnum does send the message that it is the default.
Also, I can’t imagine a world where there are two CAPI drivers inside of Magnum.
I just think everything should stay out of tree until an implementation stands out to be “the one”; then we can decide.
I very much think this is a huge waste of time. StackHPC and Vexxhost should be working on merging the two drivers, if they are doing the same thing, no? From an operator perspective, I'm now *very* confused about what I should be using, which one is best, and why... Cheers, Thomas Goirand (zigo)
Hi Mohammed, thanks for raising the discussion. I've been reading the ideas raised here with interest. Firstly, I think support for Cluster API is the way forward for the Magnum Kubernetes COE. The features and community effort around it are greater than Magnum can currently provide. It would be easier if there were one driver and we were all contributing to that instead of developing parallel features, but perhaps the drivers will end up in a fairly stable state over time, as most change will be concentrated in the Cluster API and kubeadm codebases between Kubernetes versions. My company is actively using the StackHPC magnum-capi-helm driver and contributing to its development (firstly in Gerrit patches on top of the large chain, then in GitHub when it was held from merging and moved location). Magnum in the past supported several COEs and drivers. Now, the Magnum project has only one maintained driver: the Fedora CoreOS Heat driver[1]. Once the project has documentation and most folks have moved to some CAPI driver, I'm not sure how long that will be maintained, especially with in-place upgrades removed and if Heat deprecates SoftwareDeployment[2]. That leaves the Magnum project with a bit of a choice: 1. Figure out how the project continues to keep stability with no in-tree drivers. 2. Merge one or more Cluster API drivers. As a Magnum Core member, I'm reviewing and proposing changes with the mindset that we want to actively enable out-of-tree drivers and ensure these drivers can provide the features that matter to them (e.g. CNIs, control plane resizing). It's important to me that we do not introduce changes that break out-of-tree drivers or lock users into one driver with no path to migrating to another. However, I'm not sure I'm in support of asking another driver to be relocated and not contributed when it wants to be. That seems like the contentious bit. 
I don't think that's in line with the Magnum Development Policies[3], specifically the Rule of Thumb and How We Make Decisions sections. Having drivers in another OpenDev repo is an option; this seems like the compromise that would suit most. It will need folks to lead that effort and keep the Magnum project stable with (eventually, as I see it) no production-ready in-tree drivers once the Heat Fedora CoreOS driver is unmaintained. I do support the idea from the vPTG of adding documentation to Magnum to help with the decision of which driver to use, including out-of-tree drivers. A feature compatibility matrix and links to these drivers' documentation would be helpful to this end. Perhaps that's enough of a level playing field, as drivers can be disabled or ignored if they aren't required (see: the other drivers removed recently). regards, Dale Smith [1] https://docs.openstack.org/magnum/latest/user/index.html#cluster-drivers [2] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.... [3] https://docs.openstack.org/magnum/latest/contributor/policies.html
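On the point about disabling drivers that aren't required, a hedged sketch of what that looks like in configuration. This assumes Magnum's `[drivers] disabled_drivers` option; the driver names shown are illustrative and should be checked against the entry points actually installed in a given deployment:

```ini
# magnum.conf -- illustrative sketch: hide in-tree drivers an operator
# does not want to offer, leaving only their chosen driver(s) active.
[drivers]
disabled_drivers = k8s_fedora_atomic_v1,mesos_ubuntu_v1
```

Combined with a feature compatibility matrix in the docs, this would let each deployment present only the drivers it actually supports.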
Folks, I am silently reading this thread. Recently we deployed Magnum with the "magnum-cluster-api" driver in production and I am very happy with all the functionality. I wasn't aware of the "magnum-capi-helm" driver, and if you are saying it will be the native driver, then what should we expect in the future? Is it going to break our deployment with future upgrades, or will we have a choice to pick either one? Just looking for clarification and future expectations. 
Hi Mohammed, thanks for raising the discussion. I've been reading with interest the ideas raised here.
Firstly, I think support for Cluster API is the way forward for Magnum Kubernetes COE. The features and community effort around it are greater than Magnum can currently provide.
It would be easier if there was one driver and we were all contributing to that instead of developing parallel features, but perhaps the drivers will end up in a fairly stable state over time as most change will be concentrated in the Cluster API and kubeadm codebases between Kubernetes versions.
My company is actively using the StackHPC magnum-capi-helm driver, and contributing to its development (first in Gerrit, as patches on top of the large chain, then in GitHub when it was held from merging and moved location).
Magnum in the past supported several COEs and drivers. Now, the Magnum project has only one maintained driver: the Fedora CoreOS Heat driver[1]. Once the project has documentation and most folk have moved to some CAPI driver, I'm not sure how long this will be maintained; especially with in-place upgrades removed, and if Heat deprecates SoftwareDeployment[2].
That leaves the Magnum project with a bit of a choice:
1. Figure out how the project continues to keep stability with no in-tree drivers.
2. Merge one or more Cluster API drivers.
As a Magnum core member, I'm reviewing and proposing changes with the mindset that we want to actively enable out-of-tree drivers and ensure these drivers can provide the features that matter to them (e.g. CNIs, control plane resizing). It's important to me that we do not introduce changes that break out-of-tree drivers or lock users into one driver with no path to migrating to another.
However, I'm not sure I'm in support of asking another driver to be relocated and not contributed when it wants to be. That seems like the contentious bit. I don't think that's in line with the Magnum Development Policies[3], specifically the Rule of Thumb and How We Make Decisions sections.
Having drivers in another OpenDev repo is an option; this seems like the compromise that would suit most. It will need folk to lead that effort and to keep the Magnum project stable, with (eventually, as I see it) no production-ready in-tree drivers once the Heat Fedora CoreOS driver is unmaintained.
I do support the idea from the vPTG of adding documentation to Magnum to help with the decision of which driver to use, including out-of-tree drivers. A feature compatibility matrix and links to these drivers' documentation would be helpful to this end. Perhaps that's enough of a level playing field, as drivers can be disabled or ignored if they aren't required (see: the other drivers removed recently).
regards, Dale Smith
[1] https://docs.openstack.org/magnum/latest/user/index.html#cluster-drivers [2] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.... [3] https://docs.openstack.org/magnum/latest/contributor/policies.html
I'm aiming to make it a choice that you decide. The Magnum team wants to make that choice available, but with the other driver pre-installed by default. I think the fair thing for our users at this point is to let the user decide. Needless to say, we will continue to contribute to and build out our driver.

On Friday, February 23, 2024, Satish Patel <satish.txt@gmail.com> wrote: [...]
On 24/2/24 06:20, Satish Patel wrote:
Folks,
I am silently reading this thread. Recently we deployed Magnum with the "magnum-cluster-api" driver in production, and I am very happy with all the functionality. I wasn't aware of the "magnum-capi-helm" driver, and if you are saying it will be the native driver, then what should we expect in future? Is it going to break our deployment with future upgrades, or will we have a choice to pick either one?
Just looking for clarification and future expectations.
The contributed 'capi-helm' is A driver (emphasis: not THE driver). You can install many other drivers if you wish.

The work so far[1] (in this cycle) is to make sure multiple drivers work together without breaking, so that production deployments using out-of-tree drivers (that the Magnum core team doesn't know about) don't get broken when we introduce new drivers. The breakage has nothing to do with CAPI; in fact, if someone introduced a Heat driver for Ubuntu, it would have clashed with VEXXHOST's magnum-cluster-api driver too[2].

Depending on how many drivers you want to install, you may need to explicitly 'hint' which driver should be used, by way of a glance tag. This also allows you to do cool things like having multiple drivers running if you want to migrate off one driver onto another.

Whether capi-helm deserves to be in-tree or not is a separate conversation; I just want to assure you of the effort towards not breaking your deployment.

Regards,
Jake

[1] https://review.opendev.org/c/openstack/magnum/+/907297
[2] https://github.com/vexxhost/magnum-cluster-api/blob/v0.13.4/magnum_cluster_a...
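[Editor's note: Jake's clash scenario can be sketched as a toy example. The snippet below is NOT Magnum's actual driver-loading code; the `Registry` class, the tuple key, and the way the "hint" is passed are all invented here purely to illustrate why two drivers claiming the same (server_type, os_distro, coe) combination conflict, and how an explicit hint disambiguates.]

```python
# Toy sketch (not Magnum's real code): two drivers claiming the same
# (server_type, os_distro, coe) tuple clash, and an explicit hint
# (e.g. derived from a glance image tag) picks one unambiguously.

class DriverClashError(Exception):
    pass

class Registry:
    def __init__(self):
        # maps (server_type, os_distro, coe) -> driver name
        self._drivers = {}

    def register(self, server_type, os_distro, coe, name):
        key = (server_type, os_distro, coe)
        if key in self._drivers:
            raise DriverClashError(
                f"{name} clashes with {self._drivers[key]} for {key}")
        self._drivers[key] = name

    def lookup(self, server_type, os_distro, coe, hint=None):
        # an explicit hint bypasses tuple matching entirely
        if hint:
            return hint
        return self._drivers[(server_type, os_distro, coe)]

registry = Registry()
registry.register("vm", "ubuntu", "kubernetes", "magnum-cluster-api")
try:
    # a hypothetical Heat Ubuntu driver would claim the same tuple...
    registry.register("vm", "ubuntu", "kubernetes", "heat-ubuntu")
except DriverClashError as exc:
    print("clash:", exc)

# ...so the operator hints the intended driver explicitly instead
print(registry.lookup("vm", "ubuntu", "kubernetes", hint="magnum-capi-helm"))
```

With a hint in place, several drivers covering the same image/COE combination can coexist, which is what makes a gradual migration from one driver to another possible.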
Coming from a user's point of view (and first time posting in OpenStack forums in general), I would assume that an included driver is the de facto recommendation, and would stay away from others because the default would supposedly be the "most supported."

I would think that the way the Kubernetes community is handling it would be more to my preference: documenting various options (although not all options, which in itself is controversial) and their compatibility with various features, and keeping the drivers off the core repos. They (the k8s community) are making very great efforts moving even the most common drivers out of core, so that they can focus on just the core. This allows the core team to determine their own destiny of sorts, rather than possibly pandering to a single driver's team of devs, which may or may not have other drivers in mind. To argue that the default driver would have the same influence on core development as others who are utilizing the core components would be disingenuous.

And since it was brought up... I am also a newb at committing to open source, and the whole opendev/gerrit thing is certainly much more daunting compared to contributing to projects on GitHub, as an example.
On 2024-02-26 15:12:39 -0000 (-0000), thywyn@hotmail.com wrote: [...]
And since it was brought up... I am also a newb at committing to open source, and the whole opendev/gerrit thing is certainly much more daunting compared to contributing to projects on GitHub, as an example.
As one of the maintainers of our code review and hosting systems, I'm always interested in feedback like this, since we strive to make it as efficient and useful as possible to our projects. A little context around this would certainly help though. When you say you are "a newb with commiting to opensource," do you mean you've also got zero experience with GitHub and are learning both at roughly the same time? Or is your observation that a platform you've not used yet is daunting compared to one you've already learned to use? Have you also used GitLab, Bitbucket, Pagure, Perforce, or any other non-GitHub code review workflows, and if so how many? What aspects, specifically, are daunting with respect to Gerrit? When there are things we can improve in that regard, we do try to contribute upstream to the Gerrit project in order to elevate the experience for our users. We expect that learning to use new tools can seem daunting, and want to make that experience as smooth as possible, but everyone has to start somewhere and without learning new things there is little opportunity for growth. We generally prefer to choose the best possible tools for the task at hand, rather than choosing an inferior tool simply so that users won't be burdened with the need to learn. Far too often we get feedback along the lines of "Gerrit is inconvenient" when what the person really means is "learning to use something new is inconvenient," and that's not terribly helpful. -- Jeremy Stanley
On Mon, 2024-02-26 at 15:52 +0000, Jeremy Stanley wrote:
On 2024-02-26 15:12:39 -0000 (-0000), thywyn@hotmail.com wrote: [...]
And since it was brought up... I am also a newb at committing to open source, and the whole opendev/gerrit thing is certainly much more daunting compared to contributing to projects on GitHub, as an example.
As an experienced person contributing via GitHub, GitLab, and Gerrit (I have also used Perforce and SVN, but with no code review), I have mixed feelings about this topic.

If you don't know how to use git at all, then GitHub can hide some of that behind a UI (web or IDE) or behind its CLI "gh", meaning you never really need to learn how git works beyond a superficial level. From my recent observations of non-developers interfacing with GitLab and GitHub for the first time, this tends to result in newer users making tens of broken commits. In the GitHub PR flow this kind of becomes the maintainer's problem to resolve, either by editing the PR branch directly or by coaching the new user to fix the issues and then squash-merging... and squash-merging loses all the context and defers learning how to shape a series of changes into meaningful single commits. So as a new user to a version control system, I personally think learning the GitHub flow is actually harmful to your growth as a software engineer, although if all you want to do is fix a typo in a doc, and never plan to do it again, it makes one-off changes simple.

As someone who has recently been forced to use GitHub pull requests for a different project, I honestly could not see us using that for our daily development. The code review experience, as a maintainer and as a contributor, while simplified in some respects, is more complex in others: it's much harder for two people to work on a feature at the same time, and it's harder to review one commit/patch at a time.

One question I would ask new users of git is whether they mainly interact with code review systems via the CLI, IDE, or UI. If they are used to a UI-driven approach, have they looked at the Gerrit VS Code plugin? I don't use it much, but I have used it occasionally. My experience with GitHub PRs lately is that unless it's a single, relatively simple commit, you need to either pull down the PR locally or use a VS Code extension or similar to properly review it.
If they are trying to use the CLI, are they aware of git-review? When I first started working with Gerrit and OpenStack, I was not using git-review for about two years. Once I learned it existed, it made working with Gerrit much simpler than doing "git push gerrit HEAD:refs/for/<branch>" (I'm not sure that's exactly right, but it's something like that). "git review" is a lot simpler, and closer to "gh pr create".

I will say that getting a Gerrit account and setting up SSH in our Gerrit, because of the Launchpad/OpenID pre-step, was somewhat harder than signing up for a GitHub account on day one. When we had new people join my team in the past, getting Gerrit set up to log in, uploading SSH keys, and asserting you have read and agree to the ICLA were always stumbling blocks for all involved. That was mainly because you have to create an ubuntu.one/Launchpad account, which is not obvious and which involves more clicking than creating a Gmail or GitHub account. I think supporting GitHub or Google account login would help newcomers get up to speed with Gerrit; https://github.com/davido/gerrit-oauth-provider can enable that, but we do not use that plugin in our instance.

The documentation for creating a new account is also not very obvious. When you first go to Gerrit, you need to know that it exists and google for it, instead of having a simple register button beside the login button, or a banner when you're not logged in saying "to create a new account, read this -> https://docs.openstack.org/contributors/en_GB/common/setup-gerrit.html". The little bug icon we have in our Gerrit beside sign-in directs you to https://docs.opendev.org/opendev/system-config/latest/project.html#contribut... which does not actually mention https://docs.openstack.org/contributors/en_GB/common/setup-gerrit.html, so if your google-fu fails you, or someone doesn't tell you that page exists, it's not a good onboarding experience.

I'm not sure if that is helpful, but I can see why they would think it's daunting compared to https://github.com/signup
I actually really like the approach you're bringing up here, which mirrors how Kubernetes has successfully scaled out the maintenance of its different integrations, such as how the Cluster API has a dedicated repo for the OpenStack provider, another one for AWS, etc. I don't see why Magnum couldn't remain focused on what it does and allow installing the "driver" that the user needs, focusing on the life cycle, API, certs, etc., the same way that the Cluster API does today.
On 2/26/24 16:12, thywyn@hotmail.com wrote:
And since it was brought up... I am also a newb at committing to open source, and the whole opendev/gerrit thing is certainly much more daunting compared to contributing to projects on GitHub, as an example.
I very much hate the GitHub workflow, and prefer OpenDev's Gerrit.

GitHub is:
- click, click, click ... wait ... click
- git clone ...
- git commit -a
- git push
- click, click, click ... wait ... click

It's so annoying compared to:
- git clone ...
- git commit -a
- git review

It's a *FACT*: git review is way more efficient.

Cheers,

Thomas Goirand (zigo)
On Tue, Feb 27, 2024 at 10:45 AM Thomas Goirand <thomas@goirand.fr> wrote:
On 2/26/24 16:12, thywyn@hotmail.com wrote:
And since it was brought up... I am also a newb at committing to open source, and the whole opendev/gerrit thing is certainly much more daunting compared to contributing to projects on GitHub, as an example.
I very much hate the Github workflow, and prefer opendev's Gerrit.
GitHub is:
- click, click, click ... wait ... click
- git clone ...
- git commit -a
- git push
- click, click, click ... wait ... click
gh pr list
gh pr checkout ID
gh pr comment
gh pr create --title "Fix a bug" --body "Squash that thing"
gh pr review
gh pr merge

That said, I completely agree that github *culture* promotes untidy branches with bare commit messages. I wouldn't say that opendev culture can't be transplanted to the github platform; but it would be an uphill battle against the expectations contributors have built around what it means to "contribute on github".
It's so annoying compared to:
- git clone ...
- git commit -a
- git review
It's a *FACT*: git review is way more efficient.
Cheers,
Thomas Goirand (zigo)
I don't know that OpenStack needs to go all in on GitHub. Gitea is great! Gerrit is a bit to get used to (signing up, argh), and it is disjointed: launchpad <-> opendev <-> gerrit, and so on. There are some good docs on how to onboard, but it's not "sign in with Google" and off to the races. Also, if for some reason you had a Launchpad account from decades ago, it's not straightforward to recover (at least in my specific case). Folks on IRC are keen and willing to help, so it's not a people thing. If you are coming from an enterprise company, or even an SMB/startup, that uses GitHub, GitLab, or Bitbucket, the context switch is big, but not impossible. For sure there are some improvements that could be made; which ones is up for discussion, no doubt, and the Gerrit OAuth support could help. One thing I find a bit disjointed is that bugs are in Launchpad, patches are in Gerrit, and code is in Gitea. Previous comments have focused on the git-review process in and of itself, which is fine, but maybe lose context on the wider UX for newcomers? OpenStack is a big project with a magnitude of moving parts.

Cheers
Michael

On Wed, Feb 28, 2024 at 10:36 AM Ihar Hrachyshka <ihrachys@redhat.com> wrote: [...]
On 2024-02-28 16:52:29 -0500 (-0500), Michael Knox wrote: [...]
There are some good docs on how to onboard, but it's not "sign in with google" and off to the races. Also, if for reasons you had a launchpad account for decades ago, it's not straightforward to recover. (at least in my specific use case). Folks on IRC are keen and willing to help, so not a people thing. [...]
I fully agree; this is known, and it's a priority for us to fix. We've got a plan (and a spec), and we already have a Keycloak server in production slated to become an SSO for all OpenDev services, from which we can enable a variety of "social auth" identity providers. The current hurdle is Launchpad itself (UbuntuOne SSO, technically). It implements what is essentially LiveJournal OpenID v1, and Keycloak does not support that most ancient of protocols, so we need to find someone with the available time and expertise to develop a sort of "bridge" to proxy UbuntuOne identities into our Keycloak, so we can provide continuity for our existing Gerrit accounts. I've heard someone may have college interns interested in a neatly-scoped project like that soon, so fingers crossed we can unblock the plan in the near future. -- Jeremy Stanley
On 2/21/24 21:33, Dale Smith wrote:
However, I'm not sure I'm in support of asking another driver to be relocated and not contributed when it wants to be.
As a user of Magnum, I very much would prefer it "batteries included", with the driver in-tree. As a package maintainer, I would very much prefer *not* to have to maintain two out-of-tree driver packages. It's really a shame if this is happening for social reasons. The best possible outcome would be one single driver maintained in collaboration between StackHPC and VEXXHOST, in-tree in Magnum, maintained by the OpenStack community. Please make this happen... Cheers, Thomas Goirand (zigo)
Hi Thomas,

The issue is that StackHPC has gone down a fundamentally different path: using Helm charts with manual resources. We've taken the more modern path, which will be the future of CAPI: managed topologies. I have not seen an interest in them wanting to explore our path, because they use these Helm charts in their own product (Azimuth?) and want to use them for that AND the Magnum driver. We have no interest in using plain Helm charts because we're heavily invested in the managed topologies feature, which has been in use for many months now. So it's a matter of choice/selection/decision, and I believe we have the better option here.

tl;dr: we've taken two approaches; ours is more modern and compatible with the future state of things. The other driver relies on basic Helm charts that were built for some other use case and are being retrofitted for this one.

Thanks
Mohammed
Mohammed, I would prefer to focus on making a positive case rather than have to defend our contribution in this fashion. There is another side to the story you are presenting, and I will give it inline.
On 27 Feb 2024, at 15:47, Mohammed Naser <mnaser@vexxhost.com> wrote:
Hi Thomas,
The issue is that StackHPC has gone down a fundamentally different path: using Helm charts with manual resources. We've taken the more modern path, which will be the future of CAPI: managed topologies.
The Helm charts our driver uses have been used in production since 2021, and have been continuously supported and developed over that time. ClusterClass is a promising technology and merits investigation into adding to the CAPI Helm charts - particularly once it is out of "experimental (alpha)" status.
I have not seen an interest in them wanting to explore our path, because they use these Helm charts in their own product (Azimuth?) and want to use them for that AND the Magnum driver. We have no interest in using plain Helm charts because we're heavily invested in the managed topologies feature, which has been in use for many months now.
The CAPI Helm charts are a modular component and this is to their advantage. Besides the Magnum CAPI Helm driver they are used in Azimuth deployments, and also in other significant projects. A broader user community brings greater coverage and activity. Once the interoperability issues are resolved, the CAPI Helm charts are proposed to be added and maintained alongside the driver, under Magnum project governance. Operators have the advantage of being able to supply their own repo of differentiating or extended Helm charts.
So it's a matter of choice/selection/decision, and I believe we have the better option here.
As I understand it, you are not objecting to merging the driver. Are you claiming that the Magnum project is best served by defaulting to start with no driver loaded? Or the Heat driver, perhaps?
tl;dr: we've taken two approaches; ours is more modern and compatible with the future state of things. The other driver relies on basic Helm charts that were built for some other use case and are being retrofitted for this one.
tl;dr: the approaches have significant but not fundamental differences, except in governance. The advantage of Helm is that it provides a reusable and pluggable way to template resources, which could (once graduated) include ClusterClass resources. Thanks Stig
Thanks Mohammed
Personally, I have no patience for continuing to chase this subject, so I'll just sign off on this and mute this thread for my personal sanity 🙂

* The CAPI Helm driver was built outside governance, on GitHub, by the parties that are pushing to merge it into Magnum: https://github.com/stackhpc/magnum-capi-helm
* No effort was made to try and build either the Helm charts or the driver under governance; only today have patches started to show up to add it.
* No "open decision" was made about the use of managed topologies or not. They are in use by commercial products and supported by many upstream in CAPI. Instead, they are simply not "being used, because the Helm charts don't use them".

At the end of the day, we're importing an implementation that was built outside governance, with dependencies outside our governance, and the whole subject of discussion wasn't the implementation, but "which one gets imported by default". I will respectfully ask that no one say this was built per our Four Opens, under governance. The driver had every opportunity to live out of tree (inside governance), be developed, and then be merged after deciding the path forward; but instead, we're speed-running straight into merging into the main project.

I've repeatedly mentioned that if we see active participation, and see that it makes sense to bring it to neutral ground, we'll be happy to. However, since the only contributions have been taking parts of our code for the other driver and requesting a relicense to do that, we don't see why we'd change our entire workflow to accommodate this when we're (unfortunately) the primary maintainers.

We'll continue to maintain our Apache 2.0 licensed out-of-tree driver for those who want to continue to use it, and document the differences for those users, such as creating isolated clusters. We hope that there will be real open development and participation from those who are interested in contributing to it.
________________________________
From: Stig Telfer <stig@telfer.org>
Sent: February 27, 2024 6:44 PM
To: Mohammed Naser <mnaser@vexxhost.com>
Cc: Thomas Goirand <zigo@debian.org>; openstack-discuss@lists.openstack.org <openstack-discuss@lists.openstack.org>
Subject: Re: [magnum] Dropping default Cluster API driver

Mohammed, I would prefer to focus on making a positive case rather than have to defend our contribution in this fashion. There is another side to the story you are presenting, and I will present it inline.
On 27 Feb 2024, at 15:47, Mohammed Naser <mnaser@vexxhost.com> wrote:
Hi Thomas,
The issue is that StackHPC has gone down a fundamentally different path: using Helm charts with manual resources. We've taken the more modern path, managed topologies, which will be the future of CAPI.
The Helm charts our driver uses have been used in production since 2021, and have been continuously supported and developed over that time. ClusterClass is a promising technology and merits investigation into adding to the CAPI Helm charts - particularly once it is out of "experimental (alpha)" status.
I have not seen any interest from them in exploring our path, because they use these Helm charts in their own product (Azimuth?) and want to use them both for that AND the Magnum driver. We have no interest in using plain Helm charts because we're heavily invested in the managed topologies feature, which has been in use for many months now.
The CAPI Helm charts are a modular component and this is to their advantage. Besides the Magnum CAPI Helm driver they are used in Azimuth deployments, and also in other significant projects. A broader user community brings greater coverage and activity. Once the interoperability issues are resolved, the CAPI Helm charts are proposed to be added and maintained alongside the driver, under Magnum project governance. Operators have the advantage of being able to supply their own repo of differentiating or extended Helm charts.
So it's a matter of choice/selection/decision, and I believe we have the better option here.
As I understand it, you are not objecting to merging the driver. Are you claiming that the Magnum project is best served by defaulting to start with no driver loaded? Or the Heat driver, perhaps?
tl;dr: we've taken two approaches; ours is more modern and compatible with the future state of things, while the other driver relies on basic Helm charts that were built for another use case and are being retrofitted for this one.
tl;dr: the approaches have significant but not fundamental differences, except in governance. The advantage of Helm is that it provides a reusable and pluggable way to template resources, which could (once graduated) include ClusterClass resources. Thanks Stig
Thanks Mohammed
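For readers following along, the "managed topologies" approach debated above is the Cluster API ClusterClass feature, where a Cluster delegates its shape to a class rather than templating each CAPI resource by hand. A rough, hypothetical sketch of what such a manifest looks like (the cluster and class names here are illustrative, not taken from either driver):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: example-cluster          # hypothetical name
spec:
  topology:
    class: example-clusterclass  # references a ClusterClass object defined elsewhere
    version: v1.28.3             # Kubernetes version for the whole cluster
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: default-worker  # worker class defined in the ClusterClass
          name: md-0
          replicas: 2
```

By contrast, the Helm-chart approach templates the individual CAPI resources (control plane, machine deployments, etc.) directly from chart values, without a ClusterClass in between.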
Hey folks,

Like a few others, I've been lurking on this thread. I work for an OpenStack company, and we really appreciate the work and ideas that the community contributes, so hopefully my opinion comes with that in mind. We have selected a solution that works with Magnum and provides us OpenStack API consistency with Cluster API; I am not venturing into which one we have chosen, since I don't think it matters to the intent of this discussion. However, the work to improve Magnum from both of these contributing OpenStack companies is amazing.

I see no reason to "default" a driver. Include them both, or include none of them. Why not make it configurable? Magnum is deployed with no configuration for any specific driver, but the service is ready. I can deploy Cinder with no driver, for example. Why couldn't Magnum assume a Cinder-like approach? We have clouds with NetApp, Pure, and PowerMax drivers that are in Cinder. We also have Dell EMC Unity, which is not, and leverages StorOps. In or out of tree isn't bad; in tree is definitely more convenient. Perhaps I am oversimplifying it or missing some Magnum project context?

I am not sure that it's constructive to pitch VEXXHOST vs StackHPC. Both groups are doing great work, solving issues equally shared by themselves and the community. I also don't think that was the intent of the thread, and efforts in the conversation should avoid it. It's a useful discussion to be had: expected defaults, downstream user impact, etc.

Cheers
Michael

On Tue, Feb 27, 2024 at 7:09 PM Mohammed Naser <mnaser@vexxhost.com> wrote:
On Wed, Feb 28, 2024, at 1:42 PM, Michael Knox wrote:
Hey folks,
Like a few others, I've been lurking on this thread.
I work for an OpenStack company, and we really appreciate the work and ideas that the community contributes, so hopefully my opinion comes with that in mind. We have selected a solution that works with Magnum and provides us OpenStack API consistency with Cluster API; I am not venturing into which one we have chosen, since I don't think it matters to the intent of this discussion. However, the work to improve Magnum from both of these contributing OpenStack companies is amazing.
I see no reason to "default" a driver. Include them both, or include none of them. Why not make it configurable? Magnum is deployed with no configuration for any specific driver, but the service is ready. I can deploy Cinder with no driver, for example. Why couldn't Magnum assume a Cinder-like approach? We have clouds with NetApp, Pure, and PowerMax drivers that are in Cinder. We also have Dell EMC Unity, which is not, and leverages StorOps. In or out of tree isn't bad; in tree is definitely more convenient. Perhaps I am oversimplifying it or missing some Magnum project context?
I too have been lurking and haven't chimed in, as I am not a Magnum dev or user. That said, there is some "fun" OpenStack history that is worth calling out in the context of these questions. Once upon a time Glance implemented v2 of its API. This new API version implemented a task system that was expected to be used for image upload/import. Rather than a direct byte transfer, you submitted a task request, the task did the work in the background, and eventually it completed and your image was available for use. The intentions behind this were good; the Glance team was trying to solve problems that existed with the old upload process. There was a major flaw though: there was no implementation of a tasks backend (at least upstream, or open source from third parties) for image uploads via tasks, and the API was too high level to provide a consistent user experience from one cloud to another. This wiki doc is about as good as the docs got: https://wiki.openstack.org/wiki/Glance-tasks-api. There was a schema to do things within Glance, but implementation and consistency between clouds was left to the operator. Why was this bad? OpenStack couldn't test a major piece of functionality because implementing it was left to others. No OpenStack users knew how to use this system until Monty figured it out somehow and wrote shade. I think it is important for every OpenStack project to have at least one valid working upstream driver. This provides the ability to more easily run tests and understand how the service APIs are intended to function. The lines get blurry when you look at what goes in OpenStack and what doesn't, and it sounds like all of the options being discussed here are open source and can be used for upstream testing and so on. That said, I think OpenStack projects should do their best to ensure they avoid writing API frontends that aren't much more than that. History has shown us that empty APIs without drivers are not testable and are bad for users.
I see no reason to "default" a driver. Include them both, or include none of them. Why not make it configurable? Magnum is deployed with no configuration for any specific driver, but the service is ready. I can deploy Cinder with no driver, for example. Why couldn't Magnum assume a Cinder-like approach? We have clouds with NetApp, Pure, and PowerMax drivers that are in Cinder. We also have Dell EMC Unity, which is not, and leverages StorOps. In or out of tree isn't bad; in tree is definitely more convenient. Perhaps I am oversimplifying it or missing some Magnum project context?
Making the driver configurable was always implied. There was never a question of whether to allow external drivers, or of leaving it to users to decide which one to use (similar to Cinder). The question was more or less only about whether to have an in-tree driver or only external ones. The Magnum team did quite a good job of ensuring out-of-tree drivers are supported at a good level despite all this discussion.
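For context on what "configurable" looks like in practice: Magnum discovers cluster drivers through Python entry points, so an out-of-tree driver can ship as a separate package that operators simply install. A hypothetical `setup.cfg` fragment for such a package might look like this (the package and driver names are illustrative of the pattern, not taken from this thread):

```ini
# setup.cfg of a hypothetical out-of-tree Magnum driver package
[metadata]
name = magnum-example-capi-driver

[entry_points]
magnum.drivers =
    example_capi_v1 = magnum_example_capi_driver.driver:Driver
```

Once the package is installed alongside Magnum, the service can load the driver by name, which is what makes the Cinder-like "no built-in backend" model workable.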
That said I think OpenStack projects should do their best to ensure they avoid writing API frontends that aren't much more than that. History has shown us that empty APIs without drivers are not testable and are bad for users.
While I didn't know that history, I fully agree with it, and that is basically why I supported the idea of having *some* in-tree driver, however good or bad it is.

However, yesterday at the Magnum meeting [1] it was decided to place the StackHPC driver in a separate repo, meaning there will be no in-tree driver, making Magnum a useful API reference project, more or less. Patches to create the repo and add it to governance have already been created. [2]

So there is probably no reason for any further discussion on this topic...

[1] https://meetings.opendev.org/meetings/magnum/2024/magnum.2024-02-28-09.00.lo...
[2] https://review.opendev.org/c/openstack/project-config/+/910239

On Wed, Feb 28, 2024, 23:16 Clark Boylan <cboylan@sapwetik.org> wrote:
On Thu, Feb 29, 2024 at 07:53 Dmitriy Rabotyagov <noonedeadpunk@gmail.com> wrote:
I see no reason to "default" a driver. Include them both, or include none of them. Why not make it configurable? Magnum is deployed with no configuration for any specific driver, but the service is ready. I can deploy Cinder with no driver, for example. Why couldn't Magnum assume a Cinder-like approach? We have clouds with NetApp, Pure, and PowerMax drivers that are in Cinder. We also have Dell EMC Unity, which is not, and leverages StorOps. In or out of tree isn't bad; in tree is definitely more convenient. Perhaps I am oversimplifying it or missing some Magnum project context?
Making the driver configurable was always implied. There was never a question of whether to allow external drivers, or of leaving it to users to decide which one to use (similar to Cinder). The question was more or less only about whether to have an in-tree driver or only external ones. The Magnum team did quite a good job of ensuring out-of-tree drivers are supported at a good level despite all this discussion.
That said I think OpenStack projects should do their best to ensure they avoid writing API frontends that aren't much more than that. History has shown us that empty APIs without drivers are not testable and are bad for users.
While I didn't know that history, I fully agree with it, and that is basically why I supported the idea of having *some* in-tree driver, however good or bad it is.
There is a driver, the Heat-based driver - it’s not deprecated yet, and the Magnum community understands the significance of an in-tree reference driver - it just doesn’t need to be one of the two CAPI drivers (maybe a simplified version of the current driver that would allow users to stand up a CAPI management cluster). That will surely be discussed again at the PTG.
However, yesterday at the Magnum meeting [1] it was decided to place the StackHPC driver in a separate repo, meaning there will be no in-tree driver, making Magnum a useful API reference project, more or less.
Patches to create the repo and add it to governance have already been created. [2]
So there is probably no reason for any further discussion on this topic...
[1] https://meetings.opendev.org/meetings/magnum/2024/magnum.2024-02-28-09.00.lo... [2] https://review.opendev.org/c/openstack/project-config/+/910239
The decision was made in order to move forward out of this situation, instead of continuing an endless discussion between two parties that would probably never reach consensus. This way we can have CI in the Magnum repo for at least one CAPI driver. It doesn’t mean Magnum will never have an in-tree CAPI driver - but I feel it’s not going to happen soon.
There is a driver, the Heat based driver - it’s not deprecated yet
If I'm not mistaken, the intention was to deprecate it in 2024.2? So should we be giving false promises about the driver's future?
instead of continuing endless discussion between two parties that would probably never lead to a consensus
Slightly more than just two parties were involved in this ML thread ;) But anyway, it's good that a consensus was reached. On Thu, Feb 29, 2024, 08:29 Michał Nasiadka <mnasiadka@gmail.com> wrote:
participants (19)

- Clark Boylan
- Dale Smith
- Dmitriy Rabotyagov
- Ihar Hrachyshka
- Jake Yip
- Jeremy Stanley
- Julia Kreger
- Michael Knox
- Michał Nasiadka
- mnaser@vexxhost.com
- Mohammed Naser
- Nguyễn Hữu Khôi
- Satish Patel
- smooney@redhat.com
- Stig Telfer
- Stig Telfer
- Thomas Goirand
- Thomas Goirand
- thywyn@hotmail.com