Hello,

Please note that the list provided was just from my own notes, and I didn’t put much emphasis on making it complete or accurate.

If you think back to the early days of OpenStack, the pain was in OpenStack itself. Today the challenge is instead to manage OpenStack at scale and fit it into the things we see, like moving OpenStack into containers (for example managed by Kubernetes).

I would like the OpenStack design to embrace more of the distributed, cloud-native approach that Ceph and Kubernetes bring, and the resiliency of Ceph (and yes, I’m a major Ceph enthusiast), and there I see messaging and database as potential blockers to continuing on that path.

I’m not saying that’s the only thing; for example, stuff like [1] _really_ matters in real-world deployments, so working on other OpenStack parts for resilience is also crucial.

There are things I’m interested in that would impact the overall design. I can list some of them, but I think it might be too broad a subject for this thread.

* As brought up by Mohammed Naser before, I would like to investigate an effort for containers as an OpenStack deliverable for projects
* Investigate cloud-native, highly available and resilient alternatives for messaging and database
* Make OpenStack more resilient with the above; [1] is a great example of what I mean

I’ll respond to your questions with my views inline below.

P.S. The opinions stated here are my own personal opinions and should not be assumed to be the opinions of any other entity.
Best regards
Tobias

[1] https://bugs.launchpad.net/neutron/+bug/1987780

Begin forwarded message:

From: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Subject: Re: [all] [oslo.messaging] Interest in collaboration on a NATS driver
Date: 29 August 2022 at 18:33:15 CEST
To: Tobias Urdin <tobias.urdin@binero.com>
Cc: openstack-discuss <openstack-discuss@lists.openstack.org>

Hi Tobias,

Good to see RMQ alternatives appearing. A couple of questions from me.

On Mon, 29 Aug 2022 at 15:47, Tobias Urdin <tobias.urdin@binero.com> wrote:

• Do retries and acknowledgements in the library (since NATS does NOT persist messages like RabbitMQ could)

What do you mean? Is NATS only a router? (I have not used this technology yet.)

It does not persist messages. If there is no backend to respond, the message will be dropped without any action, which is why I want the RPC layer in oslo.messaging (which already acknowledges calls in the driver) to notify the client side that the call is being processed before the client side waits for the reply.

• Find or maintain a NATS python library that doesn't use async like the official one does

Why is async a bad thing? For messaging it's the right thing.

This is really just me: I would love to be able to use the official async-based library (https://github.com/nats-io/nats.py) instead of the one in the POC (https://github.com/Gr1N/nats-python), which has a lot of shortcomings and issues. I just don’t understand how that would be implemented. My idea was only to investigate whether it was even possible to implement in a feasible way.

Finally, have you considered just trying out ZeroMQ?

The ZeroMQ driver does not exist in oslo.messaging anymore.
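
To make the retry and acknowledgement idea from the first question above more concrete, here is a minimal sketch in Python. It is not oslo.messaging or NATS code: `FakeBus`, `make_server`, `call` and `NoAckError` are hypothetical names used only for illustration, and the in-memory bus merely imitates the "no listener means the message is dropped" behaviour. A real driver would wait for the acknowledgement with a timeout instead of checking a queue synchronously.

```python
from collections import deque


class NoAckError(Exception):
    """Raised when no server acknowledged the request after all retries."""


class FakeBus:
    """Hypothetical in-memory stand-in for a bus that does not persist messages."""

    def __init__(self):
        self.handlers = {}

    def subscribe(self, subject, handler):
        self.handlers[subject] = handler

    def publish(self, subject, payload, inbox):
        handler = self.handlers.get(subject)
        if handler is None:
            return  # nobody is listening: the message is silently dropped
        handler(payload, inbox)


def make_server(func):
    """Wrap func so that it acknowledges receipt before sending the reply."""

    def handler(payload, inbox):
        inbox.append(("ack", None))             # "received, processing"
        inbox.append(("reply", func(payload)))  # the actual RPC result

    return handler


def call(bus, subject, payload, retries=3, on_retry=None):
    """Publish, resend until acknowledged, then return the reply."""
    for attempt in range(1, retries + 1):
        inbox = deque()
        bus.publish(subject, payload, inbox)
        if not inbox or inbox[0][0] != "ack":
            # No ack: assume the message was dropped and try again.
            if on_retry is not None:
                on_retry(attempt)
            continue
        inbox.popleft()                 # consume the ack
        _kind, reply = inbox.popleft()  # then read the reply
        return reply
    raise NoAckError(f"no ack for {subject!r} after {retries} attempts")
```

In this sketch the client only starts waiting for the reply once it has seen the acknowledgement, so a dropped message results in a resend rather than a long wait for a reply that will never arrive.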
I mean, NATS is probably an overkill for OpenStack services since the majority of them stay static on the hosts they control (think nova-compute, neutron agents - and these are also the pain points that operators want to ease).

I don’t think it is, and even if it is, why not use a better solution or a more stable approach than RabbitMQ? This is also the whole point: I don’t want OpenStack to become or be static. I want it to be more dynamic and cloud-native in its approach and to support viable integrations that take it there. We cannot live in the past forever, let’s envision and dream of the future as we want it! :)

NATS seems to me to cater for a different use case.

It actually caters to a lot of use cases.

I might be wrong because I have read only the front page but that is the feeling I have.

Cheers,
Radek
-yoctozepto

Begin forwarded message:

From: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Subject: Re: [all] [oslo.messaging] Interest in collaboration on a NATS driver
Date: 29 August 2022 at 20:20:15 CEST
To: Sean Mooney <smooney@redhat.com>
Cc: Tobias Urdin <tobias.urdin@binero.com>, openstack-discuss <openstack-discuss@lists.openstack.org>

On Mon, 29 Aug 2022 at 20:03, Sean Mooney <smooney@redhat.com> wrote:

Finally, have you considered just trying out ZeroMQ?

ZeroMQ used to be supported in the past but then it was removed. If I understand correctly, it only supported notifications or RPC, but not both. I don't recall which, but perhaps I'm misremembering on that point.

I believe it would be better suited for RPC than notifications, at least in the simplest form.

As it’s advertised as scalable and performant, I would argue: why not use it for notifications as well?
If anything, according to your observations above it’s more suited for that than for RPC, even though request-reply (which we can use for RPC) is a strong first-class implementation in NATS as well.

I mean, NATS is probably an overkill for OpenStack services since the majority of them stay static on the hosts they control (think nova-compute, neutron agents - and these are also the pain points that operators want to ease).

It's not any more overkill than RabbitMQ is.

True that. Probably.

I agree with that. Also, if you think about it, how many issues related to stability, performance and outages are related to RabbitMQ? It’s quite a few if you ask me. Just the resource utilization and clustering in RabbitMQ makes me feel bad. This is where I mean that a cloud-native and scalable implementation would shine: you should be able to rely on it, and if something dies, so what, things should just continue to work. That’s not my experience with RabbitMQ, but it is my experience with Ceph, because in the end the design really matters.

I also don't know what you mean when you say "majority of them stay static on the hosts they control". NATS is intended as a cloud-native, horizontally scalable message bus, which is exactly what OpenStack needs IMO.

NATS seems to be tweaked for "come and go" situations, which is an exception in the OpenStack world, not the rule (at least in my view). I mean, one normally expects to have a preset number of hypervisors and not them coming and going (which, I agree, is a nice vision; a proper NATS driver, with more awareness in the client projects I believe, could be an enabler for more dynamic clouds).

It could, but it also doesn’t have to be that. Why not strive for more dynamic? I don’t think anybody would argue that more dynamic is a bad thing, even if you were to have a more static approach to your cloud.

Cheers,
Radek
-yoctozepto